Rich text markup is a user-friendly approach. I wonder if we could use a similar method to help SD understand spatial relationships in images. In simple terms, it would involve annotating elements in 3D software to match points, then generating an image in a specific format in which the annotations and their spatial positions are converted into image-area masks. This would be somewhat similar to ControlNet segmentation, but I believe it would be much more user-friendly and able to handle complex spatial relationship definitions. We could describe spatial relationships by laying out simple basic geometry, or by making quick annotations on an existing model, for example as a material. A Blender plugin would probably be needed to output the image conditions.
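To make the "annotations transformed into image area masks" step a bit more concrete, here is a rough sketch of what I have in mind. It assumes the 3D scene has already been rendered as an "ID image" where every annotated object gets its own flat color, and that a color-to-prompt table is exported next to it. The function names, the table format, and the example prompts are all made up for illustration; none of this is an existing SD, ControlNet, or Blender API.

```python
# Hypothetical sketch: split a flat-colored "ID render" into one mask per annotation.
# The color-to-prompt table and every name below are illustrative assumptions.

def masks_from_id_render(pixels, annotations):
    """Return {prompt: mask}, where mask[y][x] is 1 if that pixel belongs to the prompt.

    pixels: rows of (r, g, b) tuples, e.g. a render in which each annotated
            object uses a unique flat color.
    annotations: {(r, g, b): "text prompt"}, assumed to be exported with the render.
    """
    masks = {}
    for color, prompt in annotations.items():
        masks[prompt] = [[1 if px == color else 0 for px in row] for row in pixels]
    return masks


def coverage(mask):
    """Fraction of the image area covered by a mask."""
    total = sum(len(row) for row in mask)
    return sum(map(sum, mask)) / total if total else 0.0


# Tiny 4x2 stand-in for a rendered ID image.
RED, BLUE, BLACK = (255, 0, 0), (0, 0, 255), (0, 0, 0)
render = [
    [RED, RED, BLACK, BLUE],
    [RED, RED, BLACK, BLUE],
]
annotations = {RED: "a wooden table", BLUE: "a floor lamp"}

masks = masks_from_id_render(render, annotations)
for prompt, mask in masks.items():
    print(prompt, mask, coverage(mask))
```

Each prompt then has its own region mask, which is the kind of per-region condition that could be fed to the model. Unannotated pixels (black here) simply fall outside every mask.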