lllyasviel / ControlNet

Let us control diffusion models!
Apache License 2.0

[IDEA PROPOSAL] Using blender to output a 2D file + 3D metadata of each pixel #528

Open ca3gamedev opened 1 year ago

ca3gamedev commented 1 year ago

Hi, I'm not a developer, but I wonder if it would be possible to use Blender's render and camera system to generate, instead of a plain .png file, a 2D texture plus metadata for each pixel: basically, each pixel would carry information such as its depth and which object in the 3D scene it belongs to.

And then use that metadata, together with the color information of each pixel, as additional input for a diffusion model.
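A minimal sketch of what that combined input could look like once the per-pixel data is out of Blender (Blender can export depth and object IDs as render passes, e.g. into a multilayer EXR). The function name, channel layout, and normalization constants below are illustrative assumptions, not an existing ControlNet API:

```python
import numpy as np

def pack_conditioning(rgb, depth, object_index, depth_clip=100.0):
    """Pack an RGB render plus per-pixel metadata into one conditioning
    array of shape (H, W, 5): R, G, B, normalized depth, object index.

    rgb          : float array (H, W, 3) in [0, 1]
    depth        : float array (H, W), camera-space depth in scene units
    object_index : int array (H, W), one id per object (0 = background)
    """
    # Normalize depth to [0, 1] so it behaves like an ordinary channel.
    # depth_clip is an arbitrary far-plane assumption for this sketch.
    d = np.clip(depth / depth_clip, 0.0, 1.0)
    # Scale object ids into [0, 1] as well (assumes fewer than 256 objects).
    obj = object_index.astype(np.float32) / 255.0
    return np.dstack([rgb, d, obj]).astype(np.float32)

# Tiny synthetic example standing in for a Blender render.
h, w = 4, 4
rgb = np.zeros((h, w, 3), dtype=np.float32)
depth = np.full((h, w), 50.0, dtype=np.float32)   # everything 50 units away
obj_ids = np.ones((h, w), dtype=np.int32)          # one object fills the frame

cond = pack_conditioning(rgb, depth, obj_ids)
print(cond.shape)     # (4, 4, 5)
print(cond[0, 0, 3])  # 0.5  (depth 50 / clip 100)
```

A training pipeline would then feed `cond` (instead of a plain 3-channel image) as the control signal.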

geroldmeisinger commented 1 year ago

Yes, and I think this is actually a good idea. You can generate any 3D information the same way you would in fragment shaders (face normals, camera distance, luminance, etc.) and store it in alpha channels. You can use multiple alpha channels to train ControlNet; see https://github.com/lllyasviel/ControlNet/issues/10. (I think this issue is a better fit for the Discussions section.)
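As one concrete instance of the fragment-shader-style channels mentioned above, here is a sketch that derives per-pixel camera distance from a planar z-depth map and stores it, normalized, as an extra alpha-like channel. It assumes a simple pinhole camera; the intrinsics and normalization are illustrative, not values any tool mandates:

```python
import numpy as np

def camera_distance(depth, fx, fy, cx, cy):
    """Convert a planar depth map (z along the view axis) into Euclidean
    camera distance per pixel, the quantity a fragment shader would compute.

    depth          : (H, W) z-depth in scene units
    fx, fy, cx, cy : pinhole intrinsics (focal lengths, principal point)
    """
    h, w = depth.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Length of the normalized view ray through each pixel; always >= 1.
    scale = np.sqrt(1.0 + ((xs - cx) / fx) ** 2 + ((ys - cy) / fy) ** 2)
    return depth * scale

# Toy 2x2 "render": flat wall 10 units in front of the camera.
depth = np.full((2, 2), 10.0)
dist = camera_distance(depth, fx=100.0, fy=100.0, cx=0.5, cy=0.5)

# Store the distance channel alongside RGB, normalized to [0, 1].
rgb = np.zeros((2, 2, 3), dtype=np.float32)
alpha = np.clip(dist / dist.max(), 0.0, 1.0)
rgba = np.dstack([rgb, alpha])  # one extra channel; more can be stacked
```

Each such derived quantity (normals, luminance, object masks) would occupy its own extra channel in the same way.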