MrForExample / ComfyUI-3D-Pack

An extensive node suite that enables ComfyUI to process 3D inputs (Mesh & UV Texture, etc.) using cutting-edge algorithms (3DGS, NeRF, etc.)
MIT License

CRM individual camera angles #149

Open MrBretten opened 1 month ago

MrBretten commented 1 month ago

Not really an issue, but potentially an improvement (although probably not possible with ComfyUI in one workflow).

6 camera angles are required to make a "good" model in CRM. Something I've noticed is that all camera angles are generated in one go, but some may be good while others are awful, and the awful ones can ruin what would otherwise be a better 3D model.

Now, there isn't much consistency between each seed. One seed might generate the perfect angle visually on the right side camera, some anomalies on the back camera, and an absolute monstrosity for the top camera. On another seed, the left and top cameras might be pretty good, but now the right camera is completely messed up.

So my question is this: is it possible to do one camera angle at a time (rather than all 6, for the sake of time and efficiency)? The idea is that I would re-run until I get a good result, save out the image, move on to the next camera angle, and repeat until I have all 6 angles. I would then collect the images as a batch and finally run them through CRM. Hopefully this would produce a better 3D mesh, since a human gets to judge what looks right and what doesn't.

I'm pretty well versed in Python, but I started using ComfyUI a couple of days ago and only today finally got 3D Pack running (literal nightmare) - so there's a bit of a learning curve to make something like it myself in Comfy. I imagine it'd be simple enough to break up the 'CRM Images MVDiffusion Model' so it generates only one angle instead of a multiview (or maybe generates 10 outputs of the same angle with varying seeds). In a separate workflow, you could batch load the images and then continue on with the 'CRM CCMs MVDiffusion Model' stage.
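The bookkeeping side of this idea (keep one hand-picked image per camera angle, then hand them over in a fixed order) could be sketched roughly as below, in plain Python outside ComfyUI. This is only an illustration: the view names, their order, the file paths, and the `collect_views` helper are all assumptions, and the order CRM actually expects is not stated in this thread.

```python
from pathlib import Path

# Assumed view names and order, for illustration only.
# The order CRM actually expects is not confirmed in this thread.
VIEW_ORDER = ["front", "right", "back", "left", "top", "bottom"]


def collect_views(picks, view_order=VIEW_ORDER):
    """Return the chosen image paths in a fixed view order.

    `picks` maps a view name to the path of the image kept for that view.
    Raises ValueError if a view is missing or an unknown view name is given.
    """
    missing = [v for v in view_order if v not in picks]
    unknown = [v for v in picks if v not in view_order]
    if missing or unknown:
        raise ValueError(f"missing views: {missing}; unknown views: {unknown}")
    return [Path(picks[v]) for v in view_order]


# Hypothetical picks: each view kept from whichever seed looked best.
picks = {
    "top": "seed_0042/top.png",
    "front": "seed_0007/front.png",
    "left": "seed_0042/left.png",
    "right": "seed_0007/right.png",
    "back": "seed_0113/back.png",
    "bottom": "seed_0113/bottom.png",
}
ordered = collect_views(picks)
```

The resulting list could then be loaded as an image batch in a separate workflow. Whether views mixed from different seeds stay consistent enough for CRM is a separate question.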

Is this possible? And if so, how could I do it?

MrForExample commented 1 month ago

Is it possible to do one camera angle at a time? Yes, but not with CRM, because it is trained on a fixed set of 6 views at once and cannot take specified camera poses. We can achieve more consistent results with a fine-tuned video diffusion model.

There is a way to enhance the CRM result without fine-tuning the model: FreeControl ("FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition"). Alternatively, use Instruct pix2pix (ip2p), or an SD model plus IP-Adapter, to directly modify the few view images that are not good enough.

Let me know if you have any other questions, cheers :)