Open jclarkk opened 1 year ago
Hello, thanks for your interest in our work. Currently our diffusion model only supports 256x256 resolution, so 512 does not work. For the OOM problem, you should use 1gpu.yaml; the diffusion stage won't cost too much memory. The current inference code runs in float32; I will change it to float16 later to save memory. For stage 2, I will write a document about the parameters that favor better reconstructions.
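Until that change lands, here is a minimal sketch of the usual float16 workarounds in PyTorch, on a toy module standing in for the diffusion UNet (nothing here is the repo's actual loading code):

```python
import torch

# Toy stand-in for the diffusion UNet; the real model comes from the repo's loader.
model = torch.nn.Linear(8, 8).cuda()
x = torch.randn(1, 8, device="cuda")

# Option 1: keep fp32 weights but run the forward pass under autocast,
# so activations are computed in fp16 where it is safe to do so.
with torch.autocast(device_type="cuda", dtype=torch.float16):
    out = model(x)

# Option 2: cast the weights themselves to fp16, roughly halving weight memory.
model = model.half()
out = model(x.half())
```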
First of all, amazing work on this project!
Would it be possible to increase texture quality? I've tried increasing img_wh under validation_dataset in the config to [512, 512], but I get:
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 32 but got size 64 for tensor number 1 in the list.
As for the second phase, which parameters in the 'neuralangelo-ortho-wmask.yaml' config would allow modifying mesh quality?
And finally, less relevant to the above: I've tried running on multi-GPU, and the second phase (instant-nsr-pl) works fine when reducing the iterations, but when I run the diffusion phase with a config similar to 8gpu.yaml, only with 4 GPUs instead (running on 4x NVIDIA L4), I get OOM. Any tips you can share on this?
Thanks a lot!
Hello! I ran into this problem, too.
I found that img_wh corresponds to the size of the latents passed between the UNet and the VAE. So changing the sample_size of the unet in mvdiffusion-joint-ortho-6views.yaml to 64 makes the sizes match and fixes the bug, and setting crop_size to 384 gives a better image proportion. But I can only get inconsistent results, not as good as the 256x256 output.
Hope the authors can provide a higher-resolution pretrained UNet, or even the training code!
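For reference, a minimal sketch of applying those overrides programmatically with OmegaConf. The key paths (validation_dataset.img_wh, validation_dataset.crop_size, unet.sample_size) follow this thread; the exact nesting in the shipped config may differ, and the output filename is made up:

```python
from omegaconf import OmegaConf

# Load the multi-view diffusion config shipped with the repo.
cfg = OmegaConf.load("configs/mvdiffusion-joint-ortho-6views.yaml")

# Validate/render at 512x512 instead of the default 256x256.
cfg.validation_dataset.img_wh = [512, 512]
cfg.validation_dataset.crop_size = 384  # better image proportion, per the comment above

# The UNet's sample_size is the latent resolution: the VAE downsamples by 8,
# so 512 / 8 = 64 (the default 32 corresponds to 256x256, hence the size error).
cfg.unet.sample_size = 64

OmegaConf.save(cfg, "configs/mvdiffusion-joint-ortho-6views-512.yaml")
```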
Hello @flamehaze1115, it would be very appreciated if you could share a reconstruction doc!! @jclarkk, I will try to answer with what I did. I am still experimenting with it, so it may improve in the future:
@fefespn Thanks a lot for your notes, I'll try to play around with this as well.
@fefespn Hi, have you tried your own suggestions? And do you have any further suggestions? Thanks
So yeah, what I am doing now is this: for a basic 3D object we need geometry + texture. From Wonder3D I get the geometry, which is good enough for my needs. Then I use the second stage of the DreamGaussian repository/paper, which has a differentiable renderer for texture refinement. You need to make sure the 3D object from Wonder3D is aligned with the renderer. Then I optimize the object's texture using my multi-view images and that renderer.
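To make the shape of that texture optimization concrete, here is a schematic sketch. The render function below is a dummy stand-in for DreamGaussian's differentiable renderer, and cameras/targets are placeholders for the six ortho views and the Wonder3D multi-view images; none of these names are the actual DreamGaussian API:

```python
import torch
import torch.nn.functional as F

def render(texture, cam):
    # Dummy stand-in: a real differentiable renderer would rasterize the
    # Wonder3D mesh with this texture from camera `cam`. This version only
    # exists so that gradients flow and the loop below actually runs.
    return texture.mean(dim=-1, keepdim=True) * cam

cameras = [torch.ones(1) for _ in range(6)]            # placeholder ortho views
targets = [torch.rand(256, 256, 1) for _ in range(6)]  # placeholder MV images

texture = torch.rand(256, 256, 3, requires_grad=True)  # learnable texture map
opt = torch.optim.Adam([texture], lr=1e-2)

for step in range(200):
    opt.zero_grad()
    # Photometric loss between renders and the fixed multi-view targets.
    loss = sum(F.mse_loss(render(texture, c), t) for c, t in zip(cameras, targets))
    loss.backward()  # gradients flow through the renderer into the texture
    opt.step()
```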
@fefespn Hello! Can you share more about your modified second stage of dg? The different camera settings of these two methods are a little confusing.