Open sauradip opened 2 months ago
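Hi,
Thanks for making the work public. I find that your code works only when the movement is global; when the movement is local, the control points don't optimize well.
For the input video below
fan.mp4 (https://github.com/user-attachments/assets/7a9f28fc-9780-4945-812a-a97339064612)
I am getting this output
fan_0.mp4 (https://github.com/user-attachments/assets/3ce19bb4-eb7e-43d2-8d0e-cf2432f88140)
with the control points as this
fan_cpts_0.mp4 (https://github.com/user-attachments/assets/97131df2-9798-486b-b27c-0aafbc1bee90)
which means the local deformation is hard to model with sparse control points. Any idea how to solve this?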
Hi, I think there are two main reasons that cause the issues you mentioned. On the one hand, I believe the segmentation result in this example might be incorrect. It is likely that the entire fan has been segmented as a whole without removing the gaps between the blades. This would cause the gaps between the blades to gradually turn into white 3DGS during the optimization process. On the other hand, the performance of our method largely depends on the distillation source (Zero123 here), which might fail to produce reasonable novel views for instances like this. You can randomly select a frame of this video and check if Zero123 handles it well in novel view synthesis.
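For the frame check suggested above, a minimal sketch of pulling one random frame out of the video so it can be fed to Zero123 by hand. This is not part of the repository: the function names are hypothetical, and it assumes `opencv-python` is installed. Running Zero123 on the saved image is left to whatever inference script you already use.

```python
import random


def pick_frame_index(num_frames, seed=None):
    """Return a uniformly random frame index in [0, num_frames)."""
    if num_frames <= 0:
        raise ValueError("video has no frames")
    return random.Random(seed).randrange(num_frames)


def save_random_frame(video_path, out_path, seed=None):
    """Save one randomly chosen frame of video_path as an image at out_path."""
    # Assumption: opencv-python is available; imported lazily so the
    # index helper above works without it.
    import cv2

    cap = cv2.VideoCapture(video_path)
    try:
        # The reported frame count can be approximate for some codecs.
        num_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
        index = pick_frame_index(num_frames, seed)
        cap.set(cv2.CAP_PROP_POS_FRAMES, index)
        ok, frame = cap.read()
        if not ok:
            raise RuntimeError(f"could not read frame {index}")
        cv2.imwrite(out_path, frame)
        return index
    finally:
        cap.release()
```

Something like `save_random_frame("fan.mp4", "fan_frame.png")` would write a single frame to disk; passing a fixed `seed` makes the choice repeatable.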
To achieve better results in this example, I suggest you first check if the mask area in the segmentation result is correct. You might need a more powerful segmentor to handle certain examples, as this directly affects the areas where 3DGS and control points exist. Additionally, changing the distillation source might yield better results. However, I've previously tried multi-view generation networks like ImageDream, but their performance on the Consistent4D benchmark is not as good as Zero123xl.
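As a rough way to sanity-check the mask, here is a small sketch that measures how much of its bounding box the foreground covers. It is only a heuristic and not something from the codebase: the function name is hypothetical, and the mask is assumed to be a 2D grid (nested lists or an array) where nonzero means foreground. A filled disc covers about 0.785 (π/4) of its bounding box, so a fan mask with the gaps between blades correctly cut out should come in noticeably lower than that.

```python
def mask_fill_ratio(mask):
    """Fraction of the foreground bounding box that the mask covers.

    mask is a 2D sequence of rows; nonzero entries count as foreground.
    """
    # Rows and columns that contain at least one foreground pixel.
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        raise ValueError("mask has no foreground pixels")
    cols = [c for row in mask for c, v in enumerate(row) if v]

    top, bottom = rows[0], rows[-1]
    left, right = min(cols), max(cols)

    area = sum(1 for row in mask for v in row if v)
    box_area = (bottom - top + 1) * (right - left + 1)
    return area / box_area


# Toy example: a solid block fills its box, a plus shape does not.
solid = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
plus = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]
print(mask_fill_ratio(solid), mask_fill_ratio(plus))
```

If the ratio for the fan frames sits close to that of a solid disc, the segmentation has probably swallowed the gaps, and a different segmentation model or a manual mask fix would be worth trying before changing anything else.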
Also, I am getting very blurry results. I think I need to increase the number of points; can you suggest which parameters in your codebase would help me get better results?
Is this the example you showed above? Or could you provide the blurry results so I can better figure out the cause of the blurriness?