Question on Training Dynamic Objects with Camera Errors

donydchen / mvsplat

🌊 [ECCV'24 Oral] MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images

https://donydchen.github.io/mvsplat

MIT License


Question on Training Dynamic Objects with Camera Errors #68

Closed: whikwon closed this issue 1 week ago

whikwon commented 2 weeks ago

Thank you for sharing such a great project. I have a question about training on a custom dataset. My data are coronary angiograms, which are images of the heart vessels, and I generally have camera parameters recorded by the medical equipment (C-arm). However, these parameters contain slight measurement errors, and because the vessels deform and move, they have to be treated as dynamic objects. If I were to train MVSplat on this data, how significantly would the camera parameter errors affect the results?

I've looked into training methods for dynamic scenes, but due to the characteristics of medical images, feature matching does not work well, and generalization is difficult with a per-scene approach.

Are the camera parameters that are typically provided with public datasets, or computed from dense images with SfM, accurate enough?

Thank you in advance, and I look forward to your advice.

donydchen commented 1 week ago

Hi @whikwon, thanks for your interest in our work.

MVSplat is expected to work only on static regions, since it relies on a cost volume, which assumes epipolar regularisation. Hence, you might see blurry or inaccurate reconstructions of moving areas when applying it to dynamic scenes (the static background areas should be OK).
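To make the static-scene assumption concrete, here is a toy sketch (not MVSplat code) with two hypothetical pinhole cameras separated by a horizontal baseline, so that epipolar lines coincide with image rows. All numbers (focal length, baseline, point positions) are made up for illustration. A static point lands on the same row in both views, while a point that moves between the two exposures leaves its epipolar line, so no depth candidate along that line can match it.

```python
def project(point, cam_center, focal_px):
    """Pinhole projection for a camera with identity rotation at cam_center."""
    x, y, z = (p - c for p, c in zip(point, cam_center))
    return (focal_px * x / z, focal_px * y / z)


focal_px = 1000.0            # hypothetical focal length in pixels
cam_a = (0.0, 0.0, 0.0)      # reference camera
cam_b = (0.2, 0.0, 0.0)      # horizontal baseline: epipolar lines are image rows

static_pt = (0.0, 0.0, 2.0)  # point observed by both cameras
moved_pt = (0.0, 0.05, 2.0)  # same point after moving vertically between exposures

_, row_a = project(static_pt, cam_a, focal_px)
_, row_b_static = project(static_pt, cam_b, focal_px)
_, row_b_moved = project(moved_pt, cam_b, focal_px)

# Distance (in pixels) between the observation in view B and the epipolar line.
static_offset = abs(row_b_static - row_a)
moved_offset = abs(row_b_moved - row_a)

print(f"static point: {static_offset:.1f} px off the epipolar line")
print(f"moved point:  {moved_offset:.1f} px off the epipolar line")
```

A small pose error shifts the epipolar line in a similar way, which is why the size of the error relative to the image resolution matters.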

MVSplat should be robust enough to handle slight measurement errors. For the datasets we reported on, including RE10K, ACID and DTU, camera poses are provided. But those provided camera poses are not ground truth either; they are automatically extracted using SfM.

When applying MVSplat to other custom datasets, make sure that you have 1) correctly aligned the camera parameters with our code base and 2) correctly set the near and far values of the depth range. More detailed descriptions can be found at https://github.com/donydchen/mvsplat/issues/23#issuecomment-2085190160
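The two checks above can be sketched in plain Python. This is only an illustration under stated assumptions: it assumes the code base expects intrinsics normalized by image size and OpenCV-style camera-to-world extrinsics, which should be confirmed against the linked comment. The function names, the 512x512 image size, the focal length, and the pose are all hypothetical.

```python
def normalize_intrinsics(K, width, height):
    """Divide the first row of K by image width and the second row by height."""
    return [
        [K[0][0] / width, K[0][1] / width, K[0][2] / width],
        [K[1][0] / height, K[1][1] / height, K[1][2] / height],
        [K[2][0], K[2][1], K[2][2]],
    ]


def world_to_camera_to_c2w(R, t):
    """Invert a rigid world-to-camera pose [R | t] into a 4x4 camera-to-world matrix."""
    Rt = [[R[j][i] for j in range(3)] for i in range(3)]                   # R^T
    center = [-sum(Rt[i][k] * t[k] for k in range(3)) for i in range(3)]   # -R^T t
    return [Rt[i] + [center[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def depth_range_covers(points_world, R, t, near, far):
    """True if every point's camera-space depth (z) lies within [near, far]."""
    for p in points_world:
        z = sum(R[2][k] * p[k] for k in range(3)) + t[2]
        if not (near <= z <= far):
            return False
    return True


# Hypothetical pixel-space intrinsics for a 512x512 frame.
K = [[1000.0, 0.0, 256.0], [0.0, 1000.0, 256.0], [0.0, 0.0, 1.0]]
K_norm = normalize_intrinsics(K, width=512, height=512)

# Hypothetical world-to-camera pose: the world origin sits 2 units in front of the camera.
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 2.0]
c2w = world_to_camera_to_c2w(R, t)

# Check that a few scene points fall inside a candidate depth range.
scene_points = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.3)]
range_ok = depth_range_covers(scene_points, R, t, near=1.0, far=3.0)
```

If `depth_range_covers` returns `False` for known scene points, the near and far values are a likely cause of poor reconstructions and worth fixing before suspecting the pose accuracy.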

whikwon commented 1 week ago

@donydchen Thank you for the kind response. Upon reflection, I think the ghosting issue mentioned in your comment might arise because the measurement errors differ slightly from case to case. Since angiograms are mostly acquired with similar devices, it might be worth considering a method that learns the camera parameter offsets. Do you think it would be possible to use a simple MLP to learn these offsets?

donydchen commented 1 week ago

@whikwon, you can give it a try. However, since the errors come from the ground truth measurements, I am not sure whether unsupervised learning of the offsets could help. You might want to check DrivingForward, which compares against MVSplat but uses an additional Pose Network to learn some camera pose information; perhaps its network architecture can be a good reference for your case.
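For anyone who wants to try the offset idea, a minimal PyTorch sketch follows. It is not part of MVSplat and not DrivingForward's Pose Network; the class name `PoseOffsetMLP`, the 12-value input, and the hidden size are hypothetical choices. The module predicts a small rigid correction (axis-angle rotation plus translation) for each recorded camera-to-world pose, and its last layer is zero-initialised so that training starts from the recorded poses.

```python
import torch
import torch.nn as nn


class PoseOffsetMLP(nn.Module):
    """Hypothetical module: predicts a small rigid correction for each camera."""

    def __init__(self, hidden_dim: int = 64):
        super().__init__()
        # Input: flattened top 3x4 block of the recorded camera-to-world matrix.
        self.net = nn.Sequential(
            nn.Linear(12, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 6),  # 3 axis-angle + 3 translation values
        )
        # Zero-init the last layer so the initial correction is the identity.
        nn.init.zeros_(self.net[-1].weight)
        nn.init.zeros_(self.net[-1].bias)

    def forward(self, c2w: torch.Tensor) -> torch.Tensor:
        # c2w: (N, 4, 4) recorded poses; returns corrected (N, 4, 4) poses.
        n = c2w.shape[0]
        delta = self.net(c2w[:, :3, :4].reshape(n, 12))
        rotvec, trans = delta[:, :3], delta[:, 3:]
        # Build skew-symmetric matrices and exponentiate them into rotations.
        skew = torch.zeros(n, 3, 3, dtype=c2w.dtype, device=c2w.device)
        skew[:, 0, 1], skew[:, 0, 2] = -rotvec[:, 2], rotvec[:, 1]
        skew[:, 1, 0], skew[:, 1, 2] = rotvec[:, 2], -rotvec[:, 0]
        skew[:, 2, 0], skew[:, 2, 1] = -rotvec[:, 1], rotvec[:, 0]
        offset = torch.eye(4, dtype=c2w.dtype, device=c2w.device).repeat(n, 1, 1)
        offset[:, :3, :3] = torch.matrix_exp(skew)
        offset[:, :3, 3] = trans
        return offset @ c2w


recorded = torch.eye(4).repeat(2, 1, 1)  # two placeholder camera poses
model = PoseOffsetMLP()
corrected = model(recorded)              # equals `recorded` before any training
```

In such a setup the corrected poses would replace the recorded extrinsics during training, with gradients from the rendering loss updating the MLP. As noted above, there is no guarantee that this unsupervised signal recovers the true offsets, especially when vessel motion and pose error produce similar artifacts.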

whikwon commented 1 week ago

@donydchen Thank you so much for the advice. I'll give it a try and share the results if they turn out to be meaningful.