autonomousvision / differentiable_volumetric_rendering

This repository contains the code for the CVPR 2020 paper "Differentiable Volumetric Rendering: Learning Implicit 3D Representations without 3D Supervision".
http://www.cvlibs.net/publications/Niemeyer2020CVPR.pdf
MIT License

reconstruction from natural images #37

Closed: athena913 closed this issue 4 years ago

athena913 commented 4 years ago

Hello, thank you for making your code available for public use. I tried your pretrained combined model on some online images from the car and chair categories (I picked these categories since they are included in the demo). Each image contains a single object on a white background, as described on the GitHub page. I have attached some of the test images and the corresponding reconstructed 3D output. Is there any other image preprocessing that could improve the visual quality and level of detail of the output, or is this the expected output of the pretrained model? test.zip

thank you

m-niemeyer commented 4 years ago

Hi @athena913 ,

thanks a lot for your message and interest in the project! I agree that testing on real data is very interesting, but in this project we have not evaluated our model on "out-of-distribution" data like yours. In general, making the input images as similar as possible to the training data should improve the results. For example, the cars are not centered in your images, so adding appropriate padding to center the object could help (see the sketch below). However, a drastic improvement on this kind of data can only be achieved by re-training the model on different images and/or applying data augmentation to the input images to make the model more robust to new data.
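A minimal preprocessing sketch (not part of this repository) that implements the centering advice above: it crops the object out of a white-background photo and pastes it onto the center of a white square canvas. The near-white threshold, the margin, and the 224x224 output size are illustrative assumptions, not values taken from the paper or the codebase.

```python
# Hedged sketch: center an object from a white-background photo on a
# white square canvas, roughly mimicking ShapeNet-style renderings.
import numpy as np
from PIL import Image

def center_on_white(path, out_size=224, margin=0.1):
    img = np.asarray(Image.open(path).convert("RGB"))
    # Treat near-white pixels as background; everything else is the object.
    mask = (img < 250).any(axis=-1)
    if not mask.any():
        raise ValueError("No non-white pixels found in %s" % path)
    ys, xs = np.where(mask)
    crop = img[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    # Paste the crop into the center of a white square canvas with a margin.
    side = int(max(crop.shape[:2]) * (1 + 2 * margin))
    canvas = np.full((side, side, 3), 255, dtype=np.uint8)
    oy = (side - crop.shape[0]) // 2
    ox = (side - crop.shape[1]) // 2
    canvas[oy:oy + crop.shape[0], ox:ox + crop.shape[1]] = crop
    return Image.fromarray(canvas).resize((out_size, out_size), Image.LANCZOS)

# Hypothetical filenames, for illustration only.
center_on_white("car.jpg").save("car_centered.png")
```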

Good luck with your research!

Best, Michael

albertotono commented 3 years ago

Thanks @m-niemeyer for the explanation. In this case you trained the model without 3D supervision, so the training data consist only of images and 2.5D images (depth maps), correct? So basically we would have to retrain the model on such data, and it would only work on similar images. Could you recommend a project that does a very good job at 3D reconstruction, even if only on in-distribution data?

For example, with Occupancy Networks, can we optimize the model to perfectly fit the 3D models used during training, so that when we provide different pictures of one of those models, it will be recognized as the model seen during training? Is that possible?

m-niemeyer commented 3 years ago

Hmm, I am not entirely sure I understand your question correctly.

If you have multiple images of the same object, you can also train a single model on exactly these images; then you will get a reconstruction of this object only. What you have looked at so far are models that are trained on a large dataset of cars, chairs, etc.; at test time you provide new images, and the model predicts the 3D geometry. To get a good prediction, your test images should be as similar as possible to the training images.

What could further improve results is to train category-specific models: you train one model on a large dataset of chairs, and at test time you provide images of chairs to this model. You then have a stronger bias towards chair-like 3D geometry (a hedged config sketch follows below).
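For reference, a sketch of what category-specific training could look like in this codebase. The YAML field names (`inherit_from`, `data.classes`, `training.out_dir`) are assumptions following the config convention of the related Occupancy Networks code; verify them against this repo's `configs/default.yaml`. `03001627` is the ShapeNet synset id for chairs.

```yaml
# Hypothetical category-specific config, e.g. configs/my_chairs.yaml.
# Field names assumed from the Occupancy Networks config convention.
inherit_from: configs/default.yaml
data:
  classes: ['03001627']   # restrict training to the chair category
training:
  out_dir: out/my_chairs
```

Training would then be launched with the repo's train script, e.g. `python train.py configs/my_chairs.yaml`.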

Good luck!