alibaba-yuanjing-aigclab / GeoLRM

[NeurIPS 2024] Geometry-Aware Large Reconstruction Model for Efficient and High-Quality 3D Generation
Apache License 2.0

Implementation question? #4

Open krNeko9t opened 1 month ago

krNeko9t commented 1 month ago

Thanks for sharing your nice work! I noticed that in geolrm_wrapper.py, the serializer and lrm_generator each have their own dedicated image encoder. Is it possible to share a single encoder between them? Also, I don't understand why the encoder is NOT frozen. Does it have to be optimized together with GeoLRM?

LinShan-Bin commented 1 week ago

Thanks for your comment!

  1. We tried sharing the encoder but found that it hurts performance. The proposal transformer focuses on recovering coarse geometry, whereas the reconstruction transformer needs to retrieve fine-grained details, so the two stages need different image features.
  2. Our experiments show better performance when the image encoder is not frozen. Our view is that DINOv2 is not explicitly trained on 3D data and therefore requires further fine-tuning.
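
For readers following along, the two choices discussed above (separate vs. shared encoder, trainable vs. frozen encoder) can be illustrated with a toy PyTorch sketch. This is not the code in geolrm_wrapper.py: `make_encoder`, `TwoStageModel`, `count_trainable`, and the layer sizes are all hypothetical stand-ins, and the small MLP only takes the place of a pretrained encoder such as DINOv2.

```python
import torch
from torch import nn


def make_encoder() -> nn.Module:
    # Hypothetical tiny stand-in for a pretrained image encoder (e.g. DINOv2).
    return nn.Sequential(nn.Linear(8, 16), nn.GELU(), nn.Linear(16, 16))


class TwoStageModel(nn.Module):
    """Toy two-stage model; names are illustrative, not GeoLRM's actual classes."""

    def __init__(self, share_encoder: bool = False, freeze_encoder: bool = False):
        super().__init__()
        self.proposal_encoder = make_encoder()
        # Sharing reuses the same module object, so both stages would
        # read from and update a single set of encoder weights.
        self.recon_encoder = self.proposal_encoder if share_encoder else make_encoder()
        self.proposal_head = nn.Linear(16, 4)  # stand-in for the proposal transformer
        self.recon_head = nn.Linear(16, 4)     # stand-in for the reconstruction transformer
        if freeze_encoder:
            # Frozen encoders receive no gradient updates during training.
            for enc in (self.proposal_encoder, self.recon_encoder):
                enc.requires_grad_(False)

    def forward(self, x: torch.Tensor):
        coarse = self.proposal_head(self.proposal_encoder(x))
        fine = self.recon_head(self.recon_encoder(x))
        return coarse, fine


def count_trainable(model: nn.Module) -> int:
    # parameters() yields each shared parameter only once.
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


if __name__ == "__main__":
    for share in (False, True):
        for freeze in (False, True):
            model = TwoStageModel(share_encoder=share, freeze_encoder=freeze)
            print(f"share={share} freeze={freeze} trainable={count_trainable(model)}")
```

The default configuration (`share_encoder=False`, `freeze_encoder=False`) corresponds to the setup described in the reply: each stage fine-tunes its own encoder, at the cost of roughly doubling the trainable encoder parameters.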