qizekun / ShapeLLM

[ECCV 2024] ShapeLLM: Universal 3D Object Understanding for Embodied Interaction
https://qizekun.github.io/shapellm/
Apache License 2.0

Reproducing ReCon++ #17

Closed · atharva-loci closed this issue 2 months ago

atharva-loci commented 2 months ago

Hi, thanks for sharing the great paper and code!

I'm trying to reproduce the ReCon++ Base results using your code and the data from OpenShape, but I can't seem to surpass ~47% zero-shot top-1 accuracy on Objaverse-LVIS, whereas the paper reports 53.2% for the base model.

I'm using the config file at ReConV2/cfgs/pretrain/base/openshape.yaml. Is this the exact same config used for the best_lvis checkpoint? Specifically, is stop_grad=True correct? The appendix of the paper says: "Additionally, to enhance global classification and retrieval capabilities, we backpropagate gradients from the global branch to the local branch in open vocabulary zero-shot experiments, as demonstrated in the ablation experiments in Tab. 15", which implies stop_grad should be set to False for these training runs. Are there any other major differences? Is there a reconstruct checkpoint available?
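For context, here is a minimal sketch of what I understand stop_grad to control; the module and tensor names below are illustrative stand-ins, not the actual ReConV2 code:

```python
import torch
import torch.nn as nn

# Minimal sketch of what a stop_grad flag typically gates in a two-branch
# encoder like ReCon++ (names here are placeholders, not the real modules).
class TwoBranchEncoder(nn.Module):
    def __init__(self, dim: int = 384, stop_grad: bool = True):
        super().__init__()
        self.local_branch = nn.Linear(dim, dim)  # stand-in for the local branch
        self.global_head = nn.Linear(dim, dim)   # stand-in for the global branch/head
        self.stop_grad = stop_grad

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        local_feat = self.local_branch(tokens)
        if self.stop_grad:
            # stop_grad=True: the global (zero-shot) loss does not update the local branch
            local_feat = local_feat.detach()
        # stop_grad=False: gradients from the global branch flow back into the
        # local branch, which is what the appendix describes for the zero-shot runs
        return self.global_head(local_feat)
```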

Any help would be appreciated - thanks!

qizekun commented 2 months ago

Hi!

Thanks for your interest in our project. We use stop_grad=False in the zero-shot training and fine-tuning stages; it does show better performance in some cases.
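Concretely, that amounts to flipping the flag in ReConV2/cfgs/pretrain/base/openshape.yaml before launching pretraining. As a rough sketch if you want to script it (the exact key path below is an assumption, adjust it to the actual config layout):

```python
# Rough override sketch: load the pretraining config and disable stop_grad.
# Where exactly the flag sits inside openshape.yaml is assumed here.
import yaml

cfg_path = "ReConV2/cfgs/pretrain/base/openshape.yaml"
with open(cfg_path) as f:
    cfg = yaml.safe_load(f)

cfg["model"]["stop_grad"] = False  # assumed nesting of the flag

with open(cfg_path, "w") as f:
    yaml.safe_dump(cfg, f)  # note: rewriting the file drops YAML comments
```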