nickgkan / butd_detr

Code for the ECCV22 paper "Bottom Up Top Down Detection Transformers for Language Grounding in Images and Point Clouds"

Regarding pre-training details #10

Closed soham-joshi closed 1 year ago

soham-joshi commented 1 year ago

Hi @nickgkan , great work by the team!

I have a question about pre-training the 3D architecture. For the 2D BUTD-DETR architecture, the pre-training details are specified in the paper. I want to experiment with the pre-training procedure for BUTD-DETR 3D. Could you share the details (e.g., dataset, scripts) for the pre-training, please?

Thanks!

ayushjain1144 commented 1 year ago

Hi,

We do not do any pre-training in our 3D experiments. We directly train on SR3D/NR3D/ScanRefer.

soham-joshi commented 1 year ago

Okay, thanks for the response @ayushjain1144!

Moreover, it would be helpful if you could share the scripts for the experiments in Table 6, Table 7, and Table 8. I want to re-run those experiments with the exact configuration presented in the paper.

Thank you!

ayushjain1144 commented 1 year ago

I think you can use `sh scripts/train_test_cls.sh` to reproduce Table 6 and Table 7. Before training, I would recommend evaluating the provided checkpoints to check that everything is set up properly. You can refer to the "Usage" section of the README to see how to modify this script to train on the different datasets (SR3D/NR3D) and evaluate the checkpoints.

Feel free to let us know if something is unclear or if you face any issues.

soham-joshi commented 1 year ago

Yeah, okay, will do that. Actually, I wanted some details on `--max_epochs` (the default is 400; is the same value used for the experiments?) and `--lr_decay_epochs`. Thanks!

ayushjain1144 commented 1 year ago

I see. Sorry, we don't remember those exact configurations. I think you should keep `--max_epochs` at a high value (which 400 already is), decrease the learning rate when the accuracy stops improving, and ultimately early-stop.
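A minimal sketch of that schedule, not taken from this repo: it assumes a generic PyTorch training loop where `train_one_epoch` and `evaluate` are placeholder callables you supply, and it decays the learning rate once accuracy plateaus before early-stopping.

```python
# Sketch only: plateau-based LR decay plus early stopping.
# `train_one_epoch` and `evaluate` are hypothetical helpers, not repo code.
import torch


def fit(model, optimizer, train_one_epoch, evaluate,
        max_epochs=400, patience=20, lr_decay_factor=0.1):
    best_acc, epochs_since_best = 0.0, 0
    for epoch in range(max_epochs):
        train_one_epoch(model, optimizer)
        acc = evaluate(model)
        if acc > best_acc:
            best_acc, epochs_since_best = acc, 0
        else:
            epochs_since_best += 1
            # Decay the LR once accuracy has stopped improving for a while.
            if epochs_since_best == patience // 2:
                for group in optimizer.param_groups:
                    group["lr"] *= lr_decay_factor
            # Early-stop after `patience` epochs without improvement.
            if epochs_since_best >= patience:
                break
    return best_acc
```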

For SR3D in the GT setup, we think you would need to decrease the learning rate after the 25th epoch. For NR3D in the DET setup, training used to take around 80 epochs (more details in the README), so it might be in a similar range for the GT setup (Table 6/7). I think we also store `start_epoch` in the provided checkpoint dictionaries, so that can give you a hint too (`start_epoch` denotes how many epochs the model has already been trained for).
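If it helps, here is a quick way to inspect that value. This assumes the provided checkpoint is a plain `torch.save`d dictionary containing a `start_epoch` key, as described above; the path is a placeholder and the exact keys may differ.

```python
# Inspect the epoch count stored in a provided checkpoint (path is hypothetical).
import torch

ckpt = torch.load("path/to/provided_checkpoint.pth", map_location="cpu")
print(ckpt.get("start_epoch"))  # epochs the checkpointed model was trained for
```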

soham-joshi commented 1 year ago

Okay, will do that. Thanks for the prompt responses!