erobic / ramen

This is a PyTorch implementation of our Recurrent Aggregation of Multimodal Embeddings Network (RAMEN) from our CVPR 2019 paper.

Reproducing baseline results #4

Open zfying opened 2 years ago

zfying commented 2 years ago

I'm having some trouble reproducing the baseline results with RN and UpDn on CLEVR. Could you share the scripts for reproducing the baseline results (RN, UpDn, etc.) with the precise hyperparameters used (batch size, learning rate, etc.)?

erobic commented 2 years ago

I uploaded the original scripts for RN and UpDn to: https://drive.google.com/drive/folders/1-lD4wDWNSh3n1DsSLA8zAPNB0US7jg8U?usp=sharing

Note that these are no longer well documented or maintained, but hopefully they will help you reproduce the results.

For the Relation Network on CLEVR, this is the script that specifies the hyperparameters (see rn_CLEVR.sh):

CUDA_VISIBLE_DEVICES=0 python -u train_rn.py --root $ROOT --data_set $DATA_SET --model 'original-fbuf' \
--batch-size 128 \
--test-batch-size 128 \
--num_objects 15 \
--lr 5e-6 \
--lr-step 10 \
--lr-gamma 2 \
--lr-max 0.0005 \
--epochs 63 \
--invert-questions \
--rl_in_size 5120 \
--feature_subdir faster-rcnn \
--expt_name expt_CLEVR
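
To clarify how the lr flags interact (this is my reading of them, not a verbatim excerpt from train_rn.py): the learning rate starts at --lr and is multiplied by --lr-gamma every --lr-step epochs, capped at --lr-max. A minimal PyTorch sketch with a placeholder model and optimizer would be:

import torch

# Placeholder model and optimizer choice; the actual RN model/optimizer in train_rn.py differ.
model = torch.nn.Linear(5120, 28)
optimizer = torch.optim.Adam(model.parameters(), lr=5e-6)

def adjust_lr(optimizer, epoch, base_lr=5e-6, step=10, gamma=2.0, lr_max=5e-4):
    # Multiply the base lr by gamma every `step` epochs, capped at lr_max.
    lr = min(base_lr * (gamma ** (epoch // step)), lr_max)
    for group in optimizer.param_groups:
        group['lr'] = lr
    return lr

for epoch in range(63):
    lr = adjust_lr(optimizer, epoch)
    # ... one epoch of training would go here ...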

For UpDn on CLEVR, I had used the following script (see butd_CLEVR.sh). The learning rate was not tuned, i.e., the Adamax optimizer with its default lr (2e-3) was used.

CUDA_VISIBLE_DEVICES=0 python -u butd_vqa.py --root $ROOT \
--dataset $DATASET \
--spatial_feature_type mesh \
--spatial_feature_length 16 \
--batch_size 64 \
--num_objects 15 \
--h5_prefix use_split \
--expt_name expt_${DATASET}_new \
--feature_subdir faster-rcnn \
--resume latest
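
Since the script above doesn't pass a learning rate, the optimizer is just constructed with PyTorch's defaults. That amounts to something like the following sketch (the model here is only a placeholder, not the UpDn network built in butd_vqa.py):

import torch

# Placeholder module standing in for the UpDn/BUTD network.
model = torch.nn.Linear(2048, 3000)

# Adamax with PyTorch's default hyperparameters; the default lr is 2e-3.
optimizer = torch.optim.Adamax(model.parameters())
print(optimizer.defaults['lr'])  # 0.002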