fredshentu closed 4 months ago
Thanks for your interest! We used 6 A100s and a batch size of 1080 for our experiments on CALVIN. This link includes the TensorBoard training logs and the text file of test logs (eval_logs_64999/result.txt). The text file shows the number of completed tasks for each instruction chain.
We didn't spend much effort tuning the hyperparameters for our experiments on CALVIN. I would recommend tuning the other hyperparameters (e.g. learning rate, weight decay) when using a smaller batch size.
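As a starting point for that tuning, one common heuristic (not something we verified on CALVIN) is to scale the learning rate with the batch size, either linearly or by the square root of the ratio. The sketch below assumes the 1080 figure is the total batch size across all GPUs. The base learning rate and the helper name `scale_learning_rate` are hypothetical placeholders, so substitute the value from your own training script.

```python
def scale_learning_rate(base_lr, base_batch_size, new_batch_size, rule="linear"):
    """Rescale a learning rate for a different batch size.

    "linear" multiplies by the batch-size ratio; "sqrt" multiplies by its
    square root. Both are rules of thumb, not guarantees.
    """
    if base_batch_size <= 0 or new_batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    ratio = new_batch_size / base_batch_size
    if rule == "linear":
        return base_lr * ratio
    if rule == "sqrt":
        return base_lr * ratio ** 0.5
    raise ValueError(f"unknown rule: {rule!r}")


# Batch size quoted in the reply above (assumed to be the total batch size).
REFERENCE_BATCH_SIZE = 1080
# Hypothetical placeholder; replace with the learning rate in your config.
EXAMPLE_BASE_LR = 1e-4

if __name__ == "__main__":
    # Print candidate learning rates for a few smaller batch sizes.
    for batch_size in (540, 360, 270):
        linear = scale_learning_rate(EXAMPLE_BASE_LR, REFERENCE_BATCH_SIZE, batch_size)
        sqrt = scale_learning_rate(EXAMPLE_BASE_LR, REFERENCE_BATCH_SIZE, batch_size, rule="sqrt")
        print(f"batch={batch_size}: linear={linear:.2e}, sqrt={sqrt:.2e}")
```

Treat the two results as the ends of a range to sweep rather than as final values, and re-check weight decay and warmup length at the same time.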
Hi, thanks for your excellent work!
I am trying to reproduce the CALVIN training results on a two-GPU machine with a slightly smaller batch size (using the train_trajectory_calvin.sh script). However, I cannot get the same final eval performance as the model weights you provided. Could you also share the training log? It would be very helpful. Thanks!