Closed · h-ccc closed this issue 1 year ago
OK, we'll check. This is possible, as prefix tuning is sensitive to hyperparameters.
Thanks a lot in advance!!
@h-ccc Hi, thank you for your attention to and support of our work. If you are training on a single machine with a single card, you can expand the effective batch size to 256/512 through update-freq, aligning with the batch size in the GitHub script.
Thank you for your reply! Should I set batch-size to 256/512? That leads to an "out of memory" issue. For the batch-size and update-freq hyperparameters, I followed the settings in train_refcoco_prefix.sh. Concretely, I tried running the base model with a batch-size of 8 on two 2080 Ti GPUs, or with a batch-size of 16 on a single A100. update-freq is always set to 8.
@h-ccc Hi, you can expand the effective batch size by using update-freq (gradient accumulation: multiple forward/backward passes, then one optimizer step).
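To make the arithmetic concrete, here is a minimal sketch of how update-freq multiplies the per-GPU batch into the batch the optimizer actually sees. The variable names are illustrative assumptions, not the exact flags from train_refcoco_prefix.sh:

```shell
# Hypothetical fragment of a fairseq-style launch script.
# The optimizer-level batch is: per-GPU batch x update-freq x number of GPUs.
batch_size=16      # per-GPU batch that fits in memory (e.g. one A100)
update_freq=8      # gradient-accumulation steps before each optimizer update
num_gpus=2         # number of data-parallel workers

effective=$((batch_size * update_freq * num_gpus))
echo "effective batch size: ${effective}"
```

So with batch-size 16 and update-freq 8 on two cards you already reach 256; raising update-freq (rather than batch-size) is how you hit 256/512 without running out of memory.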
I will try it. Thanks a lot for answering!
Hi @h-ccc, I have a similar problem to yours. I tried to use prefix tuning to train OFA on a single 4090 GPU, but my training did not converge: the loss dropped to about 8.4 and would not decline further, and the grounding accuracy was only about 6%. I also tried your parameter settings, but it still doesn't work. Could you please send me your complete training configuration .sh file? zarath_xuany@163.com. Thank you very much!
It took a long time to train OFA with prefix tuning (several days on a single A100). The hyperparameters I used are shown above, copied directly from my .sh file. You might also check your checkpoint file (ofa_base.pt) and your dataset; I don't think 6% accuracy can be attributed solely to the hyperparameters.
@h-ccc, thank you for your reply. I downloaded ofa_base.pt and the dataset files (.tsv) from the links provided by OFA, so I don't think there is a mistake there. I must have gotten some detail wrong, which is why training didn't converge. Could you please send me your .sh file so I can try it on a single GPU and narrow down the problem? Thanks again!
Thank you very much for your outstanding work; I was very inspired by OFA. When I tried to reproduce the prefix-tuning results on Visual Grounding, I ran into a performance gap. For example, on RefCOCO+ my reproduction gives 75.17/80.61/65.94 (vs. 76.34/81.44/67.68 reported in "Prompt Tuning for Generative Multimodal Pretrained Models"). The following are the hyperparameters I set according to train_refcoco_prefix.sh. Could you provide the corresponding checkpoints or help me point out my mistakes? Thank you so much!