Open JunseokLee42 opened 8 months ago
Thank you for sharing the paper and code. While reading the Experimental Settings in Section 5.2 (Implementation), I have a question about the fine-tuning time.

I am trying to understand the paper and code for re-implementation. However, due to limited computing resources (no multi-GPU setup), I have to use cloud services. This means I need to estimate the approximate fine-tuning time, since cloud providers charge by the hour.

Could you please let me know the approximate fine-tuning time for Multimodal-CoT, if you remember?

Hi, it takes roughly 8 hours to train the base model and 24 hours to train the large model on a single A100 GPU. The exact time may also depend on the specific GPU. Since the training was done a long time ago, I cannot guarantee I remember it accurately. An efficient way to check would be to run the code; the log will show the approximate fine-tuning time.
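For budgeting, the reported per-model training times can be turned into a rough cost estimate. This is a minimal sketch: the hourly rate below is a placeholder assumption, not an actual provider price, and the 8/24-hour figures are the approximate times mentioned in this thread.

```python
import math

# Assumed A100 on-demand price per hour (placeholder; varies by cloud provider).
HOURLY_RATE_USD = 2.0

def estimated_cost(hours: float, rate: float = HOURLY_RATE_USD) -> float:
    """Estimate cost in USD, rounding up since providers often bill per started hour."""
    billed_hours = math.ceil(hours)
    return billed_hours * rate

# ~8 h for the base model, ~24 h for the large model (figures from this thread).
for model, hours in {"base": 8, "large": 24}.items():
    print(f"{model}: ~{hours} h -> ~${estimated_cost(hours):.2f}")
```

Scaling the hours by a safety margin (e.g. 1.2x) is a sensible precaution, since actual runtime can vary with the GPU, batch size, and I/O speed.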