amazon-science / mm-cot

Official implementation for "Multimodal Chain-of-Thought Reasoning in Language Models" (stay tuned and more will be updated)
https://arxiv.org/abs/2302.00923
Apache License 2.0

Question on fine-tuning time #72

Open JunseokLee42 opened 8 months ago

JunseokLee42 commented 8 months ago

Thank you for sharing the paper and code. While reading the Experimental Settings in Section 5.2 (Implementation), I had a question about the fine-tuning time.

Could you please let me know the approximate fine-tuning time for Multimodal-CoT, if you remember it?

I am trying to understand the paper and code for re-implementation. However, due to limited computing resources (no multi-GPU setup), I have to use cloud services. This has led me to estimate the approximate fine-tuning time, since cloud providers charge by the hour.

cooelf commented 4 months ago

Hi, it may take roughly 8 hours to train the base model and roughly 24 hours for the large model on an A100 GPU. This may also vary with the exact GPU. As it has been a long time since the training, I cannot guarantee that I remember it accurately. An efficient way to check would be to run the code; the log will show the approximate fine-tuning time.
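
If you want an estimate without paying for a full run, here is a minimal sketch of the extrapolation idea: time a handful of training steps and scale up to the full epoch count. The `train_step` callable and its wiring are hypothetical placeholders, not part of the mm-cot codebase; adapt them to your actual training loop.

```python
import time

def estimate_epoch_hours(train_step, dataloader, warmup_steps=5, timed_steps=20):
    """Extrapolate one epoch's wall-clock time from a few training steps.

    train_step: hypothetical callable that runs forward/backward/optimizer
    on a single batch (adapt to your own training loop).
    """
    it = iter(dataloader)
    # Run a few untimed steps first so CUDA kernels and data workers warm up.
    for _ in range(warmup_steps):
        train_step(next(it))
    start = time.perf_counter()
    for _ in range(timed_steps):
        train_step(next(it))
    per_step = (time.perf_counter() - start) / timed_steps
    # Scale the per-step time to the full epoch, converted to hours.
    return per_step * len(dataloader) / 3600.0

# Usage (names are illustrative):
# hours_per_epoch = estimate_epoch_hours(step_fn, train_loader)
# total_hours ≈ hours_per_epoch * num_epochs
```

For budgeting, multiply the estimated total hours by your provider's hourly A100 rate; for example, at a hypothetical $3/hour, a 24-hour large-model run would come to about $72.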