Closed suyashkumar2409 closed 1 year ago
Hi, pretraining takes about 2-3 days, and fine-tuning takes about 4 hours to 1 day, depending on the task. Hope this helps.
Thank you for responding!
In the docs I saw that there is a main model and a lite model, and that the results between the two weren't significantly different. How much training does the lite model take?
The speed difference is not very large, but the lite model requires less GPU memory.
Hi @Walter0807, does fine-tuning the lite model require 8 V100s, or can it be done with just one V100?
Thanks
No, fine-tuning does not require 8 cards; it also depends on your task.
Hi! I am an AI researcher at Georgia Tech, and we have decided to replicate your results and develop them further. We are currently estimating the feasibility of this endeavour, given our limited time and computational resources, and were wondering whether you could tell us what computational resources, time, and cost it would take to train the pre-trained model from scratch.
While the paper does mention the use of 8 V100 GPUs, the training time is not stated, so we can't calculate the cost involved either.
Depending on the answer, we want to decide whether to extend your work from a pretraining or a fine-tuning point of view.
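For a ballpark figure, the cost works out to GPU count × wall-clock hours × hourly rate. A minimal sketch using the numbers mentioned in this thread (8 V100s, 2-3 days of pretraining); the $3/V100-hour rate is an illustrative assumption, not from the paper, so substitute your provider's actual pricing:

```python
# Rough GPU-cost estimate for pretraining.
# Assumption: USD 3.00 per V100-hour (placeholder on-demand cloud rate).

def gpu_cost(num_gpus: int, days: float, usd_per_gpu_hour: float) -> float:
    """Total USD cost for num_gpus running continuously for `days` days."""
    return num_gpus * days * 24 * usd_per_gpu_hour

ASSUMED_RATE = 3.0  # USD per V100-hour -- check your cloud provider

low = gpu_cost(8, 2, ASSUMED_RATE)   # 2-day lower bound -> 1152.0
high = gpu_cost(8, 3, ASSUMED_RATE)  # 3-day upper bound -> 1728.0
print(f"Estimated pretraining cost: ${low:.0f}-${high:.0f}")
```

At that assumed rate, pretraining from scratch would land roughly in the low thousands of dollars, which may help when weighing pretraining against fine-tuning.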