Closed: mamscience closed this issue 2 months ago
If it is helpful: I am currently training with a batch size of 32 on a dataset of 500_000 rows.
Hardware: 1x RTX A2000, 9 vCPUs, 35 GB RAM.
GPU utilization is around 5% and RAM usage around 20%. I'm currently 4 hours into training, and the tests I'm running look good.
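For reference, the batch size and dataset size above imply the number of optimizer steps per epoch via simple arithmetic. A minimal back-of-the-envelope sketch (values taken from the comment; the variable names are illustrative, not from any actual training script):

```python
import math

# Values from the comment above
dataset_rows = 500_000
batch_size = 32

# Steps per epoch, rounding up for a final partial batch
steps_per_epoch = math.ceil(dataset_rows / batch_size)
print(steps_per_epoch)  # 15625
```

So each epoch corresponds to 15,625 gradient steps at this batch size.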
Always insightful and good to know! Thanks
@mamscience have a look at the colab notebook here for fine-tuning: https://colab.research.google.com/drive/1uvTmh-pe1zO5TeaaRVDdoEWJ5dFDI-pA?usp=sharing
Apologies for the delay and thanks @mamscience for the comment. It took 22 hours to pretrain the current model.
Some details are mentioned in the paper on page 6, quoted verbatim below:
For all the models trained in this paper, we use a single Nvidia Tesla-P100 GPU with 12 GB of memory, 4 CPU cores, and 24 GB of RAM
Congrats on this foundation model.
Your publication doesn't mention anything about training time, hardware requirements, or costs. Could you elaborate on this?
Best regards