WorldHellooo opened 1 year ago
Thanks for making your work public! How many computing resources were used for training and retrieval when you trained the GPT-125M model?

I am curious as well.

We used two 8×A100 40G servers for training: one as an index server and one as a training server. If you want to train a larger model, one index server is still sufficient; just increase the number of training servers.