Open sausage-333 opened 8 months ago
Thanks for your great work.
I'd like to know how long pre-training takes for 0.3B models. Could you share your experience with the cost of pre-training BEST-RQ (batch size, GPU type, number of GPUs, training time, etc.)?