You mentioned that training took 48 hours to reach convergence at 400K steps, but I can't find the specs of your training machine. Was that on a GPU or a CPU, and how large was the machine?
I am training your model on Arabic CoNLL, and it took about 11 hours to complete only 7,600 steps. Simple math shows that at this rate I would need about 579 hours (roughly 24 days) to reach 400K steps! I am running on a machine with 8 vCPUs, and I wanted to check with you before questioning my training setup.
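
For reference, here is the back-of-the-envelope calculation behind my estimate. It is only a rough sketch: it assumes the steps-per-hour rate stays constant for the whole run, and the function name `projected_hours` is just something I made up for illustration, not part of your code.

```python
def projected_hours(elapsed_hours, steps_done, target_steps):
    """Linearly extrapolate total training time.

    Assumes the throughput observed so far (steps per hour) holds
    for the rest of the run, which may not be true in practice.
    """
    return elapsed_hours / steps_done * target_steps


# Numbers from my run: 11 hours for 7,600 steps, target of 400K steps.
my_estimate = projected_hours(11, 7600, 400_000)

# Compare against the 48 hours you reported for the same step count.
slowdown = my_estimate / 48

print(f"Projected: {my_estimate:.0f} hours (~{my_estimate / 24:.0f} days)")
print(f"Slowdown vs. reported 48 hours: {slowdown:.1f}x")
```

So my run appears to be roughly an order of magnitude slower than yours, which is why I suspect a hardware difference rather than a problem with my data.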