I have generated pretraining data by following the steps in this repo. I am doing this for Hindi with 22 GB of data. Generating the pretraining data alone took one month!
Each tf.record file has an associated `meta_data` file. Because `run_pretraining.py` requires a single `meta_data` file, I summed the `train_data_size` values from all of the `meta_data` files into one. My final `meta_data` file looks something like this:
The number of training steps is calculated as follows:
num_train_steps = int(total_train_examples / train_batch_size) * num_train_epochs
So `total_train_examples` is 596972848, which gives `num_train_steps` = 9327700 with a batch size of 64 and only 1 epoch. The README here uses `num_train_steps=125000`, and I do not understand what went wrong.
With that many training steps, it will take forever to train ALBERT. Even if I increase the batch size to 512 with only 1 epoch, the number of training steps is 1165962, which is still huge! Since ALBERT was trained on a very large dataset, why are there only 125000 steps? I would also like to know how many epochs were used when training ALBERT for English.
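To make the arithmetic reproducible, here is a minimal sketch that applies the formula above to my numbers. The function name `compute_num_train_steps` is just a hypothetical helper for illustration and is not taken from `run_pretraining.py`; only the formula and the input values come from this post.

```python
def compute_num_train_steps(total_train_examples, train_batch_size, num_train_epochs=1):
    """Apply the step formula quoted above (hypothetical helper, not from the repo)."""
    # int() truncates the fractional part, so a trailing partial batch is dropped.
    return int(total_train_examples / train_batch_size) * num_train_epochs


# Sum of train_data_size over all of my meta_data files.
total_train_examples = 596972848

for batch_size in (64, 512):
    steps = compute_num_train_steps(total_train_examples, batch_size)
    print(f"batch_size={batch_size}: num_train_steps={steps}")
```

Both values match the ones I reported, so the step counts follow directly from the example count and the batch size.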
Can anyone suggest what went wrong and what I should do now?