allenai / bilm-tf

Tensorflow implementation of contextualized word representations from bi-directional language models

incremental training on less than 2000 sentences #234

Closed mchari closed 4 years ago

mchari commented 4 years ago

To incrementally train ELMo, I have training data of around 1200 sentences, all in one file. I am using num_epochs = 5; all the other settings are the same as in the options.json that I downloaded from the bilm-tf repo.

The process has been running for 24 hours on a single Pascal P6000 GPU. What is the expected runtime for such a training job? I also see the message "Training for 5 epochs and 1501265 batches". The last batch processed was numbered 83300, which is just past the half-way mark. I can't tell how many epochs have been processed.

Any ideas on how I could speed up the training?

mchari commented 4 years ago

Why is the number of batches so large for just 1200 sentences?
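(Editor's note: the reported figure is consistent with bilm/training.py deriving the batch schedule from the n_train_tokens option rather than from the size of the input file. A minimal back-of-envelope check, assuming the defaults from bin/train_elmo.py, i.e. batch_size = 128, unroll_steps = 20, a single GPU, and the hard-coded n_train_tokens = 768648884:)

```python
# Back-of-envelope check of the "1501265 batches" message.
# Assumptions (not stated in the thread): defaults from bin/train_elmo.py,
# i.e. batch_size = 128, unroll_steps = 20, one GPU, and the hard-coded
# n_train_tokens = 768648884 (the 1B Word Benchmark token count).
n_train_tokens = 768648884
batch_size, unroll_steps, n_gpus, n_epochs = 128, 20, 1, 5

tokens_per_batch = batch_size * unroll_steps * n_gpus      # 2560
batches_per_epoch = n_train_tokens // tokens_per_batch     # 300253
print(n_epochs * batches_per_epoch)                        # 1501265, matching the log
```

Under those assumptions the schedule is driven entirely by n_train_tokens, not by how many sentences are actually in the file.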

mchari commented 4 years ago

A look at bilm/training.py answered my question. I had not set n_train_tokens, and the default value was way too high (likely the number of tokens in the original training data).
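(Editor's note: a minimal sketch of the fix, using the bilm-tf entry points that bin/train_elmo.py uses (load_vocab, BidirectionalLMDataset, train) and hypothetical file paths. It counts the tokens in the small training file and overrides n_train_tokens and n_epochs before training, assuming the local options.json carries the full set of training options that train() expects, such as batch_size and unroll_steps.)

```python
import json

from bilm.data import BidirectionalLMDataset
from bilm.training import load_vocab, train

train_prefix = 'incremental/train.txt'   # hypothetical path: the ~1200-sentence file
vocab_file = 'incremental/vocab.txt'     # hypothetical vocabulary file

# Count whitespace-separated tokens in the actual training data.
with open(train_prefix) as f:
    n_train_tokens = sum(len(line.split()) for line in f)

# Start from the local options.json (assumed to hold the full training options,
# e.g. batch_size and unroll_steps) and override the schedule-related fields.
with open('options.json') as f:
    options = json.load(f)
options['n_epochs'] = 5
options['n_train_tokens'] = n_train_tokens   # the crucial override for a small corpus

vocab = load_vocab(vocab_file, 50)            # 50 = max characters per token
data = BidirectionalLMDataset(train_prefix, vocab, test=False, shuffle_on_load=True)
train(options, data, 1, 'checkpoints', 'checkpoints')   # 1 GPU, save dir, log dir
```

With n_train_tokens set to the real count, 5 epochs over ~1200 sentences should amount to a small number of batches rather than 1.5 million, which should also resolve the runtime concern above.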