allenai / SciREX

Data/Code Repository for https://api.semanticscholar.org/CorpusID:218470122
Apache License 2.0

Question: How can I resume training for a partially trained model? #15

Open viswavi opened 4 years ago

viswavi commented 4 years ago

I tried training the coreference model (using the instructions in the README), but the training job was killed due to infra issues after training for 17 full epochs (out of 19 requested epochs).

I'd like to train this model to completion to reproduce the results in the paper, but I would like to avoid spending GPU resources re-training the first 17 epochs. Is there a way to do this with the allennlp training scheme?

successar commented 4 years ago

I am not sure, but I believe there is a --recover option you can pass to allennlp.
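As a rough sketch of what that suggestion could look like in practice, the helper below builds an `allennlp train` command with `--recover` pointed at the existing serialization directory, and checks which epoch was last checkpointed. The function names and paths are hypothetical. The checkpoint file pattern `training_state_epoch_<N>.th` is an assumption based on older AllenNLP releases and may differ in the version this repository pins. Any extra flags used in the original training command (for example package includes) would presumably need to be carried over as well.

```python
import re
from pathlib import Path
from typing import List, Optional

# Assumption: older AllenNLP releases save per-epoch training state as
# training_state_epoch_<N>.th inside the serialization directory.
_CHECKPOINT_PATTERN = re.compile(r"^training_state_epoch_(\d+)\.th$")


def last_checkpointed_epoch(serialization_dir: str) -> Optional[int]:
    """Return the highest epoch index with a saved training state, or None."""
    epochs = []
    for path in Path(serialization_dir).iterdir():
        match = _CHECKPOINT_PATTERN.match(path.name)
        if match:
            epochs.append(int(match.group(1)))
    return max(epochs) if epochs else None


def build_resume_command(config_path: str, serialization_dir: str) -> List[str]:
    """Build an `allennlp train` invocation that resumes from serialization_dir.

    Both arguments are placeholders for whatever the original run used;
    --recover asks AllenNLP to continue from the state saved in that directory.
    """
    return ["allennlp", "train", config_path, "-s", serialization_dir, "--recover"]
```

The returned list can be passed to `subprocess.run(...)` or joined into a shell command. Calling `last_checkpointed_epoch` first is a cheap way to confirm that the killed job actually left a checkpoint to resume from.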