Closed johntiger1 closed 4 years ago
What you linked is internal documentation of a private method (it begins with an underscore), and it is not complete. It only points to what you would have to do to replace that particular method. See the code example here for what goes into saving and loading a model: https://allennlp-course.apps.allenai.org/building-your-model#3.
The section on what to do if you're not using config files isn't written yet, but the gist is that you have to create the model using the same constructor arguments as when the model was saved, then call `model.load_state_dict` as seen above, in addition to restoring the vocabulary as shown in what I linked.
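Not part of the thread, but a minimal sketch of that recipe in plain PyTorch may help. The model class, constructor argument, and file name here are invented for illustration; in AllenNLP you would additionally restore the vocabulary (e.g. with `Vocabulary.from_files`) before constructing the model:

```python
import torch
import torch.nn as nn

class TinyModel(nn.Module):
    """Stand-in for your real model class."""
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.linear = nn.Linear(hidden_dim, 1)

    def forward(self, x):
        return self.linear(x)

# Training side: save only the state dict, not the whole pickled module.
model = TinyModel(hidden_dim=4)
torch.save(model.state_dict(), "weights.th")

# Inference side: rebuild the model with the SAME constructor
# arguments as at training time, then load the saved weights.
restored = TinyModel(hidden_dim=4)
restored.load_state_dict(torch.load("weights.th"))
restored.eval()
```

The point of the "same constructor arguments" requirement is that `load_state_dict` only fills in tensors; the module shapes must already match.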
Thanks @matt-gardner. Will take a look. In the meantime, I think I've found a way that simply pickles the dataset reader + vocab, but this is completely ad hoc and likely brittle.
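For what it's worth, a bare-bones version of that ad-hoc approach looks like this. The bundle contents are dummies standing in for the real `DatasetReader` and `Vocabulary` objects, and the approach only holds up while every object in the bundle stays pickle-able and the class definitions don't change between save and load:

```python
import pickle

# Bundle whatever inference needs alongside the weights.
# Plain dicts here stand in for the real reader and vocabulary.
bundle = {
    "reader_config": {"max_tokens": 128},
    "vocab": {"the": 0, "cat": 1, "@@UNKNOWN@@": 2},
}

# Round-trip through pickle; in practice you'd write blob to disk
# next to the model weights and read it back at inference time.
blob = pickle.dumps(bundle)
loaded = pickle.loads(blob)
```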
Yes, that's fine too, as long as all of your objects are pickle-able. Most ways of saving and loading are somewhat brittle, unless you have a standard format that includes configuration; hence our config file approach to things.
Thanks @matt-gardner, that makes sense. I have my own opinions on the config-based approach (for instance, memory blow-up using `allennlp train` vs. precise memory and garbage-collection control via a code-first approach) but won't bore you with the details. Would love to give back and contribute something (a guide, opinion piece) after EMNLP though!
The only memory difference between the two approaches should be which `Instance`s are in memory when, and you should be able to control that with `lazy`. If you're seeing drastic memory differences between the two, we'd like to know about them.
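For context, the knob being referred to is the `lazy` flag on the dataset reader, which makes instances get generated on the fly instead of being read into memory up front. In a config file it looks roughly like this (the reader type and data path are placeholders):

```json
{
  "dataset_reader": {
    "type": "my_reader",
    "lazy": true
  },
  "train_data_path": "train.jsonl"
}
```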
In this line, it says that we can use our model for inference by simply doing `torch.load(...)`: https://github.com/allenai/allennlp/blob/e52fea2801fefc07808ee2039a086a9abbf21a1e/allennlp/training/trainer.py#L910
However, don't we need to ensure certain things are consistent? For instance, the mapping from tokens to indices, the instances yielded by the `dataset_reader`, etc. I looked into it here, and it seems like we need some archiving: https://github.com/allenai/allennlp/blob/master/allennlp/models/archival.py. But then why does the default training code not seem to use archives? How is it able to restore training without the dataset reader?
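Not from the thread, but to make the archive question concrete: what `archival.py` produces is essentially a `model.tar.gz` bundling the config, the vocabulary, and the weights, which is why loading an archive can reconstruct the model (and dataset reader) from the config, while a mid-training checkpoint is just a weights file that assumes the surrounding objects already exist. A rough stdlib sketch of the bundling idea, with dummy contents (the file names mimic AllenNLP's layout but nothing here is its real implementation):

```python
import json
import os
import tarfile
import tempfile

workdir = tempfile.mkdtemp()

# Dummy stand-ins for the real artifacts.
config = {"model": {"type": "my_model"}, "dataset_reader": {"type": "my_reader"}}
with open(os.path.join(workdir, "config.json"), "w") as f:
    json.dump(config, f)
with open(os.path.join(workdir, "weights.th"), "wb") as f:
    f.write(b"fake-weights")

# Bundle everything into one archive, so a single file carries
# all the information needed to rebuild the model at load time.
archive_path = os.path.join(workdir, "model.tar.gz")
with tarfile.open(archive_path, "w:gz") as tar:
    tar.add(os.path.join(workdir, "config.json"), arcname="config.json")
    tar.add(os.path.join(workdir, "weights.th"), arcname="weights.th")

# Loading side: read the config back out of the archive.
with tarfile.open(archive_path, "r:gz") as tar:
    names = tar.getnames()
    loaded_config = json.load(tar.extractfile("config.json"))
```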