Open msperber opened 6 years ago
I agree that this would be nice. And I actually don't understand why we couldn't have experiment names? I think we could have two options for syntax:
!SimpleExperiment
name: my_name
...
or
my_name: !SimpleExperiment
...
If we choose the latter, the serializer could check that the top level in the dictionary only has one element and that it is of type experiment.
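For the second syntax, the check described above could look roughly like the sketch below. It assumes the config has already been parsed into a plain dictionary; `SimpleExperiment` here is a hypothetical stand-in class and `extract_named_experiment` is an illustrative helper name, neither is XNMT's actual serializer code.

```python
class SimpleExperiment:
    """Hypothetical stand-in for the experiment class (not XNMT's real class)."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs


def extract_named_experiment(top_level):
    """Return (name, experiment) from a parsed `my_name: !SimpleExperiment` config.

    Raises ValueError unless the top level is a dictionary with exactly one
    entry whose value is an experiment object.
    """
    if not isinstance(top_level, dict):
        raise ValueError("top level of the config must be a dictionary")
    if len(top_level) != 1:
        raise ValueError(
            f"expected exactly one top-level entry, found {len(top_level)}")
    # Unpack the single (key, value) pair.
    (name, experiment), = top_level.items()
    if not isinstance(experiment, SimpleExperiment):
        raise ValueError(
            f"top-level entry '{name}' is not an experiment "
            f"(got {type(experiment).__name__})")
    return name, experiment
```

The experiment name then comes for free as the single top-level key, so no separate `name:` field is needed.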
Is this fixed now? I'm not sure...
No, I think nothing has been done along these lines yet.
Making config files and saved experiments compatible has been implemented by #491.
Some thoughts on what would need to be done to support resuming crashed experiments:
- `best` and `last` versions of the experiment: when resuming, we load `last`; when loading using `!LoadSerialized`, we load `best`. When the experiment has finished, `last` could be deleted.
- ... `__init__`. For most fields this would be easy, but starting out from the correct sentence and in the correct order requires some thought, because checkpoints don't need to correspond to epochs, etc.
I think having one model per checkpoint is very reasonable. For example, TensorFlow does the same thing. Or, if our concern is disk space, maybe we can add a flag to turn this setting off, with the consequence that we can't resume training.
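As a rough illustration of the `best`/`last` idea together with the disk-space flag, here is a minimal sketch. Everything in it is an assumption made for illustration: the `CheckpointWriter` class, the `keep_last` flag, the `model.best`/`model.last` file names, the use of `pickle`, and the convention that a higher score is better. It is not how XNMT stores models.

```python
import os
import pickle
import tempfile


class CheckpointWriter:
    """Hypothetical helper that keeps a `best` and, optionally, a `last` checkpoint."""

    def __init__(self, directory, keep_last=True):
        self.directory = directory
        # keep_last=False saves disk space, but then training cannot be resumed.
        self.keep_last = keep_last
        self.best_score = None

    def _path(self, kind):
        return os.path.join(self.directory, f"model.{kind}")

    def _write(self, kind, state):
        with open(self._path(kind), "wb") as f:
            pickle.dump(state, f)

    def save(self, state, score):
        """Call at every checkpoint; a higher score is assumed to be better."""
        if self.keep_last:
            self._write("last", state)
        if self.best_score is None or score > self.best_score:
            self.best_score = score
            self._write("best", state)

    def load(self, kind):
        """Load either the "best" or the "last" checkpoint."""
        with open(self._path(kind), "rb") as f:
            return pickle.load(f)

    def finish(self):
        """Training completed, so `last` is no longer needed."""
        if os.path.exists(self._path("last")):
            os.remove(self._path("last"))


# Usage: two checkpoints, the second one scoring worse than the first.
workdir = tempfile.mkdtemp()
writer = CheckpointWriter(workdir)
writer.save({"step": 1}, score=0.5)
writer.save({"step": 2}, score=0.4)
resumed = writer.load("last")  # what a resumed run would start from
best = writer.load("best")     # what loading the finished model would return
```

With `keep_last=False` only `model.best` is ever written, which is the trade-off mentioned above: less disk usage, no resuming.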
It would be nice to do the following:
- ... `!Experiment`, whereas saved models are a single `!Experiment`.
- ... `TrainingRegimen`, but possibly also preprocessing and evaluation) that is stored as part of the model whenever the model is saved. If at initialization time the state is not zero, we fast-forward to the specified state.
This would allow the following:
- `xnmt /location/to/saved_model.mod` (if the saved experiment had already completed, this would result in XNMT exiting without doing anything).
A config file with a single experiment would probably look like this:
Or for a series of experiments (which is more similar to the current config files which always contain a series of experiments):
I believe this would be relatively easy to do. The main thing I'm not sure how best to handle is that we would no longer have experiment names, so `{EXP}` may no longer work.
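Coming back to the fast-forward idea from earlier in the thread (resuming at the correct sentence and in the correct order even when checkpoints fall in the middle of an epoch), one possible approach is sketched below. It assumes the saved state is a single global step counter and that each epoch's sentence order is derived deterministically from a seed and the epoch number. The names `epoch_order` and `training_stream` are hypothetical, and this is not XNMT's `TrainingRegimen`.

```python
import random


def epoch_order(num_sentences, epoch, seed=0):
    """Deterministic shuffle of sentence indices for the given epoch."""
    order = list(range(num_sentences))
    # Seeding from (seed, epoch) makes the order reproducible after a restart.
    random.Random(seed * 1_000_003 + epoch).shuffle(order)
    return order


def training_stream(num_sentences, total_steps, start_step=0, seed=0):
    """Yield (step, sentence_index) pairs from `start_step` up to `total_steps`.

    `start_step` plays the role of the saved state: zero for a fresh run,
    non-zero when resuming from a checkpoint.
    """
    step = start_step
    while step < total_steps:
        epoch, position = divmod(step, num_sentences)
        order = epoch_order(num_sentences, epoch, seed)
        # Fast-forward: skip the sentences already consumed in this epoch.
        for idx in order[position:]:
            if step >= total_steps:
                return
            yield step, idx
            step += 1


# A full run over 12 steps with 5 sentences per epoch, and a run resumed
# from a checkpoint taken at step 7 (in the middle of the second epoch).
full_run = list(training_stream(5, 12))
resumed_run = list(training_stream(5, 12, start_step=7))
```

In this sketch the resumed run visits the same sentences in the same order as the tail of the uninterrupted run, and a state that is already at `total_steps` yields nothing, which matches the "exit without doing anything" behavior for a completed experiment.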