Closed wsnoble closed 2 months ago
Actually, this documentation is also inaccurate. If you use -o foo, then the file foo.mztab gets produced. So it's really asking for the root of the output filename.
I think that when we output checkpoint files, they should also include this root (if it's provided) as a prefix.
I'm not sure about this. There's a separate argument to specify the checkpoint file name during training, namely --model (as well as to load an existing model file). How would you resolve these two values then?
Are you sure that this is the behavior? I think I usually just see output ckpt files with names like epoch=2-step=150000.ckpt.
In general, it seems like a bad idea to use the same option to specify both the input and the output filename.
I suggest that if the user provides --root foo then the above would be foo.epoch=2-step=150000.ckpt. Currently, if the user specifies foo.ckpt, how do the different output files get named? It seems like you'd have to have some logic to look for and strip off the ".ckpt" and then add in the epoch and step number to the name. Plus error handling if the user doesn't provide ".ckpt" at the end of the model name.
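To make the concern concrete, here is a minimal sketch of the logic that reusing the model name for output would require. The helper name checkpoint_name_from_model is hypothetical and is not part of Casanovo; it only illustrates the strip-and-append step and the error handling described above.

```python
def checkpoint_name_from_model(model_name: str, epoch: int, step: int) -> str:
    """Hypothetical helper: derive a checkpoint name from a model file name."""
    suffix = ".ckpt"
    # Error handling for a model name that lacks the expected extension.
    if not model_name.endswith(suffix):
        raise ValueError(
            f"Expected a model name ending in {suffix!r}, got {model_name!r}"
        )
    # Strip the extension, then append the epoch and step counters.
    stem = model_name[: -len(suffix)]
    return f"{stem}.epoch={epoch}-step={step}{suffix}"
```

A separate root option avoids this parsing entirely, since the root never carries an extension.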
Good catch. The checkpoint files are actually yet another factor. They're saved to the directory model_save_folder_path in the config, but their file name can't currently be specified; it's just the standard epoch and step counters.
So:
- output specifies the root file name for the mzTab (depending on mode) and log (always) files.
- model specifies the model used for inference, or, when training, the state from which to continue training.

So what do we want to happen?
- output should clarify that this is the root file name for the mzTab and log files. Done in #276.
- model behavior seems to be correct for inference.
- For training, what do we want model to signify? The output name of the newly obtained model weights? What if training resumes? How do we differentiate the initial model from the newly trained model?

Did I get this right?
I would say that root should specify the root of the file name for all output files, including checkpoint files.
Good point about model being overloaded for input and output models. If I were doing this from scratch, I would probably use two distinct options, like model-in and model-out. But for backward compatibility maybe we just use model for input and use the root option to specify the output model name.
I added a --root_ckpt_name option to the train sub-command that will add a root name to the checkpoint files, e.g. if the option is set as --root_ckpt_name foobar then the checkpoint filenames will have the format foobar.epoch=2-step=150000.ckpt. Otherwise the checkpoint files will have the original default format. The checkpoint files will still be saved to the model_save_folder_path set in the config file.
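For illustration, the naming rule described above could be sketched as follows. This is not the code on the branch; the function checkpoint_filename and its parameters are hypothetical and only reproduce the two formats mentioned in this comment.

```python
from typing import Optional


def checkpoint_filename(epoch: int, step: int, root: Optional[str] = None) -> str:
    """Hypothetical sketch of the checkpoint naming rule."""
    # Default format: just the epoch and step counters.
    default = f"epoch={epoch}-step={step}.ckpt"
    # With a root name, prefix it and separate it with a dot.
    return f"{root}.{default}" if root else default


print(checkpoint_filename(2, 150000, root="foobar"))
print(checkpoint_filename(2, 150000))
```

The first call corresponds to passing --root_ckpt_name foobar, the second to omitting the option.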
I also just added an --overwrite_output boolean flag that is false by default. If this flag isn't set, the Casanovo CLI will raise an error if one of the output files already exists; otherwise it will overwrite it. All of these changes are on the model-out branch.
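The intended behavior of the flag can be sketched roughly like this. The function check_output_files is a hypothetical stand-in, not the actual implementation, and the choice of FileExistsError as the error type is an assumption.

```python
from pathlib import Path
from typing import Iterable, Union


def check_output_files(
    paths: Iterable[Union[str, Path]], overwrite_output: bool = False
) -> None:
    """Hypothetical sketch: refuse to clobber existing outputs unless allowed."""
    # Collect every requested output file that is already on disk.
    existing = [str(p) for p in map(Path, paths) if p.exists()]
    if existing and not overwrite_output:
        raise FileExistsError(
            "Output file(s) already exist: "
            + ", ".join(existing)
            + ". Pass --overwrite_output to replace them."
        )
```

With an output root of foo, the check would be called on the derived names, for example check_output_files(["foo.mztab", "foo.log"]), where the ".log" extension is assumed here.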
In the command line documentation, the --output option is only for the sequencing phase and doesn't reflect what outputs are produced during training.