Noble-Lab / casanovo

De Novo Mass Spectrometry Peptide Sequencing with a Transformer Model
https://casanovo.readthedocs.io
Apache License 2.0

Eliminate the eval command #355

Closed by wsnoble 1 week ago

wsnoble commented 1 month ago

The eval command is just like the sequence command except that it requires annotated spectra. We should get rid of it, and then have the program automatically do evaluations if the user provides annotated spectra. If no validation set is provided, the evaluation should be done on the training set. Results can be printed to the log file, as is done currently.

bittremieux commented 1 month ago

> have the program automatically do evaluations if the user provides annotated spectra

This is a bit tricky, because you'd need to inspect the file first to check whether there are peptide sequences included in order to know how to instantiate the data loader (annotated MGF or standard MGF).
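The inspection step described above could look roughly like the following. This is only a minimal sketch, not Casanovo's actual code: it assumes that an annotated MGF file carries the peptide in a `SEQ=` line inside each `BEGIN IONS` / `END IONS` block, and the function name `mgf_has_annotations` and its `max_spectra` parameter are hypothetical.

```python
import io
from typing import TextIO


def mgf_has_annotations(handle: TextIO, max_spectra: int = 1) -> bool:
    """Return True if one of the first `max_spectra` spectra has a non-empty SEQ= line.

    Hypothetical helper: assumes peptide annotations are stored as `SEQ=...`
    header lines inside BEGIN IONS / END IONS blocks.
    """
    spectra_seen = 0
    for raw_line in handle:
        line = raw_line.strip()
        upper = line.upper()
        if upper.startswith("SEQ="):
            # Only count the spectrum as annotated if a sequence is present.
            if line[len("SEQ="):].strip():
                return True
        elif upper == "END IONS":
            spectra_seen += 1
            if spectra_seen >= max_spectra:
                # Stop early so large files are not read in full.
                break
    return False
```

Peeking at only the first spectrum keeps the check cheap, but it would misclassify a file whose first spectrum happens to lack an annotation, which is part of why automatic detection is tricky.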

An easier alternative would be to add an --evaluate flag.
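As a rough illustration of the flag-based option, the sketch below uses the standard-library `argparse` module as a stand-in; it is not Casanovo's real command-line interface. The program name `sequence-sketch`, the `peak_path` argument, and the `loader_kind` helper are all hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a toy parser with a boolean --evaluate switch (hypothetical CLI)."""
    parser = argparse.ArgumentParser(prog="sequence-sketch")
    parser.add_argument("peak_path", help="Path to the input MGF file")
    parser.add_argument(
        "--evaluate",
        action="store_true",
        help="Treat the input as annotated spectra and report evaluation metrics",
    )
    return parser


def loader_kind(evaluate: bool) -> str:
    """Pick which data loader to instantiate, based only on the flag."""
    return "annotated" if evaluate else "unannotated"
```

With the flag, the choice of data loader is made from the command line alone, so the input file does not have to be inspected beforehand.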

wsnoble commented 1 month ago

Yes, --evaluate is fine with me.

Lilferrit commented 1 month ago

Do we still want to output the sequence results file if the evaluate flag is set? If so, I think a good approach might be to add a boolean argument to _predict_impl (and predict) that runs validation if set to true, and then we can get rid of the _validate_impl and validate functions from the trainer altogether. This boolean flag would then just be passed from the command line down into Trainer._predict_impl.

Edit: Never mind, I followed the function calls all the way into the PyTorch Lightning library without realizing it. I'll think of another approach.

wsnoble commented 1 month ago

Yes, we should output sequencing results if the flag is set.
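The agreed behavior (always produce sequencing results, and additionally evaluate when the flag is set) could be sketched as below. This is an assumption-laden toy, not the trainer's actual code path: the `sequence` function, the dictionary-based spectrum representation with a `seq` key, and the exact-match peptide accuracy used as the metric are all hypothetical stand-ins.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical spectrum representation: a dict that holds the ground-truth
# peptide under the "seq" key when the input file is annotated.
Spectrum = Dict[str, str]


def sequence(
    spectra: List[Spectrum],
    predict: Callable[[Spectrum], str],
    evaluate: bool = False,
) -> Tuple[List[str], Optional[float]]:
    """Always return predictions; also return a metric when `evaluate` is set."""
    predictions = [predict(spectrum) for spectrum in spectra]
    if not evaluate or not spectra:
        # No evaluation requested (or nothing to evaluate): predictions only.
        return predictions, None
    # Stand-in metric: fraction of spectra whose prediction matches exactly.
    matches = sum(
        prediction == spectrum["seq"]
        for prediction, spectrum in zip(predictions, spectra)
    )
    return predictions, matches / len(spectra)
```

In this sketch the predictions are returned in both modes, so the results file can be written regardless of the flag, and the metric is an optional extra that could be printed to the log.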