matchms / ms2deepscore

Deep learning similarity measure for comparing MS/MS spectra with respect to their chemical similarity
Apache License 2.0

Change design to avoid mistakes in training and saving of models #138

Open florian-huber opened 1 year ago

florian-huber commented 1 year ago

Currently our MS2DeepScore model contains both a TensorFlow model (.model) and a spectrum binner (.spectrum_binner). This makes sense, but the rest of the code should be adapted to make using the two together more foolproof.

Some of the potential issues right now are:

niekdejonge commented 1 year ago

In branch #129 I have implemented a method for training MS2DeepScore that should automatically store the .spectrum_binner information correctly. The important part is to first save the MS2DeepScore model (including the spectrum binner) and then overwrite its weights with the ones stored by the checkpointer, so that the weights of the best model are saved (instead of those of the last one). However, not all tutorials have been updated to reflect this change.
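The save-then-overwrite order described above can be illustrated with a minimal sketch. This is not the code from branch #129: plain JSON files stand in for the Keras model file and the checkpoint, and `save_model`, `overwrite_with_checkpoint`, and the binner settings are hypothetical names chosen only for illustration.

```python
import json
import tempfile
from pathlib import Path


def save_model(path, weights, binner_settings):
    """Write weights and spectrum binner settings together in one file (hypothetical format)."""
    payload = {"weights": weights, "spectrum_binner": binner_settings}
    Path(path).write_text(json.dumps(payload))


def overwrite_with_checkpoint(model_path, checkpoint_path):
    """Replace only the weights of the saved model; the binner settings stay untouched."""
    saved = json.loads(Path(model_path).read_text())
    saved["weights"] = json.loads(Path(checkpoint_path).read_text())
    Path(model_path).write_text(json.dumps(saved))
    return saved


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        model_path = Path(tmp) / "model_with_binner.json"
        checkpoint_path = Path(tmp) / "best_weights.json"
        # Stand-in for the checkpoint written at the best validation epoch.
        checkpoint_path.write_text(json.dumps([0.1, 0.2]))
        # Stand-in for the weights held in memory after the final epoch.
        last_epoch_weights = [0.9, 0.8]
        binner_settings = {"number_of_bins": 10000}  # hypothetical setting
        # Step 1: save the full model, including the binner.
        save_model(model_path, last_epoch_weights, binner_settings)
        # Step 2: overwrite the weights with the best checkpoint.
        return overwrite_with_checkpoint(model_path, checkpoint_path)
```

Calling `demo()` returns a dictionary that holds the checkpoint weights alongside the original binner settings, which is the property the two-step order is meant to guarantee.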

florian-huber commented 1 year ago

Yes, that's true. But those functions only represent our default workflow, which made a lot of sense for use in MS2Query. Here, however, I was looking for something that is part of the MS2DeepScore model itself and hence still leaves full parameter flexibility (binning choices, reference score bins, network dimensions, etc.).

This could, for instance, be a dedicated fit() method which has predefined options for checkpoints and early stopping.
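A rough sketch of what such a method might look like, written in plain Python so that it runs without TensorFlow. The class `ModelWithBinnerSketch`, its attributes, and the `fit()` signature are all assumptions made for illustration, not the MS2DeepScore API. The idea shown is only that the object owning both weights and binner also owns the checkpointing and early-stopping logic.

```python
import copy


class ModelWithBinnerSketch:
    """Hypothetical stand-in for a model that owns both weights and a spectrum binner."""

    def __init__(self, spectrum_binner, weights):
        self.spectrum_binner = spectrum_binner
        self.weights = weights

    def fit(self, train_one_epoch, validation_loss, epochs=150, patience=10):
        """Train with built-in best-weights checkpointing and early stopping."""
        best_loss = float("inf")
        best_weights = copy.deepcopy(self.weights)
        epochs_without_improvement = 0
        history = []
        for _ in range(epochs):
            self.weights = train_one_epoch(self.weights)
            loss = validation_loss(self.weights)
            history.append(loss)
            if loss < best_loss:
                # In-memory "checkpoint" of the best weights seen so far.
                best_loss = loss
                best_weights = copy.deepcopy(self.weights)
                epochs_without_improvement = 0
            else:
                epochs_without_improvement += 1
                if epochs_without_improvement >= patience:
                    break  # early stopping
        # Always finish with the best weights, not the last ones.
        self.weights = best_weights
        return history


# Toy usage: the validation loss is lowest when the single weight equals 3.
model = ModelWithBinnerSketch(spectrum_binner={"number_of_bins": 10000}, weights=[0.0])
history = model.fit(
    train_one_epoch=lambda w: [w[0] + 1.0],
    validation_loss=lambda w: (w[0] - 3.0) ** 2,
    epochs=20,
    patience=2,
)
```

Because the best weights are restored inside `fit()`, saving the object afterwards cannot accidentally store the last-epoch weights or lose the binner, while binning choices and network dimensions remain free parameters of the constructor.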

niekdejonge commented 1 year ago

Yes, that certainly makes sense. My thinking was that, with a default function, other developers could use it as a starting point before altering the binning choices and network dimensions.