compomics / ms2pip

MS²PIP: Fast and accurate peptide spectrum prediction for multiple fragmentation methods, instruments, and labeling techniques.
https://ms2pip.readthedocs.io
Apache License 2.0

Command-line argument for listing available models #200

Closed: vrkosk closed this issue 10 months ago

vrkosk commented 1 year ago

The known/available models are hardcoded in the MODELS variable in ms2pipC.py. It would be very useful to have a command-line argument like:

ms2pip --list-available-models

This would simply list the model names from MODELS. Maybe it could also list the URL or filename that each name maps to, so that the relevant files can be downloaded in advance.

(Similar to but not the same as issue #163.)
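To make the request concrete, here is a minimal sketch of what such a flag could look like using argparse. It is not MS²PIP code: the `MODELS` dictionary below is a hypothetical stand-in containing only the two name/filename pairs mentioned in this thread, and `format_models`, `main`, and the program name `ms2pip-sketch` are illustrative names.

```python
import argparse

# Hypothetical stand-in for the real MODELS variable (an assumption, not the
# actual MS2PIP data structure). Only pairs named in this thread are listed.
MODELS = {
    "CIDch2": "model_20190107_CID_train_B.xgboost",
    "Immuno-HCD": "model_20210316_Immuno_HCD_B.xgboost",
}


def format_models(models):
    """Return one 'name<TAB>filename' line per model, sorted by model name."""
    lines = [f"{name}\t{filename}" for name, filename in sorted(models.items())]
    return "\n".join(lines)


def main(argv=None):
    """Parse arguments; print the model list if the flag is given."""
    parser = argparse.ArgumentParser(prog="ms2pip-sketch")
    parser.add_argument(
        "--list-available-models",
        action="store_true",
        help="print known model names and their filenames, then exit",
    )
    args = parser.parse_args(argv)
    if args.list_available_models:
        print(format_models(MODELS))
        return 0
    parser.print_usage()
    return 1
```

Calling `main(["--list-available-models"])` prints one tab-separated line per model, which is easy to pipe into a download script.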

RalfG commented 1 year ago

Hi!

We are currently working on a drastically refactored version of the MS²PIP Python package (see branch v4.0.0). In this new version, I just added a function, download_models, that lets MS²PIP automatically download all models beforehand. If you do want to download the model files fully manually, the full list can be found at https://genesis.ugent.be/uvpublicdata/ms2pip/.

I hope this fits your use case. If not, let me know if you would like additional features.

Best, Ralf

vrkosk commented 1 year ago

Sounds useful! Once some models have been downloaded to a directory, is there a reverse mapping I can use? For example, if a directory contains model_20190107_TTOF5600_train_B.xgboost, should I somehow parse the model name from the filename? If so, how? The naming convention is fairly consistent, but there are exceptions: model CIDch2 has the filename model_20190107_CID_train_B.xgboost, and Immuno-HCD has the filename model_20210316_Immuno_HCD_B.xgboost.

RalfG commented 10 months ago

Hi @vrkosk,

On the v4.0.0 branch, all model information is now available in a constant: ms2pip.constants.MODELS.
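With such a constant, the reverse mapping asked about above could be built by inverting it rather than by parsing filenames. The sketch below assumes each model entry is a dictionary with an `xgboost_model_files` key that maps ion types to filenames; that layout is an assumption and should be checked against the actual constant. `MODELS_SKETCH`, `build_reverse_mapping`, and `models_in_directory` are hypothetical names, and the stand-in data only contains filenames quoted in this thread.

```python
from pathlib import Path

# Hypothetical stand-in for ms2pip.constants.MODELS. The nested layout
# (an "xgboost_model_files" dict per model) is an assumption.
MODELS_SKETCH = {
    "CIDch2": {
        "xgboost_model_files": {"b": "model_20190107_CID_train_B.xgboost"},
    },
    "Immuno-HCD": {
        "xgboost_model_files": {"b": "model_20210316_Immuno_HCD_B.xgboost"},
    },
}


def build_reverse_mapping(models):
    """Map each model filename to the list of model names that use it."""
    reverse = {}
    for name, info in models.items():
        for filename in info["xgboost_model_files"].values():
            reverse.setdefault(filename, []).append(name)
    return reverse


def models_in_directory(directory, models):
    """Return the sorted model names that have a known file in directory."""
    reverse = build_reverse_mapping(models)
    found = set()
    for path in Path(directory).iterdir():
        # Unknown files are ignored; known files contribute their model names.
        found.update(reverse.get(path.name, []))
    return sorted(found)
```

Each filename maps to a list of names so that the lookup still works if one file turns out to be shared by more than one model. Passing the real constant instead of `MODELS_SKETCH` should work unchanged if its layout matches the assumption.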

v4.0.0 should be reasonably stable. It is currently still missing support for spectrum output files other than CSV, and I'm waiting for that to be added before making the general release.

Best, Ralf