compomics / searchgui

Highly adaptable common interface for proteomics search and de novo engines
http://compomics.github.io/projects/searchgui.html

Comet -- Error: No index found for enzyme Trypsin. #366

Open jeffsocal opened 11 months ago

jeffsocal commented 11 months ago

Ubuntu 20 LTS SearchCLI 4.3.1

Leaving the enzyme restriction null in the params file results in an error during Comet execution.

Suggested error location: CometProcessBuilder.java L266
Suggested fault: CometProcessBuilder.java > getEnzymeListing L901

searchcli.par: ... "restrictionBefore": [], "restrictionAfter": [],

hbarsnes commented 11 months ago

There are two trypsin versions included as defaults: "Trypsin" and "Trypsin_(no_P_rule)". It seems like it's the last one that you want? It includes the setting you list above, i.e. no restrictions before and after the cleavage site, and seems to work fine as far as I can tell. In other words, I'm not able to reproduce the no index found error.

Are you perhaps creating your own enzyme? Or editing the SearchGUI parameter file directly?

jeffsocal commented 11 months ago

Thanks for the immediate response. Interesting. Apologies, where can I find that in the documentation?

If I set the values to "Trypsin_(no_P_rule)" such that

    ...
    "digestionParameters": {
      "cleavageParameter": "enzyme",
      "enzymes": [
        {
          "name": "Trypsin_(no_P_rule)",
          "aminoAcidBefore": [ "R", "K" ],
          "aminoAcidAfter": [],
          "restrictionBefore": [],
          "restrictionAfter": [],
          "cvTerm": {
            "ontology": "PSI-MS",
            "accession": "MS:1001251",
            "name": "Trypsin_(no_P_rule)",
            "id": 0
          },
          "id": 0
        }
      ],
      "nMissedCleavages": { "Trypsin_(no_P_rule": 2 },
      "specificity": { "Trypsin_(no_P_rule)": "specific" },
      "id": 0
    },

Then none of the other searches are able to complete. For example:

    ...
    INFO: Beginning tide-search.
    FATAL: '/home/jeff/software/SearchGUI-4.2.17/resources/temp/search_engines/tide/fasta-index' does not exist
    ...

I would prefer to set the enzyme, then also the "restrictionBefore": [], and "restrictionAfter": [], parameters independently. Any and all suggestions are welcome.

jeffsocal commented 11 months ago

Ahh! Typo. I was missing the ")" in the entry below. It seems to work fine now for Tide and the others.

    ...
    "nMissedCleavages": { "Trypsin_(no_P_rule": 2 },

Thanks for the feedback. Still looking for the SearchCLI documentation on enzyme nomenclature.
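
The mismatch described above, where a key in "nMissedCleavages" differs from the declared enzyme name, can be caught before a search is launched. The following is a minimal Python sketch, assuming the "digestionParameters" block has the JSON shape quoted in this thread. The function name `find_enzyme_name_mismatches` is hypothetical and not part of SearchGUI.

```python
def find_enzyme_name_mismatches(digestion):
    """Report enzyme-name inconsistencies in a digestionParameters dict.

    `digestion` is assumed to be shaped like the "digestionParameters"
    block quoted in this thread; other file layouts are not handled.
    """
    declared = {enzyme["name"] for enzyme in digestion.get("enzymes", [])}
    problems = []
    for field in ("nMissedCleavages", "specificity"):
        keys = set(digestion.get(field, {}))
        # Keys that do not correspond to any declared enzyme (e.g. a typo).
        for key in sorted(keys - declared):
            problems.append((field, key, "not a declared enzyme name"))
        # Declared enzymes that have no entry in this field.
        for name in sorted(declared - keys):
            problems.append((field, name, "declared enzyme has no entry"))
    return problems


# Reduced example reproducing the typo from this thread.
example = {
    "enzymes": [{"name": "Trypsin_(no_P_rule)"}],
    "nMissedCleavages": {"Trypsin_(no_P_rule": 2},  # missing ")"
    "specificity": {"Trypsin_(no_P_rule)": "specific"},
}

for field, name, reason in find_enzyme_name_mismatches(example):
    print(f"{field}: {name!r} {reason}")
```

An empty result only means the three places agree with each other; it does not confirm that a given search engine recognizes the name.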

hbarsnes commented 11 months ago

> Still looking for the SearchCLI documentation on enzyme nomenclature.

I'm not sure if we ever documented this properly. I had a look at https://github.com/compomics/compomics-utilities/wiki/IdentificationParametersCLI, but the closest I could find was the listing of available enzymes via the -enzymes option (https://github.com/compomics/compomics-utilities/wiki/IdentificationParametersCLI#general-parameters).

However, we did also implement command line support for creating new enzymes: https://github.com/compomics/compomics-utilities/tree/master/src/main/java/com/compomics/cli/enzymes. It seems this never made it into the wiki pages, though. There also currently seems to be a bug resulting in a null pointer exception when trying to run java -cp SearchGUI-4.3.1.jar com.compomics.cli.enzymes.EnzymesCLI (to see the list of parameters). I will see if I can manage to fix this (and update the wiki pages), but in the meantime you can get around the bug by adding -use_log_folder 0 at the end of the EnzymesCLI command line.

jeffsocal commented 11 months ago

Interesting. Most of the other algorithms accept Trypsin_(no_P_rule); Comet does not, and I noticed that Tide also gives a warning.

java -cp SearchGUI-4.3.1.jar eu.isas.searchgui.cmd.SearchCLI -id_params trypNoP3ptm_0.1Da.par .. -comet 1 -tide 1

Thu Oct 19 15:59:50 UTC 2023 Indexing ...fasta for Tide.
...
WARNING: 'custom-enzyme' was set: setting 'enzyme' to 'custom-enzyme'

Thu Oct 19 16:00:11 UTC 2023 Importing spectrum files.
...
Thu Oct 19 16:00:11 UTC 2023 Error: No index found for enzyme Trypsin_(no_P_rule).
Thu Oct 19 16:00:11 UTC 2023 An error occurred while running SearchGUI. Please contact the developers.
Thu Oct 19 16:00:11 UTC 2023 The search or processing did not finish properly!
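
When SearchCLI runs are scripted, the enzyme name in this error can be pulled out of the captured output. The following is a rough Python sketch whose pattern is copied from the log line quoted above. Other SearchGUI versions may word the message differently, and the helper name `enzymes_without_index` is hypothetical.

```python
import re

# Pattern based on the error line quoted in this thread (an assumption:
# the wording may differ in other SearchGUI versions). The enzyme name is
# taken to end at a period followed by whitespace or end of text.
NO_INDEX_PATTERN = re.compile(r"Error: No index found for enzyme (.+?)\.(?:\s|$)")


def enzymes_without_index(log_text):
    """Return the enzyme names reported in 'No index found' errors, in order."""
    return NO_INDEX_PATTERN.findall(log_text)


log = (
    "Thu Oct 19 16:00:11 UTC 2023 Importing spectrum files.\n"
    "Thu Oct 19 16:00:11 UTC 2023 Error: No index found for enzyme Trypsin_(no_P_rule).\n"
    "Thu Oct 19 16:00:11 UTC 2023 An error occurred while running SearchGUI.\n"
)

for name in enzymes_without_index(log):
    print("Comet could not map enzyme:", name)
```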

jeffsocal commented 11 months ago

The trick is to use no underscores in the name: "Trypsin (no P rule)". That seems to work with SearchCLI now for all algorithms.
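
For anyone editing the parameter file by hand, the rename has to be applied in every place the enzyme name appears, otherwise the mismatch discussed earlier in this thread comes back. The following is a minimal Python sketch, assuming the "digestionParameters" layout quoted above. The helper `rename_enzyme` is hypothetical, and it deliberately leaves the "cvTerm" block alone, since the thread does not say whether that name must change as well.

```python
import copy


def rename_enzyme(digestion, old_name, new_name):
    """Return a copy of `digestion` with the enzyme renamed consistently.

    Updates enzymes[].name and the keys of nMissedCleavages and specificity.
    The cvTerm block is not modified (whether it needs to be is unknown).
    """
    updated = copy.deepcopy(digestion)
    for enzyme in updated.get("enzymes", []):
        if enzyme.get("name") == old_name:
            enzyme["name"] = new_name
    for field in ("nMissedCleavages", "specificity"):
        mapping = updated.get(field, {})
        if old_name in mapping:
            # Move the value to the new key, dropping the old one.
            mapping[new_name] = mapping.pop(old_name)
    return updated


# Reduced example based on the parameters quoted in this thread.
digestion = {
    "cleavageParameter": "enzyme",
    "enzymes": [{"name": "Trypsin_(no_P_rule)", "aminoAcidBefore": ["R", "K"]}],
    "nMissedCleavages": {"Trypsin_(no_P_rule)": 2},
    "specificity": {"Trypsin_(no_P_rule)": "specific"},
}

renamed = rename_enzyme(digestion, "Trypsin_(no_P_rule)", "Trypsin (no P rule)")
print(renamed["enzymes"][0]["name"])
print(renamed["nMissedCleavages"])
print(renamed["specificity"])
```

Working on a deep copy keeps the original dict intact, so the two versions can be compared before anything is written back to disk.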