compomics / searchgui

Highly adaptable common interface for proteomics search and de novo engines
http://compomics.github.io/projects/searchgui.html

Search cancelled #339

Closed (mdondrup closed this issue 1 year ago)

mdondrup commented 1 year ago

I am trying to analyze MS data from our proteomics facility with SearchGUI, but the search is interrupted each time.

`Error: Source '2022-11-16_Exploris480_P22-41_2665/Raw/Cement1.comet.pep.xml' does not exist`. I tried to deactivate the search tool, but the search still doesn't proceed. I am running SearchGUI under Linux. The report file is attached: searchgui-report.txt

hbarsnes commented 1 year ago

As far as I can tell from the log file there are two search engines failing: MyriMatch and Comet. For MyriMatch, the problem seems to be the FASTA file and the fact that there are duplicated protein ids:

Process #0 (kjempefuru.cbu.uib.no) had an error: [ProteinList_FASTA::createIndex] duplicate protein id "EMLSAP00000012094"

I recommend having a look at our database help wiki, and especially the section on non-standard headers: https://github.com/compomics/searchgui/wiki/DatabaseHelp#non-standard-fasta.
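Before editing the headers by hand, it can help to list which ids actually collide. The following is a minimal sketch, not part of SearchGUI or MyriMatch: the function name `duplicate_fasta_ids` is hypothetical, and it assumes the protein id is the first whitespace-delimited token after `>`, which may differ from how a given search engine parses the header.

```python
from collections import Counter
from typing import Dict, Iterable


def duplicate_fasta_ids(lines: Iterable[str]) -> Dict[str, int]:
    """Return {protein_id: occurrences} for ids that appear more than once.

    Assumption: the id is the first whitespace-delimited token after '>'.
    """
    counts: Counter = Counter()
    for line in lines:
        if line.startswith(">"):
            header = line[1:].strip()
            if header:  # skip empty headers such as a bare '>'
                counts[header.split()[0]] += 1
    return {protein_id: n for protein_id, n in counts.items() if n > 1}
```

Any iterable of lines works as input, so an open file handle for the FASTA database can be passed directly.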

For the Comet issue, the problem is that the given mzML file is not supported by Comet as it is not indexed. If you have access to the original raw file you can redo the conversion to mzML with indexing turned on and Comet should work as well.
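A quick way to tell whether a converted file is indexed is to look at its first bytes: an indexed mzML file wraps the document in an `indexedmzML` root element, while a plain one starts directly with `mzML`. The sketch below is only a heuristic under that assumption (it does not validate the index itself, and it will not work on compressed files); the function names are hypothetical.

```python
def looks_indexed(head: bytes) -> bool:
    """Heuristic: an indexed mzML document opens with an <indexedmzML> wrapper."""
    return b"<indexedmzML" in head


def is_indexed_mzml(path: str, probe_bytes: int = 4096) -> bool:
    """Read only the start of the file, since the root element comes first."""
    with open(path, "rb") as handle:
        return looks_indexed(handle.read(probe_bytes))
```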

mdondrup commented 1 year ago

Hi Harald,

Thanks for your help so far. I have changed the FASTA headers and have now managed to run the Comet analysis, but it needed a small workaround. To get indexed spectra, I set the output format to .mzML (indexed) in ThermoRawFileParser, which creates a Cement1.mzML file and a Cement1.cms file. However, Comet looks for Cement1.mzml and fails. To check whether this was the cause of the problem, after the failed run I created a symbolic link named Cement1.mzml in the same directory and ran the process again, which then succeeded. The log of the failing run:

Tue Nov 29 11:40:15 CET 2022        Converting raw files.
Tue Nov 29 11:40:15 CET 2022        Processing Cement1.raw with ThermoRawFileParser.

2022-11-29 11:40:15 INFO Started parsing /Home/ii/michaeld/2022-11-16_Exploris480_P22-41_2665/Raw/Cement1.raw
2022-11-29 11:40:28 INFO Processing 59353 MS scans

2022-11-29 11:43:33 INFO Finished parsing /Home/ii/michaeld/2022-11-16_Exploris480_P22-41_2665/Raw/Cement1.raw
2022-11-29 11:43:33 INFO Processing completed 0 errors, 0 warnings

Tue Nov 29 11:43:33 CET 2022        ThermoRawFileParser finished for Cement1.raw (3 minutes 18.0 seconds).

Tue Nov 29 11:43:33 CET 2022        Importing spectrum files.
Tue Nov 29 11:43:33 CET 2022        Importing spectrum file Cement1.mzml
Tue Nov 29 11:43:46 CET 2022        Importing spectrum files completed (13.1 seconds).

Tue Nov 29 11:43:46 CET 2022        Processing Cement1.mzml with Comet.

 Comet version 2022.01 rev. 2 (f447eed) 
 Error - input file "/Home/ii/michaeld/2022-11-16_Exploris480_P22-41_2665/Raw/Cement1.mzml" not found. 

Tue Nov 29 11:43:47 CET 2022        Comet finished for Cement1.mzml (23.0 milliseconds).

Tue Nov 29 11:43:47 CET 2022        Error: Source '/Home/ii/michaeld/2022-11-16_Exploris480_P22-41_2665/Raw/Cement1.comet.pep.xml' does not exist
Tue Nov 29 11:43:47 CET 2022        An error occurred while running SearchGUI. Please contact the developers.
Tue Nov 29 11:43:47 CET 2022        The search or processing did not finish properly!

Tue Nov 29 11:43:47 CET 2022        Searching Canceled!

mdondrup commented 1 year ago

The next tool to fail is MetaMorpheus. There, I just get `Error: null`:

Tue Nov 29 12:55:07 CET 2022        Converting raw files.
Tue Nov 29 12:55:07 CET 2022        Cement1.mzml already exists. Conversion canceled.

Tue Nov 29 12:55:07 CET 2022        Importing spectrum files.
Tue Nov 29 12:55:07 CET 2022        Importing spectrum file Cement1.mzML
Tue Nov 29 12:55:07 CET 2022        Importing spectrum files completed (55.0 milliseconds).

Tue Nov 29 12:55:07 CET 2022        Processing Cement1.mzML with MetaMorpheus.

Tue Nov 29 12:55:07 CET 2022        Error: null
Tue Nov 29 12:55:07 CET 2022        An error occurred while running SearchGUI. Please contact the developers.
Tue Nov 29 12:55:07 CET 2022        The search or processing did not finish properly!

Tue Nov 29 12:55:07 CET 2022        Searching Canceled!

mdondrup commented 1 year ago

Then there is another error from Tide: apparently there are too many CPUs (144 threads) on that machine.

Tue Nov 29 13:14:52 CET 2022        Converting spectrum file Cement1.mzml for Tide.
Tue Nov 29 13:14:59 CET 2022        Processing Cement1.ms2 with Tide.

FATAL: Value of 'num-threads' must be between 0 and 64

USAGE:

  crux tide-search [options] <tide spectra file> <tide database>

hbarsnes commented 1 year ago

> To get indexed spectra, I have set the output format to .mzML (indexed) in ThermoRawFileParser, this creates a Cement1.mzML file and a Cement1.cms file. However, comet is looking for Cement1.mzml and fails.

Thanks for finding this one! ThermoRawFileParser does indeed write the file as .mzML by default, not .mzml. This is not an issue on Windows, but clearly it is on Linux (and probably OSX). I've now added code that overrides the default output file name so that the filename extension is always lower case. You can find a beta version here for testing: https://genesis.ugent.be/maven2/eu/isas/searchgui/SearchGUI/4.2.0-beta/SearchGUI-4.2.0-beta-mac_and_linux.tar.gz
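For anyone who cannot switch to the beta yet, the symlink workaround described earlier in the thread can be scripted. This is only a sketch of that workaround, not the change made in SearchGUI itself; the function name `link_lowercase_extension` is hypothetical. On a case-insensitive file system the lower-case name already resolves to the same file, so no link is created.

```python
from pathlib import Path


def link_lowercase_extension(spectrum_file: Path) -> Path:
    """Make the file reachable under a lower-case extension (e.g. .mzML -> .mzml)."""
    target = spectrum_file.with_suffix(spectrum_file.suffix.lower())
    if target.name == spectrum_file.name:
        return spectrum_file  # extension is already lower case
    if not (target.exists() or target.is_symlink()):
        # Relative link: it points at the sibling file in the same directory.
        target.symlink_to(spectrum_file.name)
    return target
```

Calling it with the path to Cement1.mzML would leave a Cement1.mzml link next to the original file.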

> Now, the next tool that is failing is MetaMorpheus. There, I just get Error:null

Can you check the SearchGUI log file? You'll find it in the SearchGUI-4.1.23\resources folder. It's just named SearchGUI.log. Hopefully there will be a longer stack trace there that will allow us to understand what is happening. Could maybe be related to the mzML vs. mzml issue above?

> Then, there is another error from Tide, seemingly there are too many CPUs (144 threads) on that machine.

This seems to be a hard-coded maximum value in Tide. I just checked that we include the latest version of Tide, and we do. So your only option here would be to reduce the number of CPUs/threads used. See the Edit > Processing Settings option, which defaults to the total number of CPUs on the system.
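To pick a value for that setting on a large machine, the limit from the Tide error message can be applied to the detected CPU count. A minimal sketch, assuming the limit of 64 quoted in the error above; the names `TIDE_MAX_THREADS` and `capped_thread_count` are hypothetical, and the lower bound of 1 is a choice made here rather than something Tide requires.

```python
import os
from typing import Optional

# Upper bound taken from the Tide error message quoted in this thread.
TIDE_MAX_THREADS = 64


def capped_thread_count(available: Optional[int] = None,
                        limit: int = TIDE_MAX_THREADS) -> int:
    """Return a thread count that stays within [1, limit]."""
    if available is None:
        available = os.cpu_count() or 1  # cpu_count() can return None
    return max(1, min(available, limit))
```

On the 144-thread machine discussed here, `capped_thread_count(144)` gives 64, which is the largest value Tide accepts according to the error.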

As a general comment, note that it is not usually necessary to run all of the search engines included in SearchGUI in order to get a good search result. In most cases, running just one or a couple of them will be more than enough.

mdondrup commented 1 year ago

> Can you check the SearchGUI log file? You'll find it in the SearchGUI-4.1.23\resources folder. It's just named SearchGUI.log. Hopefully there will be a longer stack trace there that will allow us to understand what is happening. Could maybe be related to the mzML vs. mzml issue above?

The error is just:

Cannot run program "/Home/ii/michaeld/searchgui/SearchGUI-4.1.24/resources/MetaMorpheus/metamorpheus" (in directory "/Home/ii/michaeld/searchgui/SearchGUI-4.1.24/resources/MetaMorpheus"): error=2, No such file or directory

Indeed, the executable file does not exist, even though there are other Windows-like files (.dll, CMD.exe) in this directory.

> As a general comment, note that it is not usually necessary to run all of the search engines included in SearchGUI in order to get a good search result. In most cases, running just one or a couple of them will be more than enough.

I have, for now, deactivated Tide and MetaMorpheus search and got a search result file (seargui_out.zip). But now I am facing problems importing that into PeptideShaker. Should I open a new issue elsewhere?

hbarsnes commented 1 year ago

> Indeed, the executable file does not exist even though there are other windows-like files (.dll, CMD.exe) in this directory.

It seems like SearchGUI thinks it is running inside a Conda environment: https://github.com/compomics/searchgui/blob/f9243b45a15b145db2fa0523b0698c011f1998c1/src/main/java/eu/isas/searchgui/processbuilders/MetaMorpheusProcessBuilder.java#L219 (CONDA_APP_NAME = "searchgui")

Here is the specific test: https://github.com/compomics/compomics-utilities/blob/a6be2db5f904652c2531ac47cc3eb69f8c16c473/src/main/java/com/compomics/software/CompomicsWrapper.java#L938

Any idea why this is returning true on kjempefuru?

> But now I am facing problems importing that into PeptideShaker. Should I open a new issue elsewhere?

Yes, please open a new issue in the PeptideShaker issue tracker so that we can deal with that separately.

mdondrup commented 1 year ago

> Here is the specific test: https://github.com/compomics/compomics-utilities/blob/a6be2db5f904652c2531ac47cc3eb69f8c16c473/src/main/java/com/compomics/software/CompomicsWrapper.java#L938
>
> Any idea why this is returning true on kjempefuru?

I am using conda even though I didn't install searchgui through it. I think the code checks for the environment variable CONDA_DEFAULT_ENV which gives 'base' for me, and likely does so for most that run conda. If I do unset CONDA_DEFAULT_ENV before running searchgui, MetaMorpheus finishes.
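The same workaround can be applied to a single launch without touching the shell session. This sketch assumes, as suggested above but not confirmed here, that the presence of CONDA_DEFAULT_ENV is what triggers the Conda code path; the function name is hypothetical.

```python
import os
from typing import Dict, Mapping, Optional


def environment_without_conda_marker(
        env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Copy an environment, leaving out CONDA_DEFAULT_ENV.

    Assumption: this variable alone makes the tool believe it runs inside Conda.
    """
    source = os.environ if env is None else env
    return {key: value for key, value in source.items()
            if key != "CONDA_DEFAULT_ENV"}


# Possible use (launch_command stands for however you normally start SearchGUI):
#   subprocess.run(launch_command, env=environment_without_conda_marker())
```

The rest of the conda setup stays intact, since only the child process sees the reduced environment.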

biocc commented 1 year ago

Hi hbarsnes,

I am trying to analyze MS data from unspecific digestion (elastase and alP) with SearchGUI, but the search is interrupted each time. MS-GF+ and X!Tandem were used to search the databases through the SearchGUI platform. The error log is attached: error.txt

hbarsnes commented 1 year ago

@biocc Thanks for letting us know! You can find new beta releases that should solve the problem here: https://genesis.ugent.be/maven2/eu/isas/searchgui/SearchGUI/4.2.0-beta/SearchGUI-4.2.0-beta-windows.zip and https://genesis.ugent.be/maven2/eu/isas/searchgui/SearchGUI/4.2.0-beta/SearchGUI-4.2.0-beta-mac_and_linux.tar.gz

If this is not the case, please open a new issue and include the SearchGUI error log there, as I cannot see how your issue is related to the original issue above.

biocc commented 1 year ago

> @biocc Thanks for letting us know! You can find new beta releases that should solve the problem here: https://genesis.ugent.be/maven2/eu/isas/searchgui/SearchGUI/4.2.0-beta/SearchGUI-4.2.0-beta-windows.zip and https://genesis.ugent.be/maven2/eu/isas/searchgui/SearchGUI/4.2.0-beta/SearchGUI-4.2.0-beta-mac_and_linux.tar.gz
>
> If this is not the case, please open a new issue where you also provide the SearchGUI error log. (As I cannot see that your issue is related to the original issue above?)

Before your reply, I had already solved this problem by using SearchGUI-4.1.7. Thank you all the same.