compomics / peptide-shaker

Interpretation of proteomics identification results
http://compomics.github.io/projects/peptide-shaker.html
Missing spectrum titles in MS Amanda output #515

Closed JuneCoco closed 1 year ago

JuneCoco commented 1 year ago

I am using PeptideShaker from the command line to analyze mzid files generated by the standalone version of MS Amanda 2.0. I placed all identification files, spectrum files, and the FASTA file in the same directory. However, an error message appears saying that it cannot find the mzid files. Could you please help me figure it out? Thanks.

This is the command I am using:

java -cp ./PeptideShaker-2.2.23/PeptideShaker-2.2.23.jar eu.isas.peptideshaker.cmd.PeptideShakerCLI -reference EL4 -fasta_file ./PXD009064/EL4_4__EL4PP97_id_139_concatenated_target_decoy.fasta -identification_files "./PXD009064/EL4_4K_R1_250M_280214_1_output_1.mzid.gz,./PXD009064/EL4_4K_R2_250M_280214_1_output_1.mzid.gz,./PXD009064/EL4_4K_R3_250M_280214_1_output_1.mzid.gz" -spectrum_files "./PXD009064/EL4_4K_R1_250M_280214_1.mzML,./PXD009064/EL4_4K_R2_250M_280214_1.mzML,./PXD009064/EL4_4K_R3_250M_280214_1.mzML" -id_params ./PXD009064/test5.par -out ./PXD009064/pepshk -output_mgf 1 -project_type 1

This is the output:

Wed May 24 17:38:23 CST 2023 Import process for EL4
Wed May 24 17:38:23 CST 2023 Importing sequences from EL4_4__EL4PP97_id_139_concatenated_target_decoy.fasta.
Wed May 24 17:38:34 CST 2023 Importing gene mappings.
Wed May 24 17:41:13 CST 2023 Establishing local database connection.
Wed May 24 17:41:13 CST 2023 Reading identification files.
Wed May 24 17:41:13 CST 2023 Parsing EL4_4K_R1_250M_280214_1_output_1.mzid.gz. 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
Wed May 24 17:41:17 CST 2023 Checking spectra for EL4_4K_R1_250M_280214_1_output_1.mzid.gz.
Wed May 24 17:41:17 CST 2023 Spectrum with title '' in file named 'EL4_4K_R1_250M_280214_1' required to parse 'EL4_4K_R1_250M_280214_1_output_1.mzid.gz' not found.
Wed May 24 17:41:17 CST 2023 PeptideShaker Processing Canceled.

PeptideShaker processing canceled. Please see the PeptideShaker log file: /home/PeptideShaker-2.2.23/resources/PeptideShaker.log

hbarsnes commented 1 year ago

If I remember correctly, PeptideShaker does not support the parsing of mzid files from MS Amanda. The reason is that MS Amanda maps identifications back to the spectra in a way that we are not able to interpret correctly. That should be the reason for the error you are seeing:

Spectrum with title '' in file named 'EL4_4K_R1_250M_280214_1' required to parse 'EL4_4K_R1_250M_280214_1_output_1.mzid.gz' not found.

In other words, it's not the spectrum file that is not being found but rather one of the spectra referred to in the mzid file that cannot be found in the provided mzML files.

Can you try switching to the MS Amanda tsv output and see if that solves the problem?

JuneCoco commented 1 year ago

I believe that PeptideShaker should be able to read the mzid format, because SearchGUI writes MS Amanda results as mzid.gz files in searchgui_out.zip. I also ran into a problem when I tried to use a CSV file as the identification input: PeptideShaker reported that it could not recognize the file. Moreover, MS Amanda only offers csv and mzid as output formats.

hbarsnes commented 1 year ago

PeptideShaker does support mzid files from other search engines, just not the ones from MS Amanda (again, if I remember correctly).

If you rename the .csv files to .ms-amanda.csv, they should load in PeptideShaker. This is how we label them in SearchGUI, and it is how PeptideShaker knows that they come from MS Amanda.
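The renaming step can be scripted when there are many result files. The following is a minimal sketch, not part of PeptideShaker or SearchGUI: the function name `add_ms_amanda_suffix` is hypothetical, and it assumes that every plain .csv file in the given directory is MS Amanda output.

```python
from pathlib import Path


def add_ms_amanda_suffix(directory):
    """Rename every plain *.csv file in `directory` to *.ms-amanda.csv.

    Assumption: all .csv files in the directory are MS Amanda results.
    Returns the list of new paths.
    """
    renamed = []
    for csv_path in sorted(Path(directory).glob("*.csv")):
        # Leave files alone that already carry the expected suffix.
        if csv_path.name.endswith(".ms-amanda.csv"):
            continue
        target = csv_path.with_name(csv_path.stem + ".ms-amanda.csv")
        # Refuse to overwrite an existing file.
        if target.exists():
            raise FileExistsError(str(target))
        csv_path.rename(target)
        renamed.append(target)
    return renamed


# Example call (the directory is taken from the command earlier in this thread):
# add_ms_amanda_suffix("./PXD009064")
```

The renamed files can then be passed to `-identification_files` in place of the original .csv paths.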

JuneCoco commented 1 year ago

I renamed the CSV file to .ms-amanda.csv, but the error message still appears. Is there anything wrong with what I have done?

Tue May 30 18:25:29 CST 2023 Spectrum with title '' in file named 'MHC_10H118_Rep1' required to parse 'MHC_10H118_Rep1_output.ms-amanda.csv' not found.
Tue May 30 18:25:29 CST 2023 PeptideShaker Processing Canceled.

hbarsnes commented 1 year ago

Would it be possible for you to share the files so that I can try to reproduce the issue on my side and hopefully find a solution?

JuneCoco commented 1 year ago

I have sent the test files to your email address (harald.barsnes@uib.no). I hope this will help. Thanks a lot.

hbarsnes commented 1 year ago

Thanks for sharing the files!

The issue is that the spectrum title column is empty in the MS Amanda csv file. Hence the issue is in MS Amanda and not in PeptideShaker, as there is no way we can map the peptide identifications back to the spectra without the spectrum title. (I'm assuming the same is the case for the mzIdentML files from MS Amanda as well, but I would need the files to confirm.)

I don't see any obvious issues with the mzML files.

You could try converting the mzML files to MGF and running the search again to see if that helps. Or simply switch to a different search engine.

You can also contact the MS Amanda developers directly here: https://groups.google.com/g/msamanda. Maybe they have seen the problem before and know how to fix it.
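To check whether other MS Amanda result files have the same empty spectrum title column before loading them, something like the sketch below may help. It is only an illustration under assumptions that should be verified against the actual files: that the file is tab-delimited, that header comment lines start with `#`, and that the title column is named `Title`. The function name `count_empty_titles` is hypothetical.

```python
import csv


def count_empty_titles(path, title_column="Title", delimiter="\t"):
    """Return (rows_without_title, total_rows) for a delimited results file.

    Assumptions: tab-delimited text, optional leading comment lines that
    start with '#', and a title column called "Title". Adjust as needed.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        # Drop comment lines (e.g. a version header) before parsing.
        lines = (line for line in handle if not line.startswith("#"))
        reader = csv.DictReader(lines, delimiter=delimiter)
        if reader.fieldnames is None or title_column not in reader.fieldnames:
            raise KeyError(f"column {title_column!r} not found in {path}")
        empty = total = 0
        for row in reader:
            total += 1
            # A missing or whitespace-only value counts as an empty title.
            if not (row[title_column] or "").strip():
                empty += 1
    return empty, total


# Example call (file name taken from the error message earlier in this thread):
# empty, total = count_empty_titles("MHC_10H118_Rep1_output.ms-amanda.csv")
```

If the first number equals the second, none of the identifications carries a spectrum title, which matches the `Spectrum with title ''` error shown earlier in this thread.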

hbarsnes commented 1 year ago

Issue closed as the solution seems to be in MS Amanda and not in PeptideShaker.