galaxyproteomics / tools-galaxyp

Galaxy Tool Shed repositories maintained and developed by the GalaxyP community
MIT License

PeptideShaker Issue Reading Identification files #412

Open pravs3683 opened 4 years ago

pravs3683 commented 4 years ago

@bgruening @CarlosHorro

Has anyone seen this issue in PeptideShaker before? PeptideShaker Error: "An error occurred while loading the identification files."

It would be great if someone associated with the PeptideShaker wrapper could take a look and fix this issue. Please see the stdout report below:

See error
inflated: SEARCHGUI_IdentificationParameters.par
Path configuration completed.
Path configuration completed.
Fri Dec 06 18:25:54 CET 2019 Unzipping searchgui_input.zip. 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%

Fri Dec 06 18:37:15 CET 2019 Import process for Galaxy_Experiment_2019120618201575652808 (Sample: Sample_2019120618201575652808, Replicate: 1)

Fri Dec 06 18:37:15 CET 2019 Importing sequences from input_database.fasta. Reindexing: input_database.fasta. 10% 20% 30% 40% 50% 60% 70% 80% 90% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
Fri Dec 06 21:31:39 CET 2019 FASTA file import completed.
Fri Dec 06 21:31:39 CET 2019 Establishing local database connection.
Fri Dec 06 21:31:44 CET 2019 Reading identification files.
Fri Dec 06 21:31:44 CET 2019 Parsing Q23605_G2_gr120.comet.pep.xml.
Fri Dec 06 21:32:06 CET 2019 Loading spectra for Q23605_G2_gr120.comet.pep.xml.
Fri Dec 06 21:32:06 CET 2019 Importing Q23605_G2_gr120.mgf 10% 20% 30% 40% 50% 60% 70% 80% 90%
Fri Dec 06 21:32:27 CET 2019 Q23605_G2_gr120.mgf imported. 10% 20% 30% 40% 50% 60% 70% 80% 90%
Fri Dec 06 21:32:27 CET 2019 Collecting peptides to map. 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
Fri Dec 06 21:32:29 CET 2019 Mapping peptides to proteins. 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
Fri Dec 06 21:34:29 CET 2019 Importing PSMs from Q23605_G2_gr120.comet.pep.xml 10% 20% 30% 40% 50% 60% 70% 80% 90%
Fri Dec 06 21:35:06 CET 2019 3581 identified spectra (4.4%) did not present a valid peptide.
Fri Dec 06 21:35:06 CET 2019 77538 of the best scoring peptides were excluded by the import filters:
Fri Dec 06 21:35:06 CET 2019 - 48.3% peptide mapping to both target and decoy.
Fri Dec 06 21:35:06 CET 2019 - 51.0% peptide presenting high mass or isotopic deviation.
Fri Dec 06 21:35:06 CET 2019 Parsing Q23605_G2_gr120.omx.
Fri Dec 06 21:38:27 CET 2019 Loading spectra for Q23605_G2_gr120.omx. 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
Fri Dec 06 21:38:27 CET 2019 Collecting peptides to map. 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
Fri Dec 06 21:38:29 CET 2019 Mapping peptides to proteins. 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
Fri Dec 06 21:40:05 CET 2019 Importing PSMs from Q23605_G2_gr120.omx 10% 20% 30% 40% 50% 60% 70% 80% 90%
Fri Dec 06 21:40:17 CET 2019 656 identified spectra (0.8%) did not present a valid peptide.
Fri Dec 06 21:40:17 CET 2019 9439 of the best scoring peptides were excluded by the import filters:
Fri Dec 06 21:40:17 CET 2019 - 40.9% peptide mapping to both target and decoy.
Fri Dec 06 21:40:17 CET 2019 - 23.1% peptide length less than 6 or greater than 50.
Fri Dec 06 21:40:17 CET 2019 - 36.0% peptide presenting high mass or isotopic deviation.
Fri Dec 06 21:40:17 CET 2019 Parsing Q23605_G2_gr120.t.xml.
Fri Dec 06 21:49:34 CET 2019 Loading spectra for Q23605_G2_gr120.t.xml. 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
Fri Dec 06 21:49:34 CET 2019 Collecting peptides to map. 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
Fri Dec 06 21:49:34 CET 2019 Mapping peptides to proteins. 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
Fri Dec 06 21:51:33 CET 2019 Importing PSMs from Q23605_G2_gr120.t.xml 10% 20% 30% 40% 50% 60% 70% 80% 90%
Fri Dec 06 21:51:40 CET 2019 6439 identified spectra (8.4%) did not present a valid peptide.
Fri Dec 06 21:51:40 CET 2019 16728 of the best scoring peptides were excluded by the import filters:
Fri Dec 06 21:51:40 CET 2019 - 100.0% peptide mapping to both target and decoy.
Fri Dec 06 21:51:40 CET 2019 File import completed. 235230 first hits imported (1976104 secondary) from 81931 spectra.
Fri Dec 06 21:51:40 CET 2019 [224403 first hits passed the initial filtering]
Fri Dec 06 21:51:40 CET 2019 Computing assumptions probabilities. 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 110% 120% 130% 140% 150% 160% 170% 180% 190% 200%
Fri Dec 06 21:51:42 CET 2019 Saving assumptions probabilities. 10% 20% 30% 40% 50% 60% 70% 80% 90%
Fri Dec 06 21:52:01 CET 2019 Selecting best peptide per spectrum. 10% 20% 30% 40% 50% 60% 70% 80% 90%
Fri Dec 06 21:53:49 CET 2019 Computing PSM probabilities. 10% 20% 30% 40% 50% 60% 70% 80% 90%
Fri Dec 06 21:53:49 CET 2019 Scoring PTMs in PSMs (D-score and A-score)
Fri Dec 06 21:53:49 CET 2019 Scoring PSM PTMs. Please Wait... 10% 20% 30% 40% 50% 60% 70% 80% 90%
Fri Dec 06 21:54:06 CET 2019 Resolving peptide inference issues.
Fri Dec 06 21:54:06 CET 2019 Peptide Inference. Please Wait... 10% 20% 30% 40% 50% 60% 70% 80% 90%
Fri Dec 06 21:54:35 CET 2019 Saving probabilities, building peptides and proteins.
Fri Dec 06 21:54:35 CET 2019 Attaching Spectrum Probabilities - Building Peptides and Proteins. Please Wait... 10% 20% 30% 40% 50% 60% 70% 80% 90%
Fri Dec 06 21:54:46 CET 2019 Simplifying protein groups.
Fri Dec 06 21:54:46 CET 2019 Symplifying Protein Groups. Please Wait... 10% 20% 30% 40%
Fri Dec 06 21:57:35 CET 2019 Removing Mapping Artifacts. Please Wait...
Fri Dec 06 21:57:35 CET 2019 723 unlikely protein mappings found:
Fri Dec 06 21:57:35 CET 2019 - 1218 protein groups supported by non-enzymatic shared peptides.
Fri Dec 06 21:57:35 CET 2019 - 12 protein groups explained by peptides shared to less confident mappings.
Fri Dec 06 21:57:35 CET 2019 - 355 groups explained by a simpler group.
Fri Dec 06 21:57:35 CET 2019 Note: a group can present combinations of these criteria.
Fri Dec 06 21:57:35 CET 2019 Generating peptide map.
Fri Dec 06 21:57:35 CET 2019 Filling Peptide Maps. Please Wait... 10% 20% 30% 40%
Fri Dec 06 21:57:37 CET 2019 Computing peptide probabilities.
Fri Dec 06 21:57:37 CET 2019 Estimating Probabilities. Please Wait... 10% 20% 30% 40% 50% 60% 70%
Fri Dec 06 21:57:37 CET 2019 Saving peptide probabilities.
Fri Dec 06 21:57:37 CET 2019 Attaching Peptide Probabilities. Please Wait... 10% 20% 30% 40% 50% 60% 70% 80% 90%
Fri Dec 06 21:57:37 CET 2019 Generating protein map.
Fri Dec 06 21:57:37 CET 2019 Filling Protein Map. Please Wait... 10% 20% 30% 40% 50% 60% 70% 80% 90%
Fri Dec 06 21:57:45 CET 2019 Resolving protein inference issues, inferring peptide and protein PI status.
Fri Dec 06 21:57:45 CET 2019 Simplifying Redundant Protein Groups. Please Wait... 10%
**Fri Dec 06 22:08:34 CET 2019 Inferring PI status, sorting proteins. Please Wait...

PeptideShaker processing failed. See the PeptideShaker log for details.

Fri Dec 06 22:08:38 CET 2019 An error occurred while loading the identification files.
Fri Dec 06 22:08:38 CET 2019 Please see the error log (Help Menu > Bug Report) for details.
Fri Dec 06 22:08:44 CET 2019 PeptideShaker Processing Canceled.

PeptideShaker processing canceled. Please see the PeptideShaker log file: /data/dnb02/galaxy_db/job_working_directory/006/486/6486267/working/bin/resources/PeptideShaker.log**
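
For anyone triaging a similar report: Galaxy's stdout for this tool tends to arrive as one long run of text, so it can help to split it on the timestamps and look at the step that was running just before the error. The following is a minimal sketch, not part of the wrapper. The function names are hypothetical, and the regular expression only assumes the timestamp layout seen in the report above (for example `Fri Dec 06 22:08:38 CET 2019`).

```python
import re

# Timestamp layout as printed in the stdout above, e.g. "Fri Dec 06 22:08:38 CET 2019".
# Assumption: every log entry starts with a timestamp of this shape.
TIMESTAMP = re.compile(
    r"[A-Z][a-z]{2} [A-Z][a-z]{2} \d{2} \d{2}:\d{2}:\d{2} [A-Z]{3,4} \d{4}"
)


def split_entries(stdout_text):
    """Split a flattened log into (timestamp, message) pairs."""
    matches = list(TIMESTAMP.finditer(stdout_text))
    entries = []
    for i, match in enumerate(matches):
        # The message runs until the next timestamp, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(stdout_text)
        entries.append((match.group(0), stdout_text[match.end():end].strip()))
    return entries


def last_step_before_error(entries, marker="An error occurred"):
    """Return the entry just before the first message containing `marker`."""
    for i, (_, message) in enumerate(entries):
        if marker in message:
            return entries[i - 1] if i > 0 else None
    return None


# Three entries copied from the end of the report above, joined as they were posted.
sample = (
    "Fri Dec 06 21:57:45 CET 2019 Simplifying Redundant Protein Groups. Please Wait... 10% "
    "Fri Dec 06 22:08:34 CET 2019 Inferring PI status, sorting proteins. Please Wait... "
    "Fri Dec 06 22:08:38 CET 2019 An error occurred while loading the identification files."
)
entries = split_entries(sample)
print(last_step_before_error(entries))
```

On the excerpt used here, the step immediately before the error is "Inferring PI status, sorting proteins", which comes well after the identification files were parsed, even though the message mentions loading them.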
pravs3683 commented 4 years ago

@bgruening @CarlosHorro

One update: I have been investigating this issue. After unzipping the SearchGUI archive file, I found that it contained no mzid file (the MSGF+ output), even though I had used 4 search engines including MSGF+. I also checked other SearchGUI archives, from runs on smaller datasets, in which I had used MSGF+ together with the other engines used here, and those archives do contain an mzid file.

It seems that when SearchGUI is run with MSGF+ on a very large database, MSGF+ fails to produce its output (the mzid file). I am running some more tests: SearchGUI is currently running on the same dataset, this time without MSGF+. I will post the outcome soon. Hope this helps.
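
The archive check described above can be scripted so it is easy to repeat on other runs. This is a rough sketch under stated assumptions: the suffixes come from the file names in the log (`.comet.pep.xml`, `.omx`, `.t.xml`) plus `.mzid` for MSGF+, the mapping from suffix to engine is my reading of those names rather than anything taken from SearchGUI itself, and the function name is hypothetical. It also assumes the result files sit directly inside the zip, optionally gzipped.

```python
import io
import zipfile

# Assumed mapping from search engine to result-file suffix (see lead-in).
ENGINE_SUFFIXES = {
    "Comet": ".comet.pep.xml",
    "OMSSA": ".omx",
    "X!Tandem": ".t.xml",
    "MSGF+": ".mzid",
}


def missing_engine_outputs(archive, expected=ENGINE_SUFFIXES):
    """Return the engines whose result file is absent from a SearchGUI zip.

    `archive` may be a path or a file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        names = [name.lower() for name in zf.namelist()]
    missing = []
    for engine, suffix in expected.items():
        # Accept both plain and gzipped result files.
        found = any(n.endswith(suffix) or n.endswith(suffix + ".gz") for n in names)
        if not found:
            missing.append(engine)
    return sorted(missing)


# Build a small in-memory archive that mimics the failing run:
# three engine outputs present, no mzid file.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    for name in (
        "Q23605_G2_gr120.comet.pep.xml",
        "Q23605_G2_gr120.omx",
        "Q23605_G2_gr120.t.xml",
    ):
        zf.writestr(name, "")
buf.seek(0)

missing = missing_engine_outputs(buf)
print(missing)
```

Running it against a real archive only needs the path, for example `missing_engine_outputs("searchgui_input.zip")`; an empty list would mean every expected engine left a result file.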