dhmay / param-medic

Param-Medic breathes new life into MS/MS database searches by optimizing parameter settings to your data.

Error processing 2016_Jan_12_QE2_51.mzML #3

Closed edeutsch closed 5 years ago

edeutsch commented 5 years ago

Hi, I have installed Param-Medic (starting from trunk in the repo) and it works great on most of the example mzML files at https://noble.gs.washington.edu/proj/param-medic/ . I get what appears to be good output for 9 of the 11 mzML files (Linux

But I get an error on 2016_Jan_12_QE2_51.mzML

python --version
Python 2.7.16

/proteomics/sw/python/param-medic/bin/param-medic 2016_Jan_12_QE2_51.mzML
2019-08-20 21:16:40,986 INFO: Processing input file 2016_Jan_12_QE2_51.mzML...
2019-08-20 21:16:47,316 INFO: processed 0 total spectra in 6.3 seconds...
2019-08-20 21:16:49,008 INFO: processed 1000 total spectra in 8.0 seconds...
2019-08-20 21:16:50,127 INFO: processed 2000 total spectra in 9.1 seconds...
2019-08-20 21:16:50,996 INFO: processed 3000 total spectra in 10.0 seconds...
...
2019-08-20 21:18:01,622 INFO: processed 90000 total spectra in 80.6 seconds...
2019-08-20 21:18:02,839 INFO: processed 91000 total spectra in 81.9 seconds...
Did not find spectra from known charges, so looking for unknown-charge spectra.
Charge 0
Spectra in same averagine bin as another: 0
... and also within m/z tolerance: 0
... and also within scan range: 0
... and also with sufficient in-common fragments: 0

Charge 2
Spectra in same averagine bin as another: 0
... and also within m/z tolerance: 0
... and also within scan range: 0
... and also with sufficient in-common fragments: 0

Charge 3
Spectra in same averagine bin as another: 0
... and also within m/z tolerance: 0
... and also within scan range: 0
... and also with sufficient in-common fragments: 0

Charge 4
Spectra in same averagine bin as another: 0
... and also within m/z tolerance: 0
... and also within scan range: 0
... and also with sufficient in-common fragments: 0

2019-08-20 21:18:04,869 INFO: Precursor and fragment error summary:
2019-08-20 21:18:04,869 INFO: Precursor error calculation failed:
2019-08-20 21:18:04,869 INFO: Need >= 200 peak pairs to fit mixed distribution. Got only 0. Details:
Charge 0
Spectra in same averagine bin as another: 0
... and also within m/z tolerance: 0
... and also within scan range: 0
... and also with sufficient in-common fragments: 0
Charge 2
Spectra in same averagine bin as another: 0
... and also within m/z tolerance: 0
... and also within scan range: 0
... and also with sufficient in-common fragments: 0
Charge 3
Spectra in same averagine bin as another: 0
... and also within m/z tolerance: 0
... and also within scan range: 0
... and also with sufficient in-common fragments: 0
Charge 4
Spectra in same averagine bin as another: 0
... and also within m/z tolerance: 0
... and also within scan range: 0
... and also with sufficient in-common fragments: 0

2019-08-20 21:18:04,869 INFO: Fragment error calculation failed:
2019-08-20 21:18:04,869 INFO: Need >= 200 peak pairs to fit mixed distribution. Got only 0 Details:

2019-08-20 21:18:06,290 WARNING: SILAC: No counts for any control separation pairs! Cannot estimate prevalence of SILAC separations.
2019-08-20 21:18:06,291 INFO: iTRAQ: no reporter ions detected
2019-08-20 21:18:06,292 INFO: TMT: no reporter ions detected
2019-08-20 21:18:06,292 INFO: Phosphorylation: not detected
2019-08-20 21:18:06,292 INFO: No modifications detected requiring search parameter changes.
file precursor_prediction_ppm precursor_sigma_ppm fragment_prediction_th fragment_sigma_ppm SILAC_4Da_present SILAC_4Da_statistic SILAC_6Da_present SILAC_6Da_statistic SILAC_8Da_present SILAC_8Da_statistic SILAC_10Da_present SILAC_10Da_statistic iTRAQ_8plex_present iTRAQ_8plex_statistic iTRAQ_4plex_present iTRAQ_4plex_statistic TMT_6plex_present TMT_6plex_statistic TMT_10plex_present TMT_10plex_statistic TMT_2plex_present TMT_2plex_statistic phospho_present phospho_statistic
2016_Jan_12_QE2_51.mzML ERROR ERROR ERROR ERROR ERROR ERROR ERROR ERROR ERROR ERROR ERROR ERROR F -1.1940 F -0.9100 F 0.4535 F 0 F -1.1629 F -0.4832

Is this expected or unexpected behavior? I.e., is this a "bad" data file that demonstrates what happens with some type of bad data (with no charges)?

Peeking in the mzML, it looks a little questionable. There appears to be a charge state, but no selected ion m/z. Bad data?

...

....

Bug or bad data?
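The observation above (a charge state present, but no selected ion m/z) can be checked across a whole file rather than by eye. The following is a minimal sketch using only the Python standard library. It assumes the standard mzML namespace and the PSI-MS accessions MS:1000744 (selected ion m/z) and MS:1000041 (charge state); the function name `count_precursor_annotations` is hypothetical and is not part of Param-Medic. It says nothing about how Param-Medic itself reads precursors.

```python
import xml.etree.ElementTree as ET

# Assumptions: standard mzML namespace and PSI-MS controlled-vocabulary accessions.
MZML_NS = "{http://psi.hupo.org/ms/mzml}"
ACC_SELECTED_ION_MZ = "MS:1000744"  # selected ion m/z
ACC_CHARGE_STATE = "MS:1000041"     # charge state


def count_precursor_annotations(mzml_source):
    """Stream an mzML file (path or binary file object) and tally
    which annotations each <selectedIon> element carries."""
    counts = {
        "selected_ions": 0,
        "with_mz": 0,
        "with_charge": 0,
        "charge_without_mz": 0,
    }
    for _event, elem in ET.iterparse(mzml_source, events=("end",)):
        if elem.tag == MZML_NS + "selectedIon":
            # Collect the accessions of all cvParam children of this selected ion.
            accessions = {
                cv.get("accession") for cv in elem.iter(MZML_NS + "cvParam")
            }
            has_mz = ACC_SELECTED_ION_MZ in accessions
            has_charge = ACC_CHARGE_STATE in accessions
            counts["selected_ions"] += 1
            counts["with_mz"] += int(has_mz)
            counts["with_charge"] += int(has_charge)
            counts["charge_without_mz"] += int(has_charge and not has_mz)
        elif elem.tag == MZML_NS + "spectrum":
            # Free each finished spectrum so large files stay within memory.
            elem.clear()
    return counts
```

Calling `count_precursor_annotations("2016_Jan_12_QE2_51.mzML")` and comparing `charge_without_mz` against `selected_ions` would show whether the missing m/z affects every precursor or only some. Running the same check on one of the files that works would give a baseline.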

dhmay commented 5 years ago

Good catch. Param-Medic should work fine on the .ms2 version of the same file. We provided both for completeness, but the .ms2 version is the one we used for training. Looks like the mzML version has problems.

I don't have access to fix those files any more, but I'll pass on your comment.

dhmay commented 5 years ago

Bad file is removed.