MassBank / RMassBank

Playground for experiments on the official http://bioconductor.org/packages/devel/bioc/html/RMassBank.html

Add check against peaks > precursor to validation suite #57

Closed: sneumann closed this issue 10 years ago

sneumann commented 10 years ago

Issue by sneumann from Wednesday Jan 15, 2014 at 09:46 GMT. Originally opened as https://github.com/sneumann/RMassBank/issues/49


Emma requested another check in #43; we're just extracting that into a new issue.

"Also, add a feature in the validation Suite to catch this occurrence (peaks > precursor). "

We should then check against all of MassBank OpenData whether this occurs legitimately somewhere, because 1) there are isotope peaks (so maybe only check peaks > precursor+4), and 2) there are doubly charged precursors, but according to http://www.massbank.jp/manuals/MassBankRecord_en.pdf there are no [...]2+, so it might be a non-problem.
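A minimal sketch of the comparison being discussed, with the isotope margin as a parameter. The function name, the `margin` argument and the example m/z values are made up for illustration; this is not RMassBank code.

```python
def peaks_above_precursor(peak_mzs, precursor_mz, margin=0.0):
    """Return the m/z values lying more than `margin` above the precursor.

    margin=0.0 flags every peak heavier than the precursor;
    margin=4.0 is the "precursor+4" idea, which lets isotope peaks pass.
    """
    return [mz for mz in peak_mzs if mz > precursor_mz + margin]


# Hypothetical peak list for a precursor at m/z 207.138.
peaks = [150.0, 207.1372, 208.1405, 215.5]

strict = peaks_above_precursor(peaks, 207.138)               # also flags the +1 peak
lenient = peaks_above_precursor(peaks, 207.138, margin=4.0)  # only far-off peaks
```

With the margin set to 4, a peak one mass unit above the precursor is tolerated, while anything well beyond the isotope range is still reported.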

sneumann commented 10 years ago

Comment by schymane from Wednesday Jan 15, 2014 at 13:12 GMT


So, we have a few separate issues:
1) Detect the case of assigning formulas to noise peaks > precursor in "old" RMassBank versions via the reanalyzepeaks option in the validation suite.
2) Fix the code to prevent this in the future (send reanalyzepeaks > precursor into the fail peak list for manual checking; Michele to do, see #43).
3) Extend RMB to deal with isotopes, see #50.

wrt 1), check for records processed with RMassBank equal to or lower than the current version number, e.g.

MS$DATA_PROCESSING: WHOLE RMassBank (old style, no version number)
MS$DATA_PROCESSING: WHOLE RMassBank 1.3.1

and then check for peaks > precursor mass:

MS$FOCUSED_ION: PRECURSOR_M/Z 207.138
PK$ANNOTATION: m/z num {formula mass error(ppm)}
  207.1372 1 C13H19O2+ 207.138 -3.7
  209.0114 1 C12H3NO3+ 209.0107 3.28
PK$PEAK: m/z int. rel.int.
  207.1372 3170.4 66
  209.0114 3973.1 82

and warn the user to check. That'd catch cases with the current issue. And since RMB can't yet deal with isotopes, there is no need to overcomplicate and look for patterns at this stage; any spectra that had been processed with earlier RMB versions will have meaningless formulas everywhere.
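The check described above could be sketched roughly as follows. This is only an illustration in Python, not the validation suite itself: `check_record`, `parse_version` and the `max_version` default are hypothetical names, and the parser assumes a single numeric PRECURSOR_M/Z value and indented peak lines under PK$PEAK, as in the excerpt.

```python
import re

# Record excerpt taken from the example in this comment.
RECORD = """\
MS$DATA_PROCESSING: WHOLE RMassBank 1.3.1
MS$FOCUSED_ION: PRECURSOR_M/Z 207.138
PK$PEAK: m/z int. rel.int.
  207.1372 3170.4 66
  209.0114 3973.1 82
//
"""


def parse_version(text):
    """Turn '1.3.1' into (1, 3, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def check_record(record_text, max_version="1.3.1"):
    """Return peak m/z values above the precursor for affected records.

    A record counts as affected if it was processed with RMassBank and
    carries either no version number (old style) or one <= max_version.
    """
    rmb_version = None  # None: no RMassBank processing line seen
    precursor = None
    peaks = []
    in_peaks = False
    for line in record_text.splitlines():
        if in_peaks and line.startswith(" "):
            peaks.append(float(line.split()[0]))  # first column is m/z
            continue
        in_peaks = line.startswith("PK$PEAK:")
        if line.startswith("MS$DATA_PROCESSING: WHOLE RMassBank"):
            match = re.search(r"RMassBank\s+(\d+(?:\.\d+)*)", line)
            rmb_version = match.group(1) if match else ""  # "": old style
        elif line.startswith("MS$FOCUSED_ION: PRECURSOR_M/Z"):
            precursor = float(line.split()[-1])
    if rmb_version is None or precursor is None:
        return []
    if rmb_version and parse_version(rmb_version) > parse_version(max_version):
        return []
    return [mz for mz in peaks if mz > precursor]


for mz in check_record(RECORD):
    print(f"WARNING: peak {mz} is above the precursor, please check")
```

Records without an RMassBank processing line, or with a version newer than `max_version`, are skipped, so the warning only fires for spectra that could have been affected.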

MS/MS isolation width isn't yet in the MassBank records. Is this possible? Should we add it? Will potentially be needed for #50.

sneumann commented 10 years ago

There are two other issues tracking this: isotopes #58 and restrict analyse peaks in #51 so this issue can be closed. The "assigning formulas to noise peaks > precursor" was worked around differently.