MassBank / RMassBank

Playground for experiments on the official http://bioconductor.org/packages/devel/bioc/html/RMassBank.html

Include scan range in RMassBank workflow #216

Closed he-ob closed 3 years ago

he-ob commented 5 years ago

See https://github.com/MassBank/MassBank-web/issues/138

@schymane will follow up.

tsufz commented 3 years ago

@pstahlhofen, you can find the scan range in the mzML (Thermo at least):

<scanWindow>
<cvParam cvRef="MS" accession="MS:1000501" name="scan window lower limit" value="100.0" unitCvRef="MS" unitAccession="MS:1000040" unitName="m/z"/>
<cvParam cvRef="MS" accession="MS:1000500" name="scan window upper limit" value="1500.0" unitCvRef="MS" unitAccession="MS:1000040" unitName="m/z"/>
</scanWindow>

Hence, I would prefer an automated procedure that parses it from the mzML files. This still needs to be checked for TOF instruments. @sneumann, any experience with this?
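As a rough illustration of what such automated parsing could look like, here is a minimal sketch in Python using only the standard library. It is not the RMassBank or mzR implementation; the function name `scan_windows` is hypothetical, and the sketch assumes that each `<scanWindow>` carries both limits as `cvParam` children with the accessions shown in the snippet above. Whether TOF exports populate these elements the same way is not verified here.

```python
import io
import xml.etree.ElementTree as ET

# PSI-MS accessions, copied from the mzML snippet above
LOWER = "MS:1000501"  # scan window lower limit
UPPER = "MS:1000500"  # scan window upper limit


def local_name(tag):
    """Drop an XML namespace prefix such as '{http://psi.hupo.org/ms/mzml}'."""
    return tag.rsplit("}", 1)[-1]


def scan_windows(source):
    """Yield (lower, upper) m/z limits for every <scanWindow> in `source`.

    `source` is a file path or a binary file object. The file is streamed,
    so large mzML files are not loaded into memory at once.
    """
    for _, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) != "scanWindow":
            continue
        limits = {}
        for child in elem:
            accession = child.get("accession")
            if local_name(child.tag) == "cvParam" and accession in (LOWER, UPPER):
                limits[accession] = float(child.get("value"))
        # Skip scan windows that do not report both limits
        if LOWER in limits and UPPER in limits:
            yield limits[LOWER], limits[UPPER]
        elem.clear()  # release the element once it has been processed


# Usage with the snippet from this comment (attributes shortened)
snippet = b"""<scanWindow>
<cvParam cvRef="MS" accession="MS:1000501" name="scan window lower limit" value="100.0"/>
<cvParam cvRef="MS" accession="MS:1000500" name="scan window upper limit" value="1500.0"/>
</scanWindow>"""

print(list(scan_windows(io.BytesIO(snippet))))  # [(100.0, 1500.0)]
```

Matching on the accession rather than on the `name` attribute keeps the lookup independent of how a converter spells the human-readable label.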

tsufz commented 3 years ago

and @he-ob?

pstahlhofen commented 3 years ago

@sneumann, can mzR be used to parse the scan range from .mzML files? I already checked the reference manual and ran some trials using the .mzML files in RMassBankData. I ran the following:

mz <- openMSfile('somefile.mzML')
info <- runInfo(mz)
names(info)

which yielded

"scanCount" "lowMz" "highMz" "dStartTime" "dEndTime"
"msLevels" "startTimeStamp"

The reference manual says: "runInfo will show a summary of the experiment as a named list, including scanCount, lowMz, highMz, startMz, endMz, dStartTime and dEndTime". As stated above, lowMz and highMz were present. However, they seemed to contain the lowest and highest m/z of the actual peaks, respectively, rather than the scan range set by the user. startMz and endMz were not returned in my experiment. Are these what I'm looking for? If not, does mzR contain functionality to parse the scan range?

pstahlhofen commented 3 years ago

For .mzML files, the scan range has been included since commit 2d0aaa491bad427751e16a469c32bbecf8303def.