chhh / MSFTBX

MS File ToolBox - tools for parsing some mass-spectrometry related file formats (mzML, mzXML, pep.xml, prot.xml, etc.)
Apache License 2.0

Get MS3 spectrum parent scan index #11

Closed: wenbostar closed this issue 5 years ago

wenbostar commented 5 years ago

I have an mzML file that contains both MS2 and MS3 spectra, and I use the code below to extract the MS3 spectra. For each MS3 spectrum, I want the scan number of its parent MS2 spectrum. This information is in the "precursorList" section of the MS3 spectrum. Which function should I use?

MZMLFile source = new MZMLFile("test.mzML");
LCMSRunInfo lcmsRunInfo = null;
try {
    lcmsRunInfo = source.fetchRunInfo();
} catch (FileParsingException e) {
    e.printStackTrace();
}
source.setNumThreadsForParsing(cores);
MZMLIndex mzMLindex = null;
try {
    mzMLindex = source.fetchIndex();
} catch (FileParsingException e) {
    e.printStackTrace();
}

IScanCollection scans;
scans = new ScanCollectionDefault(true);
scans.setDataSource(source);
scans.loadData(LCMSDataSubset.WHOLE_RUN, StorageStrategy.STRONG);

TreeMap<Integer, IScan> num2scanMap = scans.getMapNum2scan();
Set<Map.Entry<Integer, IScan>> num2scanEntries = num2scanMap.entrySet();

int total_spectra = 0;

for (Map.Entry<Integer, IScan> next : num2scanEntries) {
    IScan scan = next.getValue();
    if (scan.getSpectrum() != null) {
        if (scan.getMsLevel() == 3) {
            // extract the scan number of the MS2 spectrum from which this MS3 spectrum was derived
        }
    }
}


Here is an example raw file that contains both MS2 and MS3 spectra: https://www.ebi.ac.uk/pride/data/archive/2019/07/PXD010557/PT6374-1.raw.

chhh commented 5 years ago

The scan IDs in mzML are actually free text. Their format is not standardized, and you can even change how the IDs are generated when converting to mzML with ProteoWizard. So I wouldn't recommend parsing them directly: these things change over time and will vary depending on the version of msconvert used for the conversion.
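To illustrate why direct parsing is fragile, here is a minimal sketch of what it would look like. It assumes the reference string uses the Thermo-style native ID layout ("controllerType=0 controllerNumber=1 scan=N"); that layout is an assumption about how a given file was converted, not something mzML guarantees. The class and method names are hypothetical and not part of MSFTBX. Anything that does not match the assumed layout yields an empty result instead of a guessed number.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of MSFTBX.
final class NativeIdScanNumber {
    // Assumption: the ID contains a whitespace-delimited "scan=<digits>" token.
    private static final Pattern SCAN_TOKEN =
            Pattern.compile("(?:^|\\s)scan=(\\d+)(?:\\s|$)");

    // Returns the scan number if the assumed token is present, otherwise empty.
    static OptionalInt parse(String spectrumRef) {
        if (spectrumRef == null) {
            return OptionalInt.empty();
        }
        Matcher m = SCAN_TOKEN.matcher(spectrumRef);
        if (!m.find()) {
            return OptionalInt.empty(); // different ID scheme: do not guess
        }
        try {
            return OptionalInt.of(Integer.parseInt(m.group(1)));
        } catch (NumberFormatException e) {
            return OptionalInt.empty(); // digits do not fit in an int
        }
    }
}
```

A file converted with a different ID scheme (for example one that only carries "index=12") simply produces an empty result here, which is the failure mode to expect whenever the converter settings change.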

Also, it is not guaranteed that those "parent" scans are even in the mzML file anymore. Some people cut out all MS1 scans during conversion, for example.

That being said, the PrecursorInfo class has two methods for this:

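// getNum2scan() maps scan number -> scan, so take the first MS3 entry instead of looking up key 0
IScan ms3scan = scans.getMapMsLevel2index().get(3).getNum2scan().firstEntry().getValue();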
PrecursorInfo precursor = ms3scan.getPrecursor();
String parentScanRefRaw = precursor.getParentScanRefRaw();
Integer parentScanNum = precursor.getParentScanNum();
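
Putting the caveats above together, the loop from the question could guard against both a missing parent reference and a parent scan that was removed from the file. The sketch below is self-contained and uses a hypothetical stand-in class (ScanStub) instead of the real IScan, so it only shows the shape of the checks. With MSFTBX, the two stand-in fields would presumably come from scan.getMsLevel() and scan.getPrecursor().getParentScanNum(); whether those return null for a given file is an assumption to verify.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not part of MSFTBX.
final class Ms3ParentLookup {
    // Stand-in for the two pieces of scan information used below.
    static final class ScanStub {
        final int msLevel;
        final Integer parentScanNum; // null when no parent could be resolved

        ScanStub(int msLevel, Integer parentScanNum) {
            this.msLevel = msLevel;
            this.parentScanNum = parentScanNum;
        }
    }

    // Maps each MS3 scan number to its parent scan number, keeping only
    // parents that are actually present in the loaded scan map.
    static Map<Integer, Integer> ms3ToParent(TreeMap<Integer, ScanStub> num2scan) {
        Map<Integer, Integer> result = new LinkedHashMap<>();
        for (Map.Entry<Integer, ScanStub> entry : num2scan.entrySet()) {
            ScanStub scan = entry.getValue();
            if (scan.msLevel != 3) {
                continue; // only MS3 scans are of interest
            }
            Integer parent = scan.parentScanNum;
            if (parent == null) {
                continue; // reference missing or not parseable
            }
            if (!num2scan.containsKey(parent)) {
                continue; // parent scan was cut out of the file
            }
            result.put(entry.getKey(), parent);
        }
        return result;
    }
}
```

Skipping silently is only one option; depending on the analysis, it may be better to log or count the MS3 scans whose parent could not be found.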

wenbostar commented 5 years ago

Thank you so much for your detailed explanation. It's very helpful.