compomics / ThermoRawFileParser

Thermo RAW file parser that runs on Linux/Mac and all other platforms that support Mono
Apache License 2.0

Isolation window changes? #152

Closed edeutsch closed 1 year ago

edeutsch commented 1 year ago

Hi, I looked through the issues and didn't see this discussed, so just want to be sure I understand:

I was just comparing a file that we converted using TRFP 1.3.2 and again with 1.4.2, and one substantial difference I notice is in the isolation window offsets. With 1.3.2:

                <cvParam cvRef="MS" accession="MS:1000828" value="0.5" name="isolation window lower offset" unitAccession="MS:1000040" unitName="m/z" unitCvRef="MS" />
                <cvParam cvRef="MS" accession="MS:1000829" value="0.5" name="isolation window upper offset" unitAccession="MS:1000040" unitName="m/z" unitCvRef="MS" />

vs. with 1.4.2:

                <cvParam cvRef="MS" accession="MS:1000828" value="1" name="isolation window lower offset" unitAccession="MS:1000040" unitName="m/z" unitCvRef="MS" />
                <cvParam cvRef="MS" accession="MS:1000829" value="1" name="isolation window upper offset" unitAccession="MS:1000040" unitName="m/z" unitCvRef="MS" />

So it looks like the isolation offsets have doubled. Comparing with msconvert, it also has 1 as the value, so it appears that the new values match msconvert.

So is it true to say that for all TRFP conversions performed prior to v1.4.1, the isolation windows were too small by a factor of 2? Or is it more nuanced than that? The release notes say "Fixed a bug in handling isolation offset in mzML output". Can this be documented more specifically? It will be important to know if thousands of our previously converted data files have incorrect information.

thanks!

caetera commented 1 year ago

Hi @edeutsch, the error that was fixed in 1.4.1 is only relevant if an isolation offset is used in the raw file, i.e. the center of the isolation window is shifted from the isolated mass. Support for that feature was introduced in 1.4.0 (4e3b469). The implementation in 1.4.0 (code below) had an error: the shifted offset was used for both the upper and lower offset values in mzML.

                // Half of the isolation width plus the shift of the window center
                var offset = isolationWidth.Value / 2 + reaction.IsolationWidthOffset;
                precursor.isolationWindow.cvParam[1] =
                    new CVParamType
                    {
                        accession = "MS:1000828",
                        name = "isolation window lower offset",
                        // Error in 1.4.0: the shifted value is written as the lower offset...
                        value = offset.ToString(CultureInfo.InvariantCulture),
                        cvRef = "MS",
                        unitCvRef = "MS",
                        unitAccession = "MS:1000040",
                        unitName = "m/z"
                    };
                precursor.isolationWindow.cvParam[2] =
                    new CVParamType
                    {
                        accession = "MS:1000829",
                        name = "isolation window upper offset",
                        // ...and the same value is written again as the upper offset
                        value = offset.ToString(CultureInfo.InvariantCulture),
                        cvRef = "MS",
                        unitCvRef = "MS",
                        unitAccession = "MS:1000040",
                        unitName = "m/z"
                    };

The error has been corrected in 1.4.1 by 3d899ea
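
To make the arithmetic concrete, here is a small Python sketch of the two behaviours. It is not the code from 3d899ea: the function names are hypothetical, and it assumes that a positive isolation offset shifts the window center upward, so that the lower and upper offsets differ but still sum to the full isolation width.

```python
def buggy_offsets(isolation_width, isolation_width_offset):
    """Mirror the 1.4.0 snippet: one shifted value is written to both offsets."""
    offset = isolation_width / 2 + isolation_width_offset
    return offset, offset  # (lower, upper)


def split_offsets(isolation_width, isolation_width_offset):
    """Assumed intent: distinct lower/upper offsets that sum to the full width."""
    upper = isolation_width / 2 + isolation_width_offset
    lower = isolation_width - upper
    return lower, upper


# Illustrative values only: a 2 Th window whose center is shifted by 0.5 Th.
print(buggy_offsets(2.0, 0.5))
print(split_offsets(2.0, 0.5))
```

With a zero isolation offset both functions return the same symmetric pair, which is consistent with the statement above that the 1.4.0 error only matters when an isolation offset is used.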


It has been a while since 1.3.2, so it is hard to say what the reason for the different output is. After a quick (and, honestly, rather shallow) dissection I can pinpoint ae7b0b087 (included in 1.3.3). The change prioritizes isolation window information from the scan trailer (if it is present) over the information stored in the Reaction object provided by the API. To the best of my knowledge, the trailer was used to store isolation information in older (LTQ-XXX generation) instruments; more modern instruments (QE and Fusion generation) do not use the trailer method anymore. Thus, it could be that the trailer and Reaction were reporting different values on older instruments.
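
The priority rule described above can be sketched in Python as follows. This is only an illustration of the idea, not the TRFP implementation: `pick_isolation_width` and `symmetric_offsets` are hypothetical names, and the widths used in the example are illustrative.

```python
from typing import Optional, Tuple


def pick_isolation_width(trailer_width: Optional[float],
                         reaction_width: Optional[float]) -> Optional[float]:
    """Prefer the scan trailer value when present, else fall back to Reaction."""
    if trailer_width is not None:
        return trailer_width
    return reaction_width


def symmetric_offsets(width: float) -> Tuple[float, float]:
    """Split a full isolation width into equal lower and upper offsets."""
    half = width / 2
    return half, half


# Illustrative case: the trailer and Reaction disagree about the width.
width = pick_isolation_width(trailer_width=2.0, reaction_width=1.0)
print(width, symmetric_offsets(width))
```

If the two sources disagree, the choice of source alone changes the written offsets, without any isolation offset being involved.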

It would help if we could pin down (at least) the version in which the change happened. Would it be possible to share a representative raw file and (preferably) a scan?


(Edited) Release notes in Wiki have been updated with some extra details.

edeutsch commented 1 year ago

Thanks for the response, but then that seems like it wouldn't account for what I'm seeing. There seems to be no offset in my file. I'm seeing this: v1.3.2:

              <isolationWindow>
                <cvParam cvRef="MS" accession="MS:1000827" value="441.297119140625" name="isolation window target m/z" unitAccession="MS:1000040" unitName="m/z" unitCvRef="MS" />
                <cvParam cvRef="MS" accession="MS:1000828" value="0.5" name="isolation window lower offset" unitAccession="MS:1000040" unitName="m/z" unitCvRef="MS" />
                <cvParam cvRef="MS" accession="MS:1000829" value="0.5" name="isolation window upper offset" unitAccession="MS:1000040" unitName="m/z" unitCvRef="MS" />
              </isolationWindow>
              <selectedIonList count="1">
                <selectedIon>
                  <cvParam cvRef="MS" accession="MS:1000744" value="441.297119140625" name="selected ion m/z" unitAccession="MS:1000040" unitName="m/z" unitCvRef="MS" />
                  <cvParam cvRef="MS" accession="MS:1000041" value="2" name="charge state" />
                </selectedIon>
              </selectedIonList>

and v1.4.2:

              <isolationWindow>
                <cvParam cvRef="MS" accession="MS:1000827" value="441.297119140625" name="isolation window target m/z" unitAccession="MS:1000040" unitName="m/z" unitCvRef="MS" />
                <cvParam cvRef="MS" accession="MS:1000828" value="1" name="isolation window lower offset" unitAccession="MS:1000040" unitName="m/z" unitCvRef="MS" />
                <cvParam cvRef="MS" accession="MS:1000829" value="1" name="isolation window upper offset" unitAccession="MS:1000040" unitName="m/z" unitCvRef="MS" />
              </isolationWindow>
              <selectedIonList count="1">
                <selectedIon>
                  <cvParam cvRef="MS" accession="MS:1000744" value="441.297119140625" name="selected ion m/z" unitAccession="MS:1000040" unitName="m/z" unitCvRef="MS" />
                  <cvParam cvRef="MS" accession="MS:1000041" value="2" name="charge state" />
                  <cvParam cvRef="MS" accession="MS:1000042" value="8781.63757324219" name="peak intensity" unitAccession="MS:1000131" unitName="number of detector counts" unitCvRef="MS" />
                </selectedIon>
              </selectedIonList>

ah, I now see more in your edited post. The file is from an LTQ Orbitrap, it is:

mzspec:PXD001168:20070625_03_Ti1

ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2015/04/PXD001168/20070625_03_Ti1.RAW

This seems to happen in all scans in the file.
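
For anyone who wants to run the same comparison on their own conversions, a minimal Python sketch along these lines could pull the isolation window values out of mzML text. The function names are hypothetical, and the two fragments are trimmed copies of the snippets above; a real mzML file carries an XML namespace, which the sketch strips from tag names.

```python
import xml.etree.ElementTree as ET

TARGET = "MS:1000827"  # isolation window target m/z
LOWER = "MS:1000828"   # isolation window lower offset
UPPER = "MS:1000829"   # isolation window upper offset


def local_name(tag):
    """Drop a '{namespace}' prefix from an element tag, if there is one."""
    return tag.rsplit("}", 1)[-1]


def isolation_windows(mzml_text):
    """Return one (target, lower, upper) tuple per isolationWindow element."""
    windows = []
    for elem in ET.fromstring(mzml_text).iter():
        if local_name(elem.tag) != "isolationWindow":
            continue
        values = {}
        for child in elem:
            accession = child.get("accession")
            if local_name(child.tag) == "cvParam" and accession in (TARGET, LOWER, UPPER):
                values[accession] = float(child.get("value"))
        windows.append((values.get(TARGET), values.get(LOWER), values.get(UPPER)))
    return windows


FRAGMENT = """<isolationWindow>
  <cvParam cvRef="MS" accession="MS:1000827" value="441.297119140625" name="isolation window target m/z" />
  <cvParam cvRef="MS" accession="MS:1000828" value="{0}" name="isolation window lower offset" />
  <cvParam cvRef="MS" accession="MS:1000829" value="{0}" name="isolation window upper offset" />
</isolationWindow>"""

old = isolation_windows(FRAGMENT.format("0.5"))  # as written by v1.3.2
new = isolation_windows(FRAGMENT.format("1"))    # as written by v1.4.2
for (target, lo_a, up_a), (_, lo_b, up_b) in zip(old, new):
    print(target, "width", lo_a + up_a, "->", lo_b + up_b)
```

For full-size files, reading with `ET.parse` or `ET.iterparse` instead of `ET.fromstring` would avoid holding the whole document as a string.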

edeutsch commented 1 year ago

So the high-level summary seems to be:

For LTQ Orbitrap type instruments, prior to TRFP v1.3.3, if there is conflicting information in the trailer and Reaction data in the RAW file, then the recorded isolation width might be incorrect in the mzML file (too small by a factor of 2 in one example)

?

caetera commented 1 year ago

I processed the file with all versions starting from 1.3.2 and can confirm that the change was introduced in 1.3.3. I can also confirm that (at least for this specific file) the isolation width should be 2 Th. It is the one used in the stored MS method and matches the value provided in the trailer. Thus, my initial guess was likely correct. It is, however, unclear where the value stored in Reaction is coming from (is 1 Th the fallback value?).

High-level summary: the recorded isolation width can be incorrect for TRFP < 1.3.3 and older-generation instruments (the ones using the trailer for isolation width information). I cannot, however, say whether it will be smaller or larger, or to what extent.
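
As a rough aid for auditing previously converted files, the two conditions discussed in this thread could be encoded like this in Python. Everything here is a hypothetical helper, not part of TRFP: the caller has to supply the TRFP version string and decide, per file, whether the instrument stores isolation information in the trailer and whether an isolation offset was used.

```python
def parse_version(text):
    """Turn a version string such as '1.3.2' or 'v1.3.2' into a tuple of ints."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def isolation_caveats(trfp_version, uses_trailer_for_isolation, uses_isolation_offset):
    """List the isolation window caveats from this thread that may apply."""
    version = parse_version(trfp_version)
    caveats = []
    # Before 1.3.3 the Reaction value was used even when the trailer had one.
    if version < (1, 3, 3) and uses_trailer_for_isolation:
        caveats.append("isolation width may be incorrect (trailer vs. Reaction)")
    # 1.4.0 wrote the same shifted value to both offsets; corrected in 1.4.1.
    if version[:3] == (1, 4, 0) and uses_isolation_offset:
        caveats.append("lower and upper offsets share one shifted value")
    return caveats


print(isolation_caveats("1.3.2", uses_trailer_for_isolation=True, uses_isolation_offset=False))
print(isolation_caveats("1.4.2", uses_trailer_for_isolation=True, uses_isolation_offset=True))
```

A non-empty result only says a file is worth re-checking; as noted above, the direction and size of any discrepancy cannot be predicted from the version alone.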

edeutsch commented 1 year ago

Excellent, thank you for helping me understand the issue. Closing.