OpenMS / OpenMS

The codebase of the OpenMS project
https://www.openms.de

NASequence outputs wrong masses / molecular formulas for w and x ions when using phosphorothioates #6687

Closed Paeger closed 1 year ago

Paeger commented 1 year ago

Using pyopenms_nightly-3.0.0.dev20230125 on Windows 10

Code to reproduce error:

from pyopenms import NASequence

oligo_mod = NASequence.fromString("[dA*][dA*][dA]")
seq_formula = oligo_mod.getFormula()

print("RNA Oligo", oligo_mod, "has molecular formula",
  seq_formula, "and length", oligo_mod.size())

print("-"*35)
print(f"Mono isotopic weight: {oligo_mod.getMonoWeight()}")
print(f"Average weight: {oligo_mod.getAverageWeight()}")

suffix = oligo_mod.getSuffix(2)
charge = -1

w2_mass = suffix.getMonoWeight(NASequence.NASFragmentType.WIon, charge)
w2_formula = suffix.getFormula(NASequence.NASFragmentType.WIon, charge)
mz = w2_mass / charge
print("w2- ion", suffix, "has mz", -mz)
print("w2- ion", suffix, "has molecular formula", w2_formula)

x2_mass = suffix.getMonoWeight(NASequence.NASFragmentType.XIon, charge)
x2_formula = suffix.getFormula(NASequence.NASFragmentType.XIon, charge)
mz_x = x2_mass / charge
print("x2- ion", suffix, "has mz", -mz_x)
print("x2- ion", suffix, "has molecular formula", x2_formula)

y2_mass = suffix.getMonoWeight(NASequence.NASFragmentType.YIon, charge)
y2_formula = suffix.getFormula(NASequence.NASFragmentType.YIon, charge)
mz_y = y2_mass / charge
print("y2- ion", suffix, "has mz", -mz_y)
print("y2- ion", suffix, "has molecular formula", y2_formula)

which outputs:

RNA Oligo [dA][dA][dA] has molecular formula C30H37N15O11P2S2 and length 3
Mono isotopic weight: 909.1713656203
Average weight: 909.7901821802013
w2- ion [dA][dA] has mz 659.0956580874043 ----> should be 675.07
w2- ion [dA][dA] has molecular formula C20H25N10O10P2S1 ----> (missing 1x sulfur, 1x oxygen too many)
x2- ion [dA][dA] has mz 641.0850930236044 ----> should be 657.06
x2- ion [dA][dA] has molecular formula C20H23N10O9P2S1 ----> (missing 1x sulfur, 1x oxygen too many)
y2- ion [dA][dA] has mz 579.1293265655044 ----> Correct
y2- ion [dA][dA] has molecular formula C20H24N10O7P1S1 ----> Correct

z- ion is also correct...
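The gap between the reported and expected m/z values can be checked by hand: replacing one oxygen with one sulfur shifts the monoisotopic mass by roughly 15.977 Da. The sketch below is plain arithmetic and does not call pyOpenMS; the isotope masses are standard values for 16O and 32S, and the helper name `swap_one_o_for_s` is made up for illustration.

```python
# Monoisotopic masses (Da) of the most abundant isotopes.
MONO_O16 = 15.99491462
MONO_S32 = 31.97207117

# m/z values printed by the reproduction script above (charge -1).
reported_w2 = 659.0956580874043
reported_x2 = 641.0850930236044


def swap_one_o_for_s(mass: float) -> float:
    """Hypothetical helper: mass after exchanging one O atom for one S atom."""
    return mass - MONO_O16 + MONO_S32


corrected_w2 = swap_one_o_for_s(reported_w2)
corrected_x2 = swap_one_o_for_s(reported_x2)

print(f"w2-: {corrected_w2:.2f}")
print(f"x2-: {corrected_x2:.2f}")
```

Both corrected values land on the expected 675.07 and 657.06, which is consistent with exactly one sulfur having been replaced by oxygen in the w and x fragments.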

poshul commented 1 year ago

Hi @Paeger, I will take a look at this tomorrow and see where the issue is coming from. It's too late in the day here for tracing down that chemistry tonight.

poshul commented 1 year ago

Okay, I've identified the issue: there was a problem with how we handled thiols at the 5' end of NASequences produced by getSuffix. I'm putting together a fix now.
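One way to see the 5' end problem in the numbers from this thread is to subtract the (correct) y2 formula from the reported w2 formula. This is only a sketch based on the formulas posted above, not on the OpenMS source; `parse_formula` and `formula_difference` are hypothetical helpers, and the "expected" w2 formula is simply the reported one with one O swapped for one S, as described in the report.

```python
import re
from collections import Counter


def parse_formula(formula: str) -> Counter:
    """Parse a flat formula such as 'C20H24N10O7P1S1' into element counts."""
    counts = Counter()
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(count) if count else 1
    return counts


def formula_difference(a: str, b: str) -> dict:
    """Element-wise a - b, keeping only elements whose counts differ."""
    pa, pb = parse_formula(a), parse_formula(b)
    return {el: pa[el] - pb[el]
            for el in sorted(set(pa) | set(pb)) if pa[el] != pb[el]}


y2 = "C20H24N10O7P1S1"            # reported as correct
reported_w2 = "C20H25N10O10P2S1"  # reported as wrong
expected_w2 = "C20H25N10O9P2S2"   # reported formula with one O replaced by S

reported_delta = formula_difference(reported_w2, y2)
expected_delta = formula_difference(expected_w2, y2)
print(reported_delta)
print(expected_delta)
```

The reported difference is HPO3, an ordinary phosphate, while the expected difference is HPO2S, a thiophosphate. That matches the description above: the extra 5' group on the suffix fragment lost its sulfur.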

poshul commented 1 year ago

@Paeger this is fixed in the latest nightly. Thanks again for finding it, and the detailed write-up.

Paeger commented 1 year ago

Hi @poshul, many thanks for the quick fix. I will give it a go tomorrow.

Paeger commented 1 year ago

@poshul, w and x ions are now correct. I realized, though, that the masses of (a-B) ions are wrong. I will open a new issue.