Open IgnacioJPickering opened 3 years ago
Based upon the scale of the m/z deltas your reporting, what you're seeing is isotopic fine structure, where the delta m/z between peaks is the difference between the masses of neutrons of different elements rather than differences between number of extra neutrons.
brainpy
implements the BRAIN algorithm which collapses fine structure, as described in “BRAIN: a universal tool for high-throughput calculations of the isotopic distribution for mass spectrometry.,” Anal. Chem., vol. 85, no. 4, pp. 1991–4, Feb. 2013.
If you want a tool to generate fine structure, check out https://github.com/MatteoLacki/IsoSpec which has Python bindings at IsoSpecPy
already available on the Python package index.
Isotopic fine structure is only visible above a certain resolution, and will depend upon your instrument. Do you know what the settings on your instrument (detector type, resolution) and whether you see larger isotopologues or isotopomers?
Thank you for the prompt response!
I'm not exactly sure what you mean by "difference in masses of neutrons", as I said I'm not knowledgable about this topic, but my guess is that the deviation I'm seeing, which is about ~ 7.5 m/z is a bit too large. I don't really care about anything after the first 3-4 sig figs (i.e. I care if the number is 924 m/z or 931 m/z but nothing else) I don't really need that level of accuracy, my issue is that brainpy predicts about ~ 931 m/z as the largest peak, but even a naive calculation of the m/z by using the masses of the most abundant isotopes of each element gives a value between 922-924 m/z in this case.
I knew BRAIN implemented an approximation but I didn't know it would be possible for it to fail to this extent (~ 7.5 m/z or 1% of the m/z value), but maybe I'm misunderstanding the use case of this algorithm, I'm guessing this makes much more sense being designed for much larger biomolecules / proteins, and not so much for small organic molecules such as the one I'm trying to predict.
I used the package since the example shown in the brainpy's README plots a MS spectra of a relatively small organic molecule at approximately this scale, (800 - 805 m/z) so I recommend some sort of disclaimer is added to the documentation if these differences are to be expected.
Thanks a lot for the IsoSpec recommendation! I will definitely check that out, and thank you again for helping with this issue.
Also, I wanted to add that the difference of m/z between peaks is pretty accurate, the issue is that all peaks seem to be displaced by ~ 7 m/z, which lead my further to believe that this is some sort of bug that is displacing values by ~ 7 in this case, instead of an artifact of some approximation in the algorithm, which I would guess would probably give very small random deviations for each peak, but again, maybe this is just a consequence of this "fine structure collapse" that I'm not fully grasping.
I attach for comparison the values predicted by the applet using a low resolution spectra, as you can see even in this case the maximum peak is ~ 925 m/z, pretty far from brainpy's prediction of 931
917 0.9
918 0.4
919 9.9
920 12.3
921 39.6
922 28.7
923 92.5
924 49.2
925 100
926 47.9
927 54.3
928 23.8
929 15.3
930 5.8
931 2
932 0.6
933 0.1
Thank you for reporting the issue, and then clarifying it when I dismissed it in haste.
I was paying too much attention to that second peak list thinking you were talking about distance between peaks. Now I understand it's that the theoretical most abundant peak is wrong. I think the issue is related to Selenium, not an element I've dealt with at this level of fundamentals.
Visualizing the pattern for the composition you're using:
Selenium's isotopes are weird:
In [11]: Se = brainpy.periodic_table['Se']
In [12]: Se.isotopes
Out[12]:
OrderedDict([(-6,
Isotope(abundance=0.0089, neutrons=74, mass=73.9224764, neutron_shift=-6)),
(-4,
Isotope(abundance=0.0937, neutrons=76, mass=75.9192136, neutron_shift=-4)),
(-3,
Isotope(abundance=0.0763, neutrons=77, mass=76.919914, neutron_shift=-3)),
(-2,
Isotope(abundance=0.2377, neutrons=78, mass=77.9173091, neutron_shift=-2)),
(0,
Isotope(abundance=0.4961, neutrons=80, mass=79.9165213, neutron_shift=0)),
(2,
Isotope(abundance=0.0873, neutrons=82, mass=81.9166994, neutron_shift=2))])
If I try to simplify the problem, I can isolate it:
In [33]: brainpy.isotopic_variants({"Se": 1}, npeaks=10)
Out[33]:
[Peak(mz=79.916521, intensity=0.086745, charge=0),
Peak(mz=82.075165, intensity=0.913255, charge=0)]
Note how the minimal mass peak (the "monoisotopic" peak) has a mass equal to the true monoisotopic mass for Se[78], but has the abundance of Se[74], while
I suspect that there is an error in the logic somewhere under https://github.com/mobiusklein/brainpy/blob/master/brainpy/_c/isotopic_constants.pyx#L274, probably in https://github.com/mobiusklein/brainpy/blob/master/brainpy/_c/isotopic_constants.pyx#L102 where it assumes that the lowest isotope is 0 or at least non-negative, which is not the case for Se.
Unfortunately, I have a pretty full log of work right now. If IsoSpec can do the job for you, I'd rather you use that than wait while I try to fix this now, as it'd take me a while just to get to this.
Ahh, I see what the issue may be! I think you narrowed it down pretty well. IsoSpec does the job for me right now, so I've switched to that, I'm also pretty busy but in the future if you can't find time to fix this I may do a PR, since I like the documentation/API of brainpy a bit more.
Hi both,
I know I'm about 2 years late on this, but wanted to say I really appreciate this package and that I think I have a resolution to this. I seemed to notice this behavior with iron (in heme) and found that there were a couple things to resolve (in the pure pythonic):
max_n_variants += count * (periodic_table[element].max_neutron_shift() - periodic_table[element].min_neutron_shift())
low_mass = self._isotopic_constants[element].element.min_neutron_shift() center += self.composition[element] * sign * ele_sym_poly[i] *\ self.monoisotopic_peak.intensity * self._isotopic_constants[element].element.isotopes[low_mass].mass
This should allow you to undo the comment for Fe if I'm not mistaken.
Anyway, thanks again for sharing this, hope this captures your intent.
Thank you for the suggested solution. Is this the region you are referring to modifying for the second point?
https://github.com/mobiusklein/brainpy/blob/master/src/brainpy/brainpy.py#L576-L583
for element, ele_sym_poly in ele_sym_poly_map.items():
center += self.composition[element] * sign * ele_sym_poly[i] *\
self.monoisotopic_peak.intensity * self._isotopic_constants[element].element.monoisotopic_mass()
Yes! That's correct. Apologies for the delay and for not linking the code directly.
I'm not a mass spec person, but I've recently had to analyza a compound with formula that has multiple copper and sulphur atoms, and brainpy gives me pretty large deviations from the experimental spectra; in particular, I get a ~ 7.5 m/z deviation for all peaks in the spectrum. These deviations are not observed in other spectra MS prediction software, such as the online applet https://www.sisweb.com/mstools/isotope.htm In particular, for a molecular formula H60C46Cu3Se1N3 calling brainpy with:
Produces as a result
It seems to me that brainpy is predicting 931.64 to be the m/z with the maximum intensity, but https://www.sisweb.com/mstools/isotope.htm using high resolution, H60C46Cu3Se1N3 and minimum abundance of 0.01% predicts
(note that these are only some example peaks which contain the max value) which means 924.1874 is the m/z with maximum intensity, and these results agree 100% with the experimental spectrum, which makes me suspect a bug in brainpy.