sneumann / Rdisop

This is the git repository matching the Bioconductor package Rdisop: Decomposition of Isotopic Patterns
4 stars 9 forks source link

getMolecule produces wrong mass sometimes #22

Open ricoderks opened 4 years ago

ricoderks commented 4 years ago

Hi,

I using getMolecule() to calculate the exact masses of lipids and now I noticed that it sometimes proceduces the wrong mass.

Example:

> getMolecule("C73H134O17P2")$exactmass
[1] 1344.91
> getMolecule("C89H166O17P2")$exactmass
[1] 1570.163

The first one is correct, but the last one should be 1569.1600 (from LipidMaps.org)

The complete output is:

getMolecule("C89H166O17P2")
$formula
[1] "C89H166O17P2"

$score
[1] 1

$exactmass
[1] 1570.163

$charge
[1] 0

$parity
[1] "e"

$valid
[1] "Valid"

$DBE
[1] 8

$isotopes
$isotopes[[1]]
             [,1]         [,2]         [,3]         [,4]         [,5]         [,6]         [,7]         [,8]         [,9]        [,10]
[1,] 1569.1600285 1570.1634601 1571.1667309 1.572170e+03 1.573173e+03 1.574176e+03 1.575179e+03 1.576182e+03 1.577185e+03 1.578188e+03
[2,]    0.3499957    0.3574394    0.1925211 7.235706e-02 2.121983e-02 5.154573e-03 1.076076e-03 1.979199e-04 3.264850e-05 4.894935e-06

In the isotopes part it shows the correct mass!

Can you help me with this? I don't understand why this is happening.

Cheers, Rico

sneumann commented 4 years ago

I guess this is the mass of the most abundant isotope peak, which is not the monoisotopic.

That certainly would be a bug. I think there was a discussion with Corey Broeckling a while ago. Help welcome.

Yours Steffen

I blame Android for the brevity and typos

ricoderks commented 4 years ago

Hi Steffen,

I'll have a look at the code. If it is in R I might be of help. If it is c++ then it gets difficult. My c++ skill are more or less non-existing.

Cheers, Rico

ricoderks commented 4 years ago

Hi Steffen,

Tried the whole day to figure out where the monoisotopic mass was selected. I tried to understand the c++ code, but it's to complex for me to understand where this happens. Sorry, I couldn't help.

Cheers, Rico

papanlipotype commented 4 years ago

Hi Steffen I tried to compare the results with the output from a commercial software, Thermo Freestyle

The $exactmass-value is not correct, however, the $isotopes m/z values are more correct: the monoisotopic peak is fine, but the first isotope has a larger error. Also the isotope distribution seems to diffeer between the Thermo output and the Rdisop-output. Are there any updates on this issue? Thanks Cyrus

grafik

getMolecule("C89H166O17P2", maxisotopes = 4)$isotopes [[1]] [,1] [,2] [,3] [,4] [1,] 1569.1600285 1570.1634601 1571.1667309 1.572170e+03 [2,] 0.3599619 0.3676175 0.1980032 7.441743e-02

sneumann commented 4 years ago

Hi, I am afraid there is no update. There was a previous discussion in #8 and I am considering to end-of-life Rdisop, see #23 Yours, Steffen.

janlisec commented 1 month ago

@ricoderks The issue should be solved in PR #32. @papanlipotype I can confirm, that enviPat would produce a isotope abundance rather similar to the Thermo output you provided.

         mz    int
1 1569.1600 1.0000
2 1570.1634 0.9888

which is reasonable since the isotopic fine structure of the example molecule is not too complicated for the first mass isotopomer.

             m/z      abundance 31P  1H 2H 12C 13C 16O 18O 17O
[1,] 1569.160027 100.0000000000   2 166  0  89   0  17   0   0
[2,] 1570.163382  96.2599818053   2 166  0  88   1  17   0   0
[3,] 1570.164244   0.6475736039   2 166  0  89   0  16   0   1
[4,] 1570.166304   1.9092195602   2 165  1  89   0  17   0   0

I tracked the reason down to the precision of the abundances specified within Rdisop.

# test formula
fml <- "C89H166O17P2"

# elements as defined by Rdisop
ele_Rdisop <- initializeElements(c("H","C","O","P"))

# elements as defined by enviPat or CorMID (higher precision)
iso <- CorMID::isotopes
ele_enviPat <- lapply(c("H","C","O","P"), function(x) {
    m <- round(iso[iso$element==x, "mass"][1])
    c(
        list("name" = x),
        list("mass" = m),
        list("isotope" = c(
            list("mass" = iso[iso$element==x,"mass"]-m),
            list("abundance" = iso[iso$element==x,"abundance"])
        ))
    )
})

# compare the results
Rdisop::getMolecule(formula = fml, elements = ele_Rdisop, maxisotopes = 2)$isotopes
Rdisop::getMolecule(formula = fml, elements = ele_enviPat, maxisotopes = 2)$isotopes

which yields for Rdisop:

                [,1]            [,2]
[1,] 1569.1600285200 1570.1634600615
[2,]    0.4947389648    0.5052610352

an for the more precise elemental definitions:

               [,1]           [,2]
[1,] 1569.160026832 1571.163443775
[2,]    0.502975667    0.497024333

@sneumann We can either live with the small error resulting from low precision in elemental definitions or we could update to more precise values, i.e. using NIST or enviPat as a source.

Bests, Jan