CDK-R / cdkr

Integrating R and the CDK
https://cdk-r.github.io/cdkr/

similarity of isotope pattern #39

Closed michaelwitting closed 3 months ago

michaelwitting commented 7 years ago

Dear rcdk-Team,

is it possible to have the IsotopePatternSimilarity function in rcdk?

http://cdk.github.io/cdk/1.5/docs/api/org/openscience/cdk/formula/IsotopePatternSimilarity.html

Best regards,

Michael

rajarshi commented 7 years ago

Sure, this is relatively straightforward. I've pushed a commit to master that provides a method called compare.isotope.pattern, along with some helper methods. An example use would be:

f1 <- get.mol2formula(parse.smiles('CCNC')[[1]])@objectJ
f2 <- get.mol2formula(parse.smiles('CCNC')[[1]])@objectJ
isoGen <- get.isotope.pattern.generator()
ip1 <- isoGen$getIsotopes(f1)
ip2 <- isoGen$getIsotopes(f2)
compare.isotope.pattern(ip1,ip2)

Currently the methods aren't documented.

Also, I don't do much mass spec/structure elucidation work, so I'm not too familiar with this part of the CDK. If you have a workflow or use case, that would help in designing how the R code should wrap the CDK functionality.

michaelwitting commented 7 years ago

Hi. Regardless of which two formulas I use for f1 and f2, I always get 0 as the result of compare.isotope.pattern.

Here is a possible use case for the comparison. I have a measured isotope pattern from an MS analysis. First, it has to be converted to the isotope pattern class of rcdk, and then it is compared against the isotope patterns of the calculated formulae.

I hope that helps.

# get measured isotope pattern ([Glucose + Na]+)
isoPatternMeasured <- data.frame(mz = c(203.052609, 204.056051, 205.057227, 206.060394, 207.061845), int = c(100.000, 6.856, 1.433, 0.087, 0.009))

# get monoisotopic mass
exactmass <- isoPatternMeasured$mz[1]

# calculate all possible formulae
formulae <- generate.formula(exactmass, window = 0.001, elements = list(c("C",0,10),c("H",0,50), c("N",0,5),c("O",0,50), c("Na",0,1)), validation = TRUE, charge = 1)

# calculate the isotope pattern of the first candidate formula
isoPatternCalculated <- get.isotopes.pattern(formulae[[1]], minAbund = 0.001)
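
To make the intended last step of this use case concrete, here is a minimal base-R sketch of scoring a measured pattern against a calculated one. It is only an illustration: `pattern_similarity` and its `tol` parameter are hypothetical names, not part of rcdk or the CDK, and the score is a plain cosine similarity over peaks paired by m/z, which is not the algorithm used by the CDK's IsotopePatternSimilarity class. It assumes both inputs are two-column tables with m/z in the first column and intensity in the second.

```r
# Hypothetical helper (not rcdk/CDK API): cosine similarity between two
# isotope patterns after pairing peaks by m/z within an absolute tolerance.
pattern_similarity <- function(measured, calculated, tol = 0.005) {
  # Accept data frames or matrices; column 1 = m/z, column 2 = intensity.
  measured   <- as.matrix(measured)
  calculated <- as.matrix(calculated)

  # Scale each pattern so that its most intense peak is 1.
  m_int <- measured[, 2] / max(measured[, 2])
  c_int <- calculated[, 2] / max(calculated[, 2])

  # For each measured peak: index of the nearest calculated peak,
  # or NA if no calculated peak lies within tol.
  # (Simplification: two measured peaks may pair with the same calculated peak.)
  nearest <- vapply(measured[, 1], function(mz) {
    d <- abs(calculated[, 1] - mz)
    if (min(d) <= tol) unname(which.min(d)) else NA_integer_
  }, integer(1))

  matched        <- !is.na(nearest)
  unmatched_calc <- setdiff(seq_along(c_int), nearest[matched])

  # Aligned intensity vectors: matched pairs first, then unmatched peaks
  # from either side paired with zero.
  a <- c(m_int[matched], m_int[!matched], rep(0, length(unmatched_calc)))
  b <- c(c_int[nearest[matched]], rep(0, sum(!matched)), c_int[unmatched_calc])

  sum(a * b) / sqrt(sum(a^2) * sum(b^2))
}

# Sanity check: a pattern compared with itself scores 1 (up to rounding).
isoPatternMeasured <- data.frame(
  mz  = c(203.052609, 204.056051, 205.057227, 206.060394, 207.061845),
  int = c(100.000, 6.856, 1.433, 0.087, 0.009)
)
pattern_similarity(isoPatternMeasured, isoPatternMeasured)

# With rcdk loaded, and assuming get.isotopes.pattern returns a two-column
# (mass, abundance) table, the comparison from the use case would be:
# pattern_similarity(isoPatternMeasured, isoPatternCalculated)
```

Looping this over every entry of `formulae` and sorting by score would rank the candidate formulae against the measured pattern, which is the workflow I have in mind for the wrapped CDK function.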