rformassspectrometry / MetaboCoreUtils

Core utilities for metabolomics.
https://rformassspectrometry.github.io/MetaboCoreUtils/index.html

Formula functions and adduct formula generation #12

Open michaelwitting opened 4 years ago

michaelwitting commented 4 years ago

Add functions to

michaelwitting commented 4 years ago

Where to put? chemformula.R?

jorainer commented 4 years ago

yep, that sounds reasonable.

michaelwitting commented 4 years ago

What about a function that generates an ion formula from a neutral formula and an adduct definition? E.g. C6H12O6 and [M+Na]+ give C6H12O6Na or [C6H12O6Na]1+.
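To make the idea concrete, here is a minimal base-R sketch of how such a function could work. The names `parse_formula`, `adducts` and `ion_formula` are hypothetical and chosen only for illustration; this is not the MetaboCoreUtils implementation. It assumes plain formulas without brackets, isotope labels or charges, and describes each adduct by the atoms it adds and removes instead of parsing the adduct string.

```r
# Hypothetical helpers for illustration only (not the MetaboCoreUtils API).

# Parse a plain formula such as "C6H12O6" into named integer counts.
# Assumes no brackets, isotope labels or charges.
parse_formula <- function(formula) {
    tokens <- regmatches(formula, gregexpr("[A-Z][a-z]?[0-9]*", formula))[[1]]
    elements <- sub("[0-9]+$", "", tokens)
    counts <- sub("^[A-Za-z]+", "", tokens)
    counts[counts == ""] <- "1"          # a missing count means one atom
    counts <- as.integer(counts)
    # Sum repeated elements, keeping first-appearance order
    vapply(unique(elements), function(el) sum(counts[elements == el]),
           integer(1))
}

# Assumed adduct definitions: atoms added, atoms removed, net charge.
adducts <- list(
    "[M+H]+"  = list(add = "H",  remove = "",  charge = 1L),
    "[M+Na]+" = list(add = "Na", remove = "",  charge = 1L),
    "[M-H]-"  = list(add = "",   remove = "H", charge = -1L)
)

# Combine a neutral formula with an adduct definition into an ion formula.
ion_formula <- function(formula, adduct) {
    def <- adducts[[adduct]]
    if (is.null(def)) stop("unknown adduct: ", adduct)
    counts <- parse_formula(formula)
    for (sign in c(1L, -1L)) {
        delta <- parse_formula(if (sign == 1L) def$add else def$remove)
        for (el in names(delta)) {
            current <- if (el %in% names(counts)) counts[[el]] else 0L
            counts[el] <- current + sign * delta[[el]]
        }
    }
    if (any(counts < 0)) stop("adduct removes atoms the formula lacks")
    counts <- counts[counts > 0]
    body <- paste0(names(counts), ifelse(counts == 1L, "", counts),
                   collapse = "")
    n <- abs(def$charge)
    paste0("[", body, "]", if (n > 1) n else "",
           if (def$charge > 0) "+" else "-")
}

ion_formula("C6H12O6", "[M+Na]+")   # "[C6H12O6Na]+"
```

New elements are appended after those of the neutral formula, which reproduces the C6H12O6Na ordering from the example; a real implementation would probably standardize the element order.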

jorainer commented 4 years ago

That would be great! And we can then also calculate the mass (or m/z) for that (?)

michaelwitting commented 4 years ago

In theory yes. I'm thinking here in the direction of DB creation. Since we already have the tools to calculate m/z from neutral mass (mass2mz), the question is whether we really need it. What I would ultimately like to do with these functions is to create a DB with name, smiles, inchi, inchikey, neutral formula, neutral mass, adduct, mz and ion formula. This should always start from name, smiles, inchi, inchikey and neutral formula. So we could have a function to calculate the mass from the formula. For that we would also need a data.frame with all atomic masses plus the mass of an electron.
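As a rough sketch of the mass calculation described here, the base-R snippet below sums monoisotopic masses over a formula and converts the result to an m/z. The names `mono_mass`, `formula_mass` and `ion_mz` are hypothetical, and `ion_mz` is only a stand-in for illustration, not the existing mass2mz function. The masses are copied from the unimod table quoted further down in this thread and cover only a handful of elements.

```r
# Hypothetical helpers for illustration only (not the MetaboCoreUtils API).

# Monoisotopic masses in Da, taken from the unimod table in this thread.
mono_mass <- c(H = 1.007825, C = 12.000000, N = 14.003074, O = 15.994915,
               Na = 22.989768, P = 30.973762, S = 31.972071)
electron_mass <- 0.000549

# Monoisotopic mass of a plain formula (no brackets, isotopes or charges).
formula_mass <- function(formula) {
    tokens <- regmatches(formula, gregexpr("[A-Z][a-z]?[0-9]*", formula))[[1]]
    elements <- sub("[0-9]+$", "", tokens)
    counts <- sub("^[A-Za-z]+", "", tokens)
    counts[counts == ""] <- "1"          # a missing count means one atom
    unknown <- setdiff(elements, names(mono_mass))
    if (length(unknown))
        stop("no mass for: ", paste(unknown, collapse = ", "))
    sum(mono_mass[elements] * as.integer(counts))
}

# m/z of an ion: neutral mass plus the (signed) mass of the adduct atoms,
# corrected by one electron mass per unit charge, divided by |charge|.
ion_mz <- function(neutral_mass, adduct_mass, charge) {
    (neutral_mass + adduct_mass - charge * electron_mass) / abs(charge)
}

glucose <- formula_mass("C6H12O6")
glucose                                   # about 180.0634
ion_mz(glucose, mono_mass[["Na"]], 1L)    # [M+Na]+, about 203.0526
ion_mz(glucose, -mono_mass[["H"]], -1L)   # [M-H]-,  about 179.0561
```

In the DB workflow outlined above, the neutral mass column would come from something like `formula_mass` and the mz column from the adduct definition applied to it.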

michaelwitting commented 4 years ago

We could use this .xml file with all elements and isotopes and parse it into a data.frame:

https://github.com/BlueObelisk/bodr/blob/master/bodr/isotopes/isotopes.xml

Also includes relative abundance which will become important for isotope pattern calculation.

michaelwitting commented 4 years ago

Working on a script to create an element table. A prototype should be ready soonish...

sgibb commented 4 years ago

Although the unimod package is not yet usable, it offers a data.frame fetched from unimod.org with some elements (not sure if that is enough):

```r
devtools::install_github("rformassspectrometry/unimod")
library("unimod")
elements
#     Name    FullName    AvgMass   MonoMass
# H      H    Hydrogen   1.007940   1.007825
# 2H    2H   Deuterium   2.014102   2.014102
# Li    Li     Lithium   6.941000   7.016003
# C      C      Carbon  12.010700  12.000000
# 13C  13C    Carbon13  13.003355  13.003355
# N      N    Nitrogen  14.006700  14.003074
# 15N  15N  Nitrogen15  15.000109  15.000109
# O      O      Oxygen  15.999400  15.994915
# 18O  18O    Oxygen18  17.999160  17.999160
# F      F    Fluorine  18.998403  18.998403
# Na    Na      Sodium  22.989770  22.989768
# P      P Phosphorous  30.973761  30.973762
# S      S      Sulfur  32.065000  31.972071
# Cl    Cl    Chlorine  35.453000  34.968853
# K      K   Potassium  39.098300  38.963707
# Ca    Ca     Calcium  40.078000  39.962591
# Fe    Fe        Iron  55.845000  55.934939
# Ni    Ni      Nickel  58.693400  57.935346
# Zn    Zn        Zinc  65.409000  63.929145
# Se    Se    Selenium  78.960000  79.916520
# Br    Br     Bromine  79.904000  78.918336
# Ag    Ag      Silver 107.868200 106.905092
# Hg    Hg     Mercury 200.590000 201.970617
# Au    Au        Gold 196.966550 196.966543
# I      I      Iodine 126.904470 126.904473
# Mo    Mo  Molybdenum  95.940000  97.905407
# Cu    Cu      Copper  63.546000  62.929599
# e      e    electron   0.000549   0.000549
# B      B       Boron  10.811000  11.009305
# As    As     Arsenic  74.921594  74.921594
# Cd    Cd     Cadmium 112.411000 113.903357
# Cr    Cr    Chromium  51.996100  51.940510
# Co    Co      Cobalt  58.933195  58.933198
# Mn    Mn   Manganese  54.938045  54.938047
# Mg    Mg   Magnesium  24.305000  23.985042
# Pd    Pd   Palladium 106.420000 105.903478
# Al    Al   Aluminium  26.981539  26.981539
```

jorainer commented 2 years ago

Pinging @RogerGinBer - interested in providing a PR for this?

RogerGinBer commented 2 years ago

Sure! I've written something very similar for my package RHermes (we use ionic formulas all the time), so it should be easy to adapt it :+1:

jorainer commented 2 years ago

Is there anything left to develop in this issue, @michaelwitting? @RogerGinBer has added the adductFormula function, and we also have a containsElements function that checks whether one formula is contained in another.
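For readers unfamiliar with the containment check mentioned here, the following base-R sketch shows one way it could be expressed. The names `count_elements` and `formula_contains` are hypothetical; this is an assumption about the general idea, not the code behind containsElements, and it only handles plain formulas without brackets, isotope labels or charges.

```r
# Hypothetical helpers for illustration only (not the MetaboCoreUtils API).

# Count the atoms of each element in a plain formula such as "C6H12O6".
count_elements <- function(formula) {
    tokens <- regmatches(formula, gregexpr("[A-Z][a-z]?[0-9]*", formula))[[1]]
    elements <- sub("[0-9]+$", "", tokens)
    counts <- sub("^[A-Za-z]+", "", tokens)
    counts[counts == ""] <- "1"          # a missing count means one atom
    counts <- as.integer(counts)
    vapply(unique(elements), function(el) sum(counts[elements == el]),
           integer(1))
}

# TRUE if every element of `query` occurs in `formula` at least as often.
formula_contains <- function(formula, query) {
    have <- count_elements(formula)
    need <- count_elements(query)
    all(names(need) %in% names(have)) && all(have[names(need)] >= need)
}

formula_contains("C6H12O6", "H2O")   # TRUE: a water loss is possible
formula_contains("C6H12O6", "NH3")   # FALSE: no nitrogen in the formula
```

A check like this is useful before subtracting an adduct or neutral loss, since it prevents negative element counts in the resulting ion formula.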