rformassspectrometry / CompoundDb

Creating and using (chemical) compound databases
https://rformassspectrometry.github.io/CompoundDb/index.html

Modified lipidblast importer #9

Closed jorainer closed 7 years ago

jorainer commented 7 years ago

I've modified the LipidBlast importer to also extract the formula and the mass from the JSON. This means rcdk is no longer required.
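
For illustration only, here is a minimal base R sketch of that idea: pulling the formula and mass out of an already-parsed record instead of computing them from the structure. The record layout and the field names (`compound`, `metaData`, `"molecular formula"`, `"total exact mass"`) are assumptions about a MoNA-style export, and `get_meta()` and `extract_compound()` are hypothetical helpers, not the actual `.import_lipidblast` code. The nested list mimics what a JSON parser such as `jsonlite::fromJSON(..., simplifyVector = FALSE)` might return.

```r
## Hypothetical record, shaped like one parsed entry of a MoNA-style JSON
## export. All field names and values here are placeholders.
record <- list(
    id = "LipidBlast000001",
    compound = list(list(
        inchi = "InChI=1S/CH4/h1H4",
        metaData = list(
            list(name = "molecular formula", value = "CH4"),
            list(name = "total exact mass", value = 16.0313)
        )
    ))
)

## Return the value of the first metadata entry named `field`, or NA if
## no such entry exists.
get_meta <- function(meta, field) {
    for (entry in meta) {
        if (identical(entry$name, field))
            return(entry$value)
    }
    NA
}

## Flatten one record into a one-row data.frame.
extract_compound <- function(x) {
    cmp <- x$compound[[1]]
    data.frame(
        id = x$id,
        inchi = cmp$inchi,
        formula = as.character(get_meta(cmp$metaData, "molecular formula")),
        mass = as.numeric(get_meta(cmp$metaData, "total exact mass")),
        stringsAsFactors = FALSE
    )
}

res <- extract_compound(record)
```

Because the values are read directly from the metadata, no cheminformatics toolkit is needed for this step.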

jorainer commented 7 years ago

Code is here: https://github.com/EuracBiomedicalResearch/CompoundDb/blob/master/R/createCompDbPackage.R#L182:L214

Compared to the PeakABro implementation, we get a considerable speedup (about 52 s instead of 648 s elapsed, more than 12-fold) with identical results:

js <- "~/Projects/git/EuracBiomedicalResearch/CompoundDb/local_data/MoNA-export-LipidBlast.json"

system.time(
    cmps_1 <- PeakABro::generate_lipidblast_tbl(js)
)
##    user  system elapsed 
## 653.875  11.701 648.195 

system.time(
    cmps_2 <- CompoundDb:::.import_lipidblast(js)
)
##    user  system elapsed 
##  49.037   2.500  51.569 

> all.equal(cmps_1$id, cmps_2$id)         # OK
[1] TRUE
> all.equal(cmps_1$inchi, cmps_2$inchi)   # OK
[1] TRUE
> all.equal(cmps_1$formula, cmps_2$formula) # OK
[1] TRUE
> all.equal(cmps_1$mass, cmps_2$mass)       # OK
[1] TRUE
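
The column-by-column checks above could also be run in one go. The following is a small sketch with a hypothetical helper, `compare_columns()`, applied to toy data frames rather than the real LipidBlast tables; it relies only on base R's `all.equal()`, which compares numeric values with a small tolerance.

```r
## Hypothetical helper: compare the shared columns of two data frames and
## return a named logical vector (TRUE where all.equal() finds no difference).
compare_columns <- function(a, b, cols = intersect(names(a), names(b))) {
    vapply(cols, function(col) isTRUE(all.equal(a[[col]], b[[col]])),
           logical(1))
}

## Toy stand-ins for the two import results.
old <- data.frame(id = c("a", "b"), mass = c(100.5, 200.25),
                  stringsAsFactors = FALSE)
new <- data.frame(id = c("a", "b"), mass = c(100.5, 200.25 + 1e-12),
                  stringsAsFactors = FALSE)

chk <- compare_columns(old, new)
```

Wrapping the result in `stopifnot(all(chk))` would turn the comparison into a simple regression check when the importer changes.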

stanstrup commented 7 years ago

Hmpf. I really wonder how I missed that the info was there.

jorainer commented 7 years ago

Wasn't really that obvious. The info in the json is a little tricky to untangle.

jorainer commented 7 years ago

The good thing is that we remove the dependency on rcdk (for now). It was a little tricky to get that to work on recent macOS.