ropensci / webchem

Chemical Information from the Web
https://docs.ropensci.org/webchem

parse_mol() no longer used in package and doesn't work with standard molfiles #294

Closed Aariq closed 2 years ago

Aariq commented 4 years ago

When I was messing around with CHEMBL (#109), I realized that the parse_mol() function in utils.R isn't actually used internally anymore. It also doesn't seem to work with standard mol files.

library(webchem)
library(httr)
u <- "https://www.ebi.ac.uk/chembl/api/utils/smiles2ctab"
res <- POST(u,
            body = "CC(O)=O",
            httr::user_agent(webchem:::webchem_url())
)
out <- rawToChar(res$content)
parse_mol(out)
#> Error in names(bb) <- c("1", "2", "t", "s", "x", "r", "c"): 'names' attribute [7] must be the same length as the vector [4]

Created on 2020-09-25 by the reprex package (v0.3.0)

More info on file structure here: https://en.wikipedia.org/wiki/Chemical_table_file
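
For what it's worth, the error above looks like what you'd get if the bond lines carry only four fields while the code expects seven. In the V2000 layout the bond line is fixed-width (`111222tttsssxxxrrrccc`, three characters per field), so one possible approach is to read by column position and let missing trailing fields come out as NA. Below is a minimal sketch of that idea, not the fix that went into the package: `parse_bond_block()` and its column names are hypothetical, and the molfile is a hand-written acetic acid example rather than actual ChEMBL output.

```r
# Hypothetical helper (not part of webchem): read the bond block of a V2000
# molfile by fixed-width columns instead of splitting on whitespace.
parse_bond_block <- function(mol) {
  lines <- strsplit(mol, "\r?\n")[[1]]
  # Line 4 is the counts line: atom count in columns 1-3, bond count in 4-6.
  n_atoms <- as.integer(trimws(substr(lines[4], 1, 3)))
  n_bonds <- as.integer(trimws(substr(lines[4], 4, 6)))
  # The bond block follows the atom block, which starts on line 5.
  bond_lines <- lines[seq(from = 4 + n_atoms + 1, length.out = n_bonds)]
  # Seven 3-character fields; substr() returns "" past the end of a short
  # line, and as.integer("") is NA, so omitted trailing fields become NA.
  starts <- seq(1, 19, by = 3)
  fields <- lapply(starts, function(s) {
    as.integer(trimws(substr(bond_lines, s, s + 2)))
  })
  names(fields) <- c("atom1", "atom2", "type", "stereo", "x", "r", "c")
  as.data.frame(fields)
}

# Hand-written molfile for CC(O)=O with four-field bond lines (assumed shape).
mol <- paste(
  "",
  "  hand-written example",
  "",
  "  4  3  0  0  0  0  0  0  0  0999 V2000",
  "    0.0000    0.0000    0.0000 C   0  0",
  "    1.3000    0.7500    0.0000 C   0  0",
  "    2.6000    0.0000    0.0000 O   0  0",
  "    1.3000    2.2500    0.0000 O   0  0",
  "  1  2  1  0",
  "  2  3  1  0",
  "  2  4  2  0",
  "M  END",
  sep = "\n"
)

bonds <- parse_bond_block(mol)
print(bonds)
```

Reading by position means the same code handles both the short and the full seven-field bond lines, at the cost of assuming the writer respects the fixed-width columns.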

stitam commented 4 years ago

Thanks @Aariq. parse_mol() is referenced in chemspider.R and it seems to work with the output of cs_convert(). Well, sometimes. I know inchi->mol and inchikey->mol can produce different outputs; sometimes these can be parsed, sometimes they can't. It seems that when the string starts with ACD/Labs, parse_mol() can handle it, but when it starts with OpenBabel, it can't. I haven't investigated this further. Could it be a regex issue?

Aariq commented 2 years ago

This was fixed in PR #320, right?

stitam commented 2 years ago

Thanks @Aariq, I think this issue is now fixed.