Closed nbehrnd closed 3 months ago
Thanks for your interest in volcalc! First off, I don't have control over the builds at r-universe, and I believe it only builds binaries for the release version of R (which is 4.4, I think). You may have to install from GitHub directly. ChemmineOB, etc. are not on CRAN but on Bioconductor, which I think is why the dependencies aren't found. I'd recommend the pak package for package installation, as it takes care of system dependencies on Linux automatically and can install Bioconductor packages. So, you'd do pak::pak("Meredith-Lab/volcalc") after installing pak. If you want to try the r-universe version first, you'd use
options(repos = c("https://cct-datascience.r-universe.dev", getOption("repos")))
pak::pak("volcalc")
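If it helps, the two install routes above could be wrapped in a small helper that picks one based on the running R version. This is only a sketch: `choose_source()` and `install_volcalc()` are hypothetical names, and the 4.4 cutoff simply mirrors my guess that r-universe builds binaries for the release version of R; it is not a documented rule.

```r
# Hypothetical helper: pick an install route from the running R version.
# The "4.4" default is an assumption taken from the guess above.
choose_source <- function(r_version = getRversion(), release = "4.4") {
  if (as.numeric_version(r_version) >= as.numeric_version(release)) {
    "r-universe"
  } else {
    "github"
  }
}

# Hypothetical wrapper around the two pak calls shown above.
# It is only defined here, not run, because it needs network access.
install_volcalc <- function(from = choose_source()) {
  if (!requireNamespace("pak", quietly = TRUE)) install.packages("pak")
  if (from == "r-universe") {
    options(repos = c("https://cct-datascience.r-universe.dev", getOption("repos")))
    pak::pak("volcalc")
  } else {
    pak::pak("Meredith-Lab/volcalc")
  }
}
```

Calling `install_volcalc()` would then try r-universe on a current R and fall back to the GitHub route on an older one, such as the 4.1.2 mentioned in this issue.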
Addition: besides wanting to use volcalc per se to estimate volatilities, I equally want to retrieve the molecules (preferably as SMILES strings, though .sdf or .mol files would be fine, too)
If you have KEGG IDs and you'd like to get .mol files, you can use volcalc::get_mol_kegg(). If you'd like to get SMILES, you can translate those .mol files into SMILES strings with ChemmineR. Here's an example:
library(volcalc) # just for get_mol_kegg()
library(ChemmineR)
library(purrr)

# get some example file paths to .mol files
mols <- get_mol_kegg(pathway_ids = "map00253", dir = tempdir())
paths <- mols$mol_path

# read each .mol file, convert it to SMILES, and collect one data frame
smiles_df <-
  paths |>
  map(ChemmineR::read.SDFset) |>
  map(ChemmineR::sdf2smiles) |>
  map(function(.x) data.frame(name = names(as.character(.x)), smiles = as.character(.x))) |>
  list_rbind()

head(smiles_df)
#> name
#> Malonyl-CoA Malonyl-CoA
#> Malonyl-[acyl-carrier protein] Malonyl-[acyl-carrier protein]
#> Anhydrotetracycline Anhydrotetracycline
#> 5a,11a-Dehydrotetracycline 5a,11a-Dehydrotetracycline
#> Tetracycline Tetracycline
#> Chlortetracycline Chlortetracycline
#> smiles
#> Malonyl-CoA C(C(=O)SCCNC(=O)CCNC(=O)[C@@H](C(COP(=O)(O)OP(=O)(OC[C@@H]1[C@H]([C@H]([C@@H](O1)n1c2c(nc1)c(ncn2)N)O)OP(=O)(O)O)O)(C)C)O)C(=O)O
#> Malonyl-[acyl-carrier protein] C(=O)(CC(=O)O)S*
#> Anhydrotetracycline [C@H]12[C@](C(=O)c3c(C1)c(c1c(c3O)c(ccc1)O)C)(C(=O)C(=C([C@H]2N(C)C)O)C(=O)N)O
#> 5a,11a-Dehydrotetracycline [C@H]12[C@](C(=O)C3=C(C1)[C@](c1c(C3=O)c(ccc1)O)(C)O)(C(=O)C(=C([C@H]2N(C)C)O)C(=O)N)O
#> Tetracycline [C@H]12[C@](C(=C3[C@H](C1)[C@](c1c(C3=O)c(ccc1)O)(C)O)O)(C(=O)C(=C([C@H]2N(C)C)O)C(=O)N)O
#> Chlortetracycline c1cc(c2c(c1Cl)[C@]([C@@H]1C(=C([C@]3([C@@H](C1)[C@@H](C(=C(C3=O)C(=O)N)O)N(C)C)O)O)C2=O)(O)C)O
write.csv(smiles_df, "smiles.csv")
Created on 2024-07-30 with reprex v2.1.0
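If you are starting from a table of KEGG compound IDs rather than a pathway, something along these lines might work. It is a sketch under assumptions: the column name `kegg_id` and the helpers `clean_kegg_ids()` and `fetch_mols()` are made up for illustration, the small data frame only stands in for your SI file, and I am assuming `get_mol_kegg()` accepts the IDs through a `compound_ids` argument (worth checking against `?get_mol_kegg`).

```r
# Keep only well-formed KEGG compound IDs ("C" followed by five digits),
# trimmed of whitespace and de-duplicated.
clean_kegg_ids <- function(x) {
  x <- trimws(as.character(x))
  unique(x[!is.na(x) & grepl("^C[0-9]{5}$", x)])
}

# Stand-in for the SI table; the column name `kegg_id` is hypothetical.
si <- data.frame(kegg_id = c("C00083", " C00083", "C06570", "n/a", NA))
ids <- clean_kegg_ids(si$kegg_id)

# Hypothetical wrapper; needs volcalc and network access, so it is only
# defined here. `compound_ids` is assumed to be the argument name.
fetch_mols <- function(ids, dir = tempdir()) {
  volcalc::get_mol_kegg(compound_ids = ids, dir = dir)
}
```

The result of `fetch_mols(ids)` should then have a `mol_path` column that can go through the same ChemmineR pipeline as above.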
For a couple of KEGG IDs, I replicated the approach successfully. Thanks.
In an instance of Xubuntu Linux 22.04 LTS (Jammy), which by default provides R 4.1.2 (package tracker), the installation of volcalc fails (cf. the CLI log attached in the .zip archive below). Contrary to my expectation, the installation does not seem to resolve its dependencies, such as ChemmineR, automatically.
New to the R ecosystem, I understand tags like r-4.3-any, r-4.4, or r-4.5 in your description here to mean that your work targets more recent versions of R, or even ones one could consider «testing». These then possibly bar earlier versions of R from being used. Could you relax the restriction and provide a version suitable for an Ubuntu release that is a little bit old, but not that old (in comparison to ?buntu's publication of point release 24.04.1/Noble Numbat, scheduled for August 15th)?
2024-07-30_volcalc_xubuntu2204.txt.zip
Addition: besides wanting to use volcalc per se to estimate volatilities, I equally want to retrieve the molecules (preferably as SMILES strings, though .sdf or .mol files would be fine, too) of Automating methods for estimating metabolite volatility, SI file Data Sheet1.csv, which contains no machine-readable structure representation.