Meredith-Lab / volcalc

volcalc: Calculate Volatility of Chemical Compounds
https://meredith-lab.github.io/volcalc/

Rings are over-counted #57

Closed by Aariq 1 year ago

Aariq commented 1 year ago

The ChemmineR::rings() function counts all possible rings by default. For caffeine, for example, it counts 3 rings: the 5-membered ring, the 6-membered ring, and the 9-membered ring you'd get by tracing an outline around both. volcalc should use the inner = TRUE option to count only the smallest possible rings. It currently uses the default inner = FALSE in get_fx_groups(), so calc_vol() overestimates the number of rings for any compound with joined (fused) rings.

library(ChemmineR)
#caffeine SMILES
caf <- ChemmineR::smiles2sdf("CN1C=NC2=C1C(=O)N(C(=O)N2C)C")
rings(caf)
#> $ring1
#> [1] "C_5" "N_4" "C_3" "N_2" "C_6"
#> 
#> $ring2
#> [1] "C_7"  "N_9"  "C_10" "N_12" "C_5"  "C_6" 
#> 
#> $ring3
#> [1] "C_7"  "N_9"  "C_10" "N_12" "C_5"  "N_4"  "C_3"  "N_2"  "C_6"
rings(caf, inner = TRUE)
#> $ring1
#> [1] "C_5" "N_4" "C_3" "N_2" "C_6"
#> 
#> $ring2
#> [1] "C_7"  "N_9"  "C_10" "N_12" "C_5"  "C_6"

Created on 2023-07-27 with reprex v2.0.2
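
A minimal sketch of what the proposed change amounts to, assuming the ring count is taken as the length of the list returned by rings(). The helper name count_inner_rings() is hypothetical and is not a function in volcalc or ChemmineR; how get_fx_groups() actually derives its count may differ.

```r
library(ChemmineR)

# Hypothetical helper (not part of volcalc or ChemmineR):
# count rings using only the smallest possible rings.
count_inner_rings <- function(sdf) {
  # inner = TRUE asks rings() for the smallest rings only,
  # instead of every cycle that can be traced through the molecule
  length(ChemmineR::rings(sdf, inner = TRUE))
}

# caffeine SMILES, as in the reprex above
caf <- ChemmineR::smiles2sdf("CN1C=NC2=C1C(=O)N(C(=O)N2C)C")
count_inner_rings(caf)
```

For caffeine this should agree with the two rings listed in the inner = TRUE output above, rather than the three reported by default.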

Aariq commented 1 year ago

This could make a really big difference for RVI with compounds like triterpenoids that have multiple rings. Solanine, for example, has 9 rings with inner = TRUE and 24 with inner = FALSE.

KristinaRiemer commented 1 year ago

Oh wow, good catch! The ring count always seemed like a mess; hopefully this improves it.

Aariq commented 1 year ago

More info: https://depth-first.com/articles/2020/08/31/a-smallest-set-of-smallest-rings/
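
As a sanity check on what a smallest-set count should be: for a single connected molecule, the number of rings in a smallest set of smallest rings equals bonds minus atoms plus one. The base R sketch below applies that to caffeine's heavy atoms. The bond list is entered by hand from the SMILES string, numbering atoms in SMILES order, so treat it as an illustrative assumption rather than ChemmineR output.

```r
# Heavy-atom bonds of caffeine (CN1C=NC2=C1C(=O)N(C(=O)N2C)C),
# atoms numbered in SMILES order; hand-entered for illustration.
bonds <- matrix(c(
  1, 2,   2, 3,   3, 4,   4, 5,   5, 6,   6, 2,
  6, 7,   7, 8,   7, 9,   9, 10,  10, 11, 10, 12,
  12, 5,  12, 13, 9, 14
), ncol = 2, byrow = TRUE)

n_atoms <- length(unique(as.vector(bonds)))  # 14 heavy atoms
n_bonds <- nrow(bonds)                       # 15 bonds between them

# Ring count for one connected molecule: bonds - atoms + 1
n_rings <- n_bonds - n_atoms + 1
n_rings
```

This gives 2 for caffeine, matching the inner = TRUE result and not the default count of 3.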