Closed Boris-Droz closed 4 years ago
@Boris-Droz Instead of looking into the connection life cycle, I have a generic, creative solution for such problems: use callr to do the conversion in chunks, each in a separate R process:
library("Rcpi")
library("callr")

# Create 2000 copies of a sample SDF file to reproduce the problem
dir.create("test")
for (i in 1:2000) file.copy(system.file("compseq/DB00530.sdf", package = "Rcpi"), paste0("test/", i, ".sdf"))
# Escape the dot so that only names ending in ".sdf" match
fns <- list.files("test/", pattern = "\\.sdf$", full.names = TRUE)

# Convert one chunk of files in a fresh R process; any file handles still
# open after the conversions are released when that process exits
convert <- function (fns, idx) {
  callr::r(function (fns, idx) {
    smiles <- c()
    for (i in idx) {
      Rcpi::convMolFormat(infile = fns[i], outfile = "temp.smi", from = "sdf", to = "smiles")
      smiles <- c(smiles, Rcpi::readMolFromSmi(smifile = "temp.smi", type = "text")[1])
    }
    smiles
  }, args = list(fns, idx))
}

# Process the files in chunks of 400 and collect the results by index
k <- length(fns)
chunks <- split(seq_len(k), ceiling(seq_len(k) / 400))
smi <- rep(NA, k)
for (i in seq_along(chunks)) smi[chunks[[i]]] <- convert(fns, chunks[[i]])
smi
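To see how the chunking step partitions the indices, here is a small sketch with 10 items and a chunk size of 4 in place of 2000 and 400. The numbers and the helper name make_chunks are made up for illustration only:

```r
# Hypothetical helper: split indices 1..k into consecutive chunks of at most `size`
make_chunks <- function(k, size) {
  idx <- seq_len(k)
  split(idx, ceiling(idx / size))
}

chunks <- make_chunks(10, 4)
print(chunks)

# Number of indices in each chunk; each chunk would go to one separate R process
print(lengths(chunks))
```

A smaller chunk size means more R processes get started but fewer conversions run in each one, so the size only needs to stay below the point where the error appears.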
Nice, thank you very much for this prompt answer. Problem resolved.
Hello, I have used the convMolFormat function many times with great success. Thank you again for this useful package. Right now, I am using it on a batch of inputs (10000 mol files) coming from a commercial in-silico prediction tool from Bruker. I want to generate a table of SMILES strings so that I can match them against other data for further comparison. However, at some point (after 520 loops), I get the message "Too many open files". So I tried the common advice given on some forums, which is to call closeAllConnections(). It seems that this is not where the problem comes from: I checked with showConnections(all=TRUE), and only 0, 1, and 2, which are the standard connections, are open.
I would really appreciate any ideas for debugging this.
Below is some dummy code to reproduce the problem, if necessary.
Thank you very much
Boris
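A note on the diagnosis in the question: showConnections() only lists connections opened through R's own connection API, so file handles opened by compiled code would not show up there. Whether that is what happens inside convMolFormat is an assumption, not something confirmed in this thread. A minimal sketch for counting the open file descriptors of the current R process, assuming a Linux system with /proc/self/fd (count_open_fds is a hypothetical helper and returns NA elsewhere):

```r
# Hypothetical helper: count the file descriptors held by this R process.
# Assumes Linux, where /proc/self/fd has one entry per open descriptor.
count_open_fds <- function() {
  fd_dir <- "/proc/self/fd"
  if (!dir.exists(fd_dir)) return(NA_integer_)
  length(list.files(fd_dir))
}

before <- count_open_fds()
con <- file(tempfile(), open = "w")  # deliberately hold one file open
during <- count_open_fds()
close(con)
after <- count_open_fds()

cat("before:", before, "during:", during, "after:", after, "\n")
```

Calling such a helper before and after a batch of conversions would show whether the count keeps growing. The per-process limit itself can be checked with `ulimit -n` in a shell.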