Closed atheriel closed 3 years ago
This is hard to debug indeed. Somebody else reported that there may be an issue with connection pooling. Are you connecting to several collections in your workload?
Yes, we are connecting to quite a lot of them (about 360 at the moment), one after the other. The basic pattern is
m <- mongo(...)
m$insert(...)
rm(m)
I hit this issue as well on R 3.6.1. R crashed consistently when the connection was created outside the loop and the loop reached a large insert.
Creating the mongo connection inside the loop and calling rm(m) afterwards works around the issue.
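A minimal sketch of that workaround, for illustration only. The function name insert_each and its connect parameter are hypothetical; connect defaults to mongolite::mongo and is a parameter only so the loop can be exercised without a running server. The gc() call is an assumption on top of what the thread describes, which mentions only rm(m).

```r
# Sketch: open a fresh connection per collection and drop it after the insert.
# insert_each() and its `connect` argument are illustrative names, not mongolite API.
insert_each <- function(files, url = "mongodb://localhost",
                        connect = mongolite::mongo) {
  done <- character(0)
  for (path in files) {
    name <- tools::file_path_sans_ext(basename(path))
    data <- readRDS(path)                         # slow read happens before connecting
    con <- connect(collection = name, url = url)  # connection created inside the loop
    con$insert(data)
    rm(con)                                       # drop the reference, as in the thread
    gc()                                          # assumption: lets finalizers run promptly
    done <- c(done, name)
  }
  invisible(done)
}
```

Called as insert_each(list.files("~/rdsfolder", pattern = "\\.rds$", full.names = TRUE)), this would write each file to a collection named after the file, never holding a connection open across a long readRDS call.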
@duncanhealy Is there any chance you could put together a reproducible test case?
I can put in a pull request with a test added, something like this? It assumes around 900 .rds files under a folder, with some of roughly 500 MB at the end of the list.
# Point R at a CRAN mirror before installing anything
r <- getOption("repos")
r["CRAN"] <- "https://cran.us.r-project.org"
options(repos = r)
install.packages('needs')
library('needs')
needs('mongolite')
library(tools)  # provides file_path_sans_ext()
library('mongolite')
setwd("./")
# Connection created once, outside the loop
m <- mongo(url = "mongodb://localhost/?ssl=false", options = ssl_options(weak_cert_validation = TRUE))
# pattern is a regular expression, not a glob
files <- list.files(path = "~/rdsfolder", pattern = "\\.rds$", full.names = TRUE, recursive = TRUE)
lapply(files, function(x) {
  name <- file_path_sans_ext(basename(x))
  print(name)
  rdsdata <- readRDS(file = x)  # slow for the ~500MB files
  mtest <- mongo(name)          # one collection per file
  mtest$count()
  mtest$remove('{}')
  mtest$insert(rdsdata)
  mtest$count()
})
I think the connection is dropped while readRDS runs on a large file, so it looks to me like a timeout issue in the connection pooling. I ruled out the insert as the cause of the crash: replacing the inserted data with a small default data frame still crashes.
This problem should be fixed in mongolite 2.3.0. If not, please open a new issue.
Fantastic!
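To check whether an installed mongolite already includes the 2.3.0 fix mentioned above, a comparison like the following may help. The helper name fixed_in is hypothetical; package_version and packageVersion are base R.

```r
# Sketch: compare a version string against the release reported to contain the fix.
# fixed_in() is an illustrative helper, not part of mongolite.
fixed_in <- function(installed, fixed = "2.3.0") {
  package_version(installed) >= package_version(fixed)
}

# Only query the installed version if mongolite is actually available.
if (requireNamespace("mongolite", quietly = TRUE)) {
  fixed_in(as.character(utils::packageVersion("mongolite")))
}
```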
We have a write-heavy workload that produces highly non-deterministic errors at the C level. For instance, this morning:
And another from the same process, about a week ago:
I have been unable to track down the source of this issue, but my suspicion at this point is that GC is running at an inopportune time at the C level, probably in the insert-related code. However, rchk does not report any possible PROTECT() issues, so that suspicion may be incorrect. (For reference, this is mongolite 2.1.0 running on R 3.5.2 on Ubuntu.)