asieira opened this issue 10 years ago
BTW, I saw references in the documentation to a function called curlGlobalCleanup, but it simply is not found by R when I try to execute it.

Also, it is important to mention that I am running 64-bit R 3.0.3 on Mac OS X 10.9.2, and sessionInfo tells me I am running RCurl_1.95-4.1.
The handle and multi handle objects seem to be missing finalizers (as per http://cran.r-project.org/doc/manuals/R-exts.html#External-pointers-and-weak-references) that call the libcurl cleanup functions to release resources.
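To illustrate the pattern described in that manual section, here is a sketch of how a finalizer can be attached to an external pointer in R. The `ref` slot and the cleanup behavior are assumptions about RCurl's internals; the message stands in for the C-level `curl_easy_cleanup()` call the package would actually need to make:

```r
library(RCurl)

# Create an easy handle; assuming its underlying external pointer is
# exposed in the object's "ref" slot (an assumption for illustration).
h <- getCurlHandle()

# reg.finalizer() is the base-R mechanism from Writing R Extensions:
# the function runs when the pointer is garbage collected. A real fix
# would invoke curl_easy_cleanup() at the C level here, closing the
# connection and releasing libcurl resources.
reg.finalizer(h@ref, function(ptr) {
  message("curl handle collected; C-level cleanup would run here")
}, onexit = TRUE)
```

This is only the registration pattern; the actual cleanup has to happen in compiled code inside the package.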
Found a workaround: if I use forbid.reuse=TRUE in the call to getURLAsynchronous, the problem does not happen. This confirms, however, that proper cleanup of the handles (which would cause the connections to be closed) would solve it in the general case and give optimal performance.
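For reference, the workaround amounts to passing the option through `.opts`, which maps to libcurl's CURLOPT_FORBID_REUSE and makes each connection close as soon as its transfer completes (the URL vector here is a placeholder):

```r
library(RCurl)

batch <- c("http://example.com/a", "http://example.com/b")  # placeholder URLs

# forbid.reuse = TRUE tells libcurl not to keep connections alive for
# reuse, so sockets are closed when each transfer finishes. This trades
# away keep-alive performance in exchange for not leaking connections.
responses <- getURLAsynchronous(batch, .opts = list(forbid.reuse = TRUE))
```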
I am running a large number (32k) of GET requests using getURLAsynchronous, 120 URLs at a time. This is what the call looks like within the loop:
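The original snippet was not preserved in this report; a minimal reconstruction of such a loop, under the details given here (batches of 120, gc() after each step) and with a hypothetical URL vector, might look like:

```r
library(RCurl)

urls <- sprintf("http://example.com/item/%d", 1:32000)  # hypothetical URLs
batch.size <- 120
responses <- character(0)

for (i in seq(1, length(urls), by = batch.size)) {
  batch <- urls[i:min(i + batch.size - 1, length(urls))]
  responses <- c(responses, getURLAsynchronous(batch))
  gc()  # explicit collection after each batch, hoping handles get finalized
}
```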
After this batch is executed, showConnections shows me no entries other than stdin, stdout and stderr. Still, when I try to use mclapply to process all the combined responses in parallel, I get the following error:
If I run the same mclapply without calling getURLAsynchronous first, with test data of the same size, no such error happens.
I tried searching the RCurl documentation for any steps I could include in my code to close connections and/or release resources, and was unable to find any. So I'm assuming that would happen when the handles are garbage collected, hence the explicit call to gc() after each step.
Am I missing something here?