Closed tarakc02 closed 7 years ago
Might just be a coincidence, but the code snippet above consistently gives the infinite recursion error if I restart the R session before running it. Otherwise, it's hard to get. Also, I can't get the error if I use period = .01
or period = .1
instead of period = .05
.
Hmmn.
Still trying to get to the bottom of things, but an observation that is helping:
microbenchmark::microbenchmark(
Sys.sleep(.01)
)
When run 100 times, around 8-15 of the runs take less time than .01 seconds. The too-short runs take at most 9998925 nanoseconds (0.009998925 seconds), so are only off by 1.075E-6 seconds. I don't believe my machine knows how to sleep for such a short time:
microbenchmark(Sys.sleep(1.075e-06))
Hits 0 nanoseconds about half the time. And so, again with my earlier hypothesis: the wait
function is trying to "sleep" for 1E-6 seconds, instead sleeps for 0 seconds, and then calls request
again, which results again in wait
trying to sleep for < 1E6 seconds, and so on. Will try to put together a minimal example.
One idea I had was to change the sleep line in wait
from Sys.sleep(exception$wait_time)
to, say, Sys.sleep(exception$wait_time * 1.05)
. This has the effect of slowing everything down, even when there was no problem, and doesn't necessarily fix the issue (it's hard to know what kind of precision to expect from Sys.sleep
, and if the program can get itself into that position again, 1.05 * 1E-6 is still too small.
By doing some testing on multiple machines, perhaps I can get the smallest length of time that Sys.sleep
can reliably sleep for and add that to every wait time.
Woo! Here is a minimally reproducible example. On my Windows machine, in any case, it results in an infinite recursion error every time:
f <- function(t = proc.time()[["elapsed"]]) {
force(t)
Sys.sleep(.0001)
if (proc.time()[["elapsed"]] - t < .0001)
return(f(proc.time()[["elapsed"]]))
TRUE
}
So what to do? The goal is to make sure the system sleeps for a long enough time the first time around, to avoid these recursive calls (I'd prefer not to have wait
return TRUE
, as it is clear that Sys.sleep(x)
can take < x seconds to run). Perhaps adding in a "buffer" (which was in earlier versions of the pakage!) to the wait time, s.t. I can guarantee (err, that word feels too ambitious) that:
system.time(Sys.sleep(x + buffer)) > x
This would, I believe, have the added benefit of fixing #13.
The smallest acceptable amount for buffer
will be system dependent, though (I think?). I guess it's probably best to be conservative. Original version of the package had buffer = 1/precision
.
Alternatively, just code in a hard minimum. So instead of Sys.sleep(exception$wait_time)
, perhaps using:
if (exception$wait_time <= MIN_SLEEP)
Sys.sleep(MIN_SLEEP)
else Sys.sleep(exception$wait_time)
For example:
Doesn't always result in an error, but occasionally does.
I'm pretty sure this happens because the
wait()
function sleeps for the amount of time that was calculated as necessary, but then does not return TRUE, instead it re-requests a token. The reason for this design is to ensure that the rate limit is not accidentally overrun (becauseSys.sleep
is not super duper precise), but I guess (just a working hypothesis right now, it's hard to get the behavior to replicate consistently) it can get trapped into callingrequest
over and over until the call stack is full.