The above does not reproduce for me. I get "1" to "100" perfectly each time. I run R 4.2.3 on Ubuntu 22.04 and I've tested multiple times using both RStudio and an interactive R prompt. I even tested on a low-powered Windows 10 netbook with an Intel Atom processor and it does not produce the above error. In fact I have never seen it before. Is it possible for you to come up with a more minimal example to help narrow this down? Thanks!
You're right, I cannot reproduce this on my MacBook. I think it could be something strange about my Ubuntu machine. Here is a slightly smaller version of the same reprex.
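library(mirai)
library(nanonext)
# Listen for a single server at this URL.
daemons(n = 1L, url = "ws://127.0.0.1:5000")
launches <- 0L
pids <- integer(0L)
while (length(pids) < 100L) {
  # Relaunch the server whenever it is not running.
  # With maxtasks = 1L, each server exits after a single task.
  if (!exists("px") || !px$is_alive()) {
    px <- callr::r_bg(\() mirai::server("ws://127.0.0.1:5000", maxtasks = 1L))
    launches <- launches + 1L
  }
  # Once the current task has resolved, record its result and submit a new one.
  if (!exists("m") || !.unresolved(m)) {
    if (exists("m")) pids <- c(pids, m$data)
    m <- mirai(ps::ps_pid())
  }
}
print(launches)
print(pids)
# Reset the daemons settings.
daemons(n = 0L)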
Yes the above works for me as well, all unique PIDs. Just to eliminate one simple possibility - do you still get the odd behaviour on your Ubuntu setup if you use unresolved() instead of .unresolved()? This could be one of those corner cases where it doesn't result in the desired behaviour (and why it isn't the main unresolved checker).
Thanks for the suggestion. I tried unresolved() instead of .unresolved(), and I still saw multiple instances of "Error in envir[[\".expr\"]]: subscript out of bounds" on my Ubuntu machine.
OK, worth a try. nanonext 0.8.1 is now on CRAN, and there are assorted improvements in mirai 0.8.1.9004. Nothing that would address the above though.
Given I can't reproduce, I don't really want to handle this situation specifically. At the minimum we should know what provokes it.
However, in terms of behaviour - instead of sending back the error, would it be better if the server exits instead? In that case, the task will get re-sent to another server. You would get more launches, but all your results.
Hmmm I would prefer the existing error. I worry about masking the core problem as a silent efficiency issue. I wonder if another diagnostic might help along with the main error.
My personal Ubuntu machine might have something strange going on with it, and it seems unlikely for this to appear on other machines.
Also, if another user catches it in a different scenario, that's valuable info that would help us track it down. With a silent relaunch, we might miss the chance.
When you get the chance, could you please try f8eb9a9 (v0.8.1.9005) on your Ubuntu machine? Thanks.
Completely fixed on my Ubuntu machine! 100 launches and 100 unique PIDs from https://github.com/shikokuchuo/mirai/issues/43#issuecomment-1485043330. Thank you so much.
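For anyone re-running the reprex, the "100 launches and 100 unique PIDs" check can be written down as a small helper. This is only a sketch: `check_reprex()` is a hypothetical name, not part of mirai or nanonext, and it assumes a failed task shows up in `pids` as a string beginning with "Error", as in the message quoted in this thread.

```r
# Hypothetical helper (not part of mirai or nanonext): summarise a reprex run.
# Assumes failed tasks appear in `pids` as strings starting with "Error".
check_reprex <- function(launches, pids) {
  pids <- as.character(pids)
  is_error <- grepl("^Error", pids)
  list(
    launches = launches,
    n_results = length(pids),
    n_errors = sum(is_error),
    # TRUE only if no result is an error and no PID appears twice.
    all_unique = !any(is_error) && anyDuplicated(pids) == 0L
  )
}

# Made-up values purely to show the call; these are not results from this issue.
str(check_reprex(3L, c(101L, 102L, 103L)))
```

After running the reprex itself, `check_reprex(launches, pids)` should report `n_errors` of 0 and `all_unique` of `TRUE` when every server handled exactly one task.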
Awesome! My mistake - reasoning about the code sequentially. The truth is the NNG code at the C level is highly asynchronous, so things can complete out of order. Just needed a little extra synchronisation before continuing with the R code in the form of call_aio()!
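To illustrate the general idea of waiting on an asynchronous operation before continuing, here is a minimal user-level sketch. It is not the change made in f8eb9a9, which is internal to the server code. It assumes the mirai package is installed and uses `call_mirai()`, the mirai-level function that waits for a mirai to resolve; with no daemons set, `mirai()` runs the task in a fresh background process.

```r
library(mirai)

# Dispatch a task. mirai() returns immediately while the task runs elsewhere.
m <- mirai({
  Sys.sleep(0.5)
  Sys.getpid()
})

# Immediately after dispatch, the result is typically still pending.
pending <- unresolved(m)

# Block until the asynchronous operation has completed before continuing.
call_mirai(m)

# From this point on the result is available.
resolved <- !unresolved(m)
pid <- m$data
```

Without the `call_mirai()` step, any code that reads `m$data` straight after dispatch may see an unresolved value rather than the PID, because completion order is not guaranteed by the order of the R statements.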
I have been trying to troubleshoot https://github.com/wlandau/crew/issues/51, and I ran across an issue where workers with maxtasks = 1 sometimes return tasks showing "Error in envir[[\".expr\"]]: subscript out of bounds". Here is a reproducible example. I ran mirai 0.8.1.9003 with nanonext 0.8.0.9001 on R 4.2.1 on an Ubuntu machine.