shikokuchuo / mirai

mirai - Minimalist Async Evaluation Framework for R
https://shikokuchuo.net/mirai/
GNU General Public License v3.0
193 stars 10 forks source link

Return the PID from launch() #52

Closed wlandau closed 1 year ago

wlandau commented 1 year ago

As we discussed before, it would be nice if launch() returned the PID of the server process it launches. That way, I can replace callr::r_bg() with mirai::launch() in crew, which would allow me to drop a package dependency.

By the way, do you know if NNG has a replacement for getip::getip(type = "local")? If I could drop the getip package too, then crew would be just about as light as possible.

shikokuchuo commented 1 year ago

As we discussed before, it would be nice if launch() returned the PID of the server process it launches. That way, I can replace callr::r_bg() with mirai::launch() in crew, which would allow me to drop a package dependency.

Done in 8335e57 (v0.8.2.9013) on a roll today!

By the way, do you know if NNG has a replacement for getip::getip(type = "local")? If I could drop the getip package too, then crew would be just about as light as possible.

This is not in NNG. The trouble lies in catering to different platforms or I would just implement in nanonext...

shikokuchuo commented 1 year ago

Just a note - you will need to test the return value with is_error_value() as the PID is also an integer.

wlandau commented 1 year ago

Awesome!

on a roll today!

Indeed you are!

This implementation of launch() almost works for me, the only hitch is that crew actually runs its own wrapper around mirai:server(): https://github.com/wlandau/crew/blob/main/R/crew_worker.R. It doesn't do very much now that we solved https://github.com/wlandau/crew/issues/61, but it does set environment variables that allow tasks to report the exact workers and worker instances that run them. Would it be possible to accept an arbitrary piece of R code in launch()? crew already creates it with the call() method of the launcher as shown in https://wlandau.github.io/crew/articles/launcher_plugins.html#example.

shikokuchuo commented 1 year ago

So the call() method creates the entire call as a text string? That's fine... but you will need to supply a valid URL to the auth argument of server(). Are you fine to do that your end? Otherwise I will need access to the string and modify it.

Please just confirm the exact format as I will need to add the auth param on my side.

wlandau commented 1 year ago

So the call() method creates the entire call as a text string?

Exactly. I feel this is the best way to interface with e.g. SLURM and other platforms outside of R for third-party launchers.

Please just confirm the exact format as I will need to add the auth param on my side.

I worry that depending on particular features of launcher$call() might be brittle if I need to change the output format. What about exposing the env argument of system2()? If I have that, then I can supply the correct arguments to the existing launch(), plus the environment variables I need to set.

shikokuchuo commented 1 year ago

Does it make more sense to integrate into crew directly? I've provided the interface for getting the PID through specifying the auth argument. The server will send the PID over a bus socket to the URL provided by auth.

You'd just need this part of the launch() code:

auth <- sprintf(.urlfmt, new_token())
sock <- socket(protocol = "bus", listen = auth)
r <- recv(sock, mode = 5L, block = 2000L)
close(sock)
r
shikokuchuo commented 1 year ago

Actually that feels more brittle to future changes. Do I just need to add an argument env to map to the system2 call?

wlandau commented 1 year ago

That will cover it, just an env argument for system2() has everything I need for the extra stuff in crew::crew_worker().

shikokuchuo commented 1 year ago

OK, done in c3c6ac0. v0.8.2.9014.

shikokuchuo commented 1 year ago

Launch is currently synchronous i.e. waits for the PID to be received. I think it would be better to do an asynchronous receive. Then the recvAio will resolve to the PID or a timeout error after 2s.

Actually synchronous is cleaner if you can live with it. Async poses the question of disposing of the socket once done etc.

wlandau commented 1 year ago

OK, done in c3c6ac0. v0.8.2.9014.

Thanks! I have updated crew to use mirai::launch() instead of callr::r_bg(). This eliminates the dependency on callr (although processx is still a Suggests: package for testing and vignettes).

Actually synchronous is cleaner if you can live with it. Async poses the question of disposing of the socket once done etc.

Synchronous is totally fine. I think anyone who uses it in targets pipelines will want persistent workers anyway, so a little bit of blocking startup time is not an issue.

wlandau commented 1 year ago

Hmmm... somehow I missed this from the system2() documentation:

On Windows, env is only supported for commands such as R and make which accept environment variables on their command line.

Indeed, when I set env in launch() on Windows, I get 'errorValue' int 5 | Timed out.

As a stopgap for now, I plan to revert crew to directly call processx (I don't think I need all of callr). I already use processx for vignettes and testing, and even a little bit of robustness is worth the extra package dependency. In the meantime, if you happen to find another way to integrate launch(), please let me know.

shikokuchuo commented 1 year ago

Can we just use R instead of Rscript then for Windows?

wlandau commented 1 year ago

Would that help? Are you thinking about an alternative to system2()? Sorry, I don't think I follow entirely.

wlandau commented 1 year ago

In the meantime, if you happen to find another way to integrate launch(), please let me know.

On reflection, I actually think I prefer processx regardless. During the process of coding with mirai::launch(), I noticed ps::ps_kill() throws an error if the process already exited. I ended up wrapping it in try(), but that does not seem very clean. The processx kill() method is silent if the process already terminated, which seems a bit more robust and worth the dependency on processx. Sorry I did not realize this before, it could have saved you some work.

shikokuchuo commented 1 year ago

In the meantime, if you happen to find another way to integrate launch(), please let me know.

On reflection, I actually think I prefer processx regardless. During the process of coding with mirai::launch(), I noticed ps::ps_kill() throws an error if the process already exited. I ended up wrapping it in try(), but that does not seem very clean. The processx kill() method is silent if the process already terminated, which seems a bit more robust and worth the dependency on processx. Sorry I did not realize this before, it could have saved you some work.

I was always under the impression that callr was a wrapper of processx, which was a wrapper of ps... Just had a quick look at the processx code (just the R side) - tons of calling handlers... your one tryCatch() is going to be a lot cleaner I'd think!

But I think processx is a good quality dependency, so could work well for crew. Then I can revert launch() to a simpler form without getting the PIDs.

wlandau commented 1 year ago

Sounds like a plan!