Closed psychelzh closed 9 months ago
`crew` uses `mirai`, and the behavior you see is due to https://github.com/shikokuchuo/mirai/issues/87 (see also https://github.com/shikokuchuo/mirai/issues/86#issuecomment-1846032626). Unfortunately, there is nothing I can do from the perspective of `crew` or `targets`, but those `crew` workers will exit on their own once they finish the tasks that were in progress during the interrupt.
`targets` does have a callback to manually terminate its `crew` workers on error.
Unfortunately, because the pipeline runs in a background `callr` process, the `on.exit()` callbacks are skipped when the inner process is abruptly terminated. I think counteracting this behavior would require installing non-default operating system signal handlers, which seems risky.
That said, I may reopen this issue, depending on the result of https://github.com/shikokuchuo/mirai/issues/87
From https://github.com/shikokuchuo/mirai/issues/87 and https://github.com/shikokuchuo/mirai/pull/88, it looks like there will eventually be support from `mirai` via `nanonext`. Reopening.
Fixed in aeeef8f19895418e2015b1227c84934409527a9c. Requires development `nanonext` and `mirai` for now.
If the pipeline is interrupted, worker processes now send themselves `SIGKILL` to force quit abruptly. If you want a softer exit which allows cleanup/writing to complete before exiting, you can supply `tools::SIGINT` to the `signal` argument of `crew_controller_local()`. It will take some time before I can propagate the `signal` argument to launcher plugins in other packages.
Rethinking that choice, c.f. https://github.com/shikokuchuo/mirai/issues/87#issuecomment-1852426711:
An update: as I was testing just now, I found out that `tools::SIGKILL` is actually `NA` on Windows. That led me to do some more digging on signals, and I learned that `SIGKILL` on a parent process can lead to zombie child processes. Zombies would defeat the whole purpose of my original preference for `SIGKILL`, so I am moving away from it. Instead, `crew` now prefers SIGTERM because the intention to kill the process is more explicit than SIGINT. If for some reason that signal is not defined on the user's platform, then it uses SIGQUIT. And if SIGQUIT is not defined, it uses SIGINT as a last resort. I don't think I will expose the choice of signal to `crew` users.
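The fallback order described above (SIGTERM, then SIGQUIT, then SIGINT) can be sketched in a few lines of R. This is only an illustration under stated assumptions, not the actual implementation in `crew`: the helper name `choose_termination_signal()` is hypothetical, and the sketch relies on the signal constants exported by the `tools` package being `NA` on platforms that do not define them.

```r
# Hypothetical helper (not part of crew): pick the first termination signal
# that is defined on the current platform, in the preference order
# SIGTERM > SIGQUIT > SIGINT.
choose_termination_signal <- function(
  candidates = c(
    SIGTERM = tools::SIGTERM,
    SIGQUIT = tools::SIGQUIT,
    SIGINT = tools::SIGINT
  )
) {
  # Signal constants in the tools package are NA where the platform
  # does not define them, so drop those before choosing.
  defined <- candidates[!is.na(candidates)]
  if (length(defined) < 1L) {
    stop("none of the candidate signals is defined on this platform")
  }
  # Return a named integer of length 1: the most preferred defined signal.
  defined[1L]
}

signal <- choose_termination_signal()
message("would send ", names(signal), " (", signal, ")")
```

A process could then pass the chosen value to something like `tools::pskill(Sys.getpid(), signal = signal)` to terminate itself, though how `crew` and `mirai` actually deliver the signal is not shown in this thread.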
Thanks for all the careful consideration of this. I can confirm that the development version works perfectly as expected on Windows, too.
Prework
`crew` package itself and not a user error, known limitation, or issue from another package that `crew` depends on. For example, if you get errors running `tar_make_clustermq()`, try isolating the problem in a reproducible example that runs `clustermq` and not `crew`. And for miscellaneous troubleshooting, please post to discussions instead of issues.
Description
I am using the latest 0.7 version of the `crew` package. After I run `targets::tar_make()`, the front-end R terminal launches back-end R instances. However, if I cancel the pipeline manually, these instances do not close and continue to run until they finish.
Reproducible example
This pipeline is very useful to debug this problem.
Expected result
After I cancel the pipeline, all back-end R instances should terminate automatically.