jaywonchung / pegasus

An SSH command runner with a focus on simplicity
MIT License

Cancelling commands #11

Open jaywonchung opened 2 years ago

jaywonchung commented 2 years ago

Cancelling commands run by Pegasus is very difficult. You essentially have to ssh into each node, manually figure out the PIDs of the commands, and kill them.
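To make the manual procedure concrete, here is a minimal Python sketch that only builds the ssh command lines one would run per node. It is not part of Pegasus: the helper names (`build_find_command`, `build_kill_command`), the host name, and the PIDs are hypothetical, and it assumes `pgrep` and `kill` are available on the remote nodes.

```python
import shlex


def build_find_command(host: str, pattern: str) -> list[str]:
    """Build an ssh invocation that lists PIDs whose command line matches pattern.

    Assumes pgrep is installed on the remote host. Matches can include
    wrapper shells (e.g. `sh -c "..."`), so the output still needs a human look.
    """
    return ["ssh", host, f"pgrep -f {shlex.quote(pattern)}"]


def build_kill_command(host: str, pids: list[int], signal: str = "TERM") -> list[str]:
    """Build an ssh invocation that sends a signal to the given PIDs on host."""
    if not pids:
        raise ValueError("no PIDs given")
    return ["ssh", host, "kill", f"-{signal}", *map(str, pids)]


if __name__ == "__main__":
    # Hypothetical node name and PIDs, for illustration only.
    print(shlex.join(build_find_command("node1", "python train.py")))
    print(shlex.join(build_kill_command("node1", [4242, 4243])))
```

The find step still leaves the choice of which PIDs to kill to a human, and that is the part that does not scale across many nodes.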

Nested commands, so to speak, make things more complicated. For instance, docker exec sh -c "python train.py" will run the following commands:

Only killing the fourth python train.py command will truly achieve cancellation. The bottom line is that it is difficult for Pegasus to infer how to properly terminate a command.
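One heuristic for ordinary nested commands (a shell wrapping a shell wrapping the real program) is to walk the process tree from the top-level PID and target only the leaves. The sketch below illustrates that idea in Python under stated assumptions: `leaf_descendants` is a hypothetical helper, the PIDs are made up, and the pid-to-parent mapping is assumed to have been parsed from something like `ps -eo pid,ppid`. It would not be enough for docker exec, where the process inside the container is typically started by the container runtime and is not a descendant of the docker client process.

```python
from collections import defaultdict


def leaf_descendants(processes: dict[int, int], root: int) -> list[int]:
    """Return the PIDs at or below root that have no child processes.

    processes maps pid -> parent pid. If root has no recorded children,
    root itself is returned as the only leaf.
    """
    children = defaultdict(list)
    for pid, ppid in processes.items():
        children[ppid].append(pid)

    leaves: list[int] = []
    seen: set[int] = set()  # guards against malformed, cyclic input
    stack = [root]
    while stack:
        pid = stack.pop()
        if pid in seen:
            continue
        seen.add(pid)
        kids = children.get(pid, [])
        if kids:
            stack.extend(kids)
        else:
            leaves.append(pid)
    return sorted(leaves)


if __name__ == "__main__":
    # Hypothetical chain of wrappers: 100 -> 101 -> 102 -> 103.
    table = {100: 1, 101: 100, 102: 101, 103: 102, 200: 1}
    print(leaf_descendants(table, 100))
```

Even where this works, killing a leaf does not guarantee that the wrappers above it exit cleanly, so it is a starting point rather than a general solution.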

Potential solutions

jaywonchung commented 2 years ago

Commands run with docker exec currently have no standard way to be killed.

I'm following https://github.com/moby/moby/pull/41548 for this.