con / duct

A helper to run a command, capture stdout/stderr and details about running
https://pypi.org/project/con-duct/
MIT License

reports negative exit code -- should we treat as a feature or "fix up" #148

Closed yarikoptic closed 4 days ago

yarikoptic commented 3 weeks ago
❯ duct strace -t -f -o .tmp/`mdate`-nvidia-smi-strace-1.log nvidia-smi
...
Exit Code: -9

whereas

❯ strace -t -f -o .tmp/`mdate`-nvidia-smi-strace-2.log nvidia-smi ; echo $?
...
137

which is due to https://docs.python.org/3/library/multiprocessing.html#multiprocessing.Process.exitcode

If the child terminated due to an exception not caught within run(), the exit code will be 1. If it was terminated by signal N, the exit code will be the negative value -N.
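For reference, the subprocess module documents the same convention for Popen.returncode (a negative value -N means the child was terminated by signal N, POSIX only). A minimal sketch that reproduces the negative value, assuming a POSIX system; the sleeping child is only a stand-in for the real command:

```python
import signal
import subprocess
import sys

# Start a child that would sleep for a minute, then kill it with SIGKILL.
proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
proc.send_signal(signal.SIGKILL)
returncode = proc.wait()

# On POSIX this is the negated signal number (-9 where SIGKILL is 9).
print(returncode)
```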

But isn't that "Python-specific behavior", and shouldn't we generally store unsigned ints? WDYT @jwodder?

asmacdo commented 3 weeks ago

Weird, even with that behavior I was expecting the error to be -137, not -9.

jwodder commented 3 weeks ago

If we're talking about what exit code duct should print when the subprocess is killed by a signal, I think "Exit Code: {n}" should be forgone in such circumstances and be replaced by something like "Killed by signal: {signal_name}".
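A rough sketch of how such a summary line could be produced from a Python return code. The helper name `describe_exit` is hypothetical and not part of duct; it only assumes the return code follows Python's "-N means signal N" convention:

```python
import signal


def describe_exit(returncode: int) -> str:
    """Format a subprocess return code for a summary line (hypothetical helper)."""
    if returncode < 0:
        # Negative values follow Python's convention: -N means "killed by signal N".
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            # No symbolic name for this signal number on this platform.
            name = str(-returncode)
        return f"Killed by signal: {name}"
    return f"Exit Code: {returncode}"


print(describe_exit(-signal.SIGKILL))  # Killed by signal: SIGKILL
print(describe_exit(0))                # Exit Code: 0
```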

If we're talking about what exit code duct should exit with when the subprocess is killed by a signal, I think we should at least try to follow POSIX/SUS's rules for $? in shells:

if the command terminated due to the receipt of a signal, the shell shall assign it an exit status greater than 128. The exit status shall identify, in an implementation-defined manner, which signal terminated the command. Note that shell implementations are permitted to assign an exit status greater than 255 if a command terminates due to a signal.
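POSIX only requires a status greater than 128 and leaves the exact mapping implementation-defined. The sketch below assumes the common 128 + N convention, which matches the 137 the shell printed for SIGKILL above. The helper name `shell_exit_status` is hypothetical:

```python
def shell_exit_status(returncode: int) -> int:
    """Map a Python subprocess return code to a shell-style exit status (hypothetical helper)."""
    if returncode < 0:
        # Python reports "killed by signal N" as -N; assume the 128 + N convention.
        return 128 - returncode
    return returncode


print(shell_exit_status(-9))  # 137
print(shell_exit_status(0))   # 0
```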

yarikoptic commented 3 weeks ago

Good idea on reporting the signal name, let's do that @asmacdo.

As for the exit code -- we need to fix it, since ATM we wrap it around (-9 becomes 247) instead of adding its absolute value to 128 (which would give 137):

❯ duct strace -t -f -o /tmp/`mdate`-nvidia-smi-strace-1.log nvidia-smi ; echo $?
2024-08-20T09:30:32-0400 [INFO    ] con-duct: duct is executing 'strace -t -f -o /tmp/20240820-nvidia-smi-strace-1.log nvidia-smi'...
2024-08-20T09:30:32-0400 [INFO    ] con-duct: Log files will be written to .duct/logs/2024.08.20T09.30.32-3229767_
Tue Aug 20 09:30:34 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01             Driver Version: 535.183.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
2024-08-20T09:30:54-0400 [INFO    ] con-duct: Summary:
Exit Code: -9
...
duct strace -t -f -o /tmp/`mdate`-nvidia-smi-strace-1.log nvidia-smi  2.32s user 13.71s system 71% cpu 22.288 total
247
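The 247 is what -9 becomes once the exit status is truncated to 8 bits. A small sketch of that wrap-around, assuming a POSIX system and assuming the wrapper hands the negative return code straight to sys.exit() (not confirmed from duct's source here):

```python
import subprocess
import sys

# Exit statuses are truncated to 8 bits, so a negative value wraps around.
wrapped = -9 % 256
print(wrapped)  # 247

# Assumption: the negative return code is passed directly to sys.exit().
result = subprocess.run([sys.executable, "-c", "import sys; sys.exit(-9)"])
observed = result.returncode
print(observed)  # 247 on POSIX
```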