oilshell / oil

Oils is our upgrade path from bash to a better language and runtime. It's also for Python and JavaScript users who avoid shell!
http://www.oilshell.org/

SIGINT and SIGTSTP break out of loops in bash/zsh, but not OSH #1970

Open greggyb opened 1 month ago

greggyb commented 1 month ago

EDIT: Forgot to mention running 0.21.0 release on Ubuntu 22.04; built with no customization, just ./configure && _build/oils.sh && sudo ./install.

In Bash:

$ x=0
$ while test $x -le 10; do printf "x is %i\n" $x; x=$((x+1)); sleep 1; done
x is 0
x is 1
^C
$ 

In osh:

$ x=0
$ while test $x -le 10; do printf "x is %i\n" $x; x=$((x+1)); sleep 1; done
x is 0
x is 1
^C
x is 2
^C
x is 3
^C
x is 4
^C
x is 5
^C
x is 6
^C
x is 7
^C
x is 8
^C
x is 9
^C
x is 10
^C
$ 

If you accidentally write a loop wrong and want to terminate it or if you accidentally write an infinite loop, you need to kill the shell from somewhere else. I was in tmux, so it was easy to kill the pane. Hopefully your loop is CPU intensive so you can find it more easily.

It seems like the signal is being passed to the sleep command (or whatever is running in the loop), and we never hit the while loop. I assume Bash is doing some special handling here?
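
A possible workaround, sketched under assumptions rather than taken from OSH's documentation: a shell reports a command killed by signal N with exit status 128+N, so the loop can inspect the status of its body command and leave on its own. The function name `count_until_interrupted` is made up for this illustration, and `sh -c 'kill -INT $$'` stands in for a `sleep 1` that receives Ctrl-C, so the sketch can run without a keyboard.

```shell
# Sketch: leave the loop when the loop-body command dies from a signal.
# A command killed by signal N is reported with exit status 128+N
# (SIGINT is signal 2, so the status is 130).
# count_until_interrupted is a hypothetical name used only for this example.
count_until_interrupted() {
  x=0
  while test "$x" -le 10; do
    printf 'x is %i\n' "$x"
    x=$((x+1))
    status=0
    # Stand-in for `sleep 1` interrupted by Ctrl-C: a child that sends
    # SIGINT to itself. In real use, write `sleep 1 || status=$?` here.
    sh -c 'kill -INT $$' || status=$?
    if test "$status" -gt 128; then
      printf 'command died from signal %i, leaving loop\n' "$((status - 128))"
      break
    fi
  done
}

count_until_interrupted
```

The shorter form `sleep 1 || break` follows the same idea, at the cost of also leaving the loop when the command fails for any other reason.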

There is an additional difference with Ctrl-Z, though I'm not sure what the correct behavior should be. In Bash, Ctrl-Z stops the process currently running in the loop body and terminates the loop; fg then foregrounds that stopped process:

In Bash:

$ x=1
$ while test $x -le 10; do printf 'x is %i\n' $x; sleep 1; x=$((x+1)); done
x is 1
x is 2
^Z
[1]+  Stopped                 sleep 1
$ jobs
[1]+  Stopped                 sleep 1
$ fg
sleep 1
$ jobs
$ 

In osh (note that a different PID is stopped each time):

$ x=1
$ while test $x -le 10; do printf 'x is %i\n' $x; sleep 1; x=$((x+1)); done
x is 1
^Z
[PID 235647] Stopped with signal 20
x is 2
^Z
[PID 235653] Stopped with signal 20
x is 3
^Z
[PID 235654] Stopped with signal 20
x is 4
^Z
[PID 235655] Stopped with signal 20
x is 5
^Z
[PID 235658] Stopped with signal 20
x is 6
x is 7
x is 8
x is 9
x is 10
$ jobs
%1 235647 Stopped [process] sleep 1
%2 235653 Stopped [process] sleep 1
%3 235654 Stopped [process] sleep 1
%4 235655 Stopped [process] sleep 1
%5 235658 Stopped [process] sleep 1
$ fg
Continue PID 235658
$ fg
Continue PID 235655
$ fg
Continue PID 235654
$ fg
Continue PID 235653
$ fg
Continue PID 235647
$ jobs
$ 

The SIGTSTP is passed to each sleep process, so we end up with many backgrounded jobs. When the sleep is backgrounded, control returns immediately to the loop in osh, and the next line is printed without delay.

Same exact behaviors identified in for x in 1 2 3 ...; do ...; done loops with the same loop body.

andychu commented 1 month ago

OK interesting, I reproduced this

I guess bash turns it into 'break' essentially.

Although I noticed it breaks out of nested loops too, so not quite

$ x=0 y=0
$ while test $y -le 10; do while test $x -lt 10; do printf "x is %i\n" $x; x=$((x+1)); sleep 1; done; y=$((y+1)); done
x is 0
x is 1
^C
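
For comparison, the closest explicit construct is `break N`, which leaves N enclosing loops at once. The sketch below is only an illustration of that construct, not a description of how bash handles the signal internally; the function name `nested_demo` and the choice to stop when x reaches 2 are assumptions made for the example.

```shell
# Illustration only: `break 2` leaves the inner and the outer loop together,
# similar to what Ctrl-C appears to do in the bash session above.
# nested_demo is a hypothetical name used only for this example.
nested_demo() {
  y=0
  while test "$y" -le 10; do
    x=0
    while test "$x" -lt 10; do
      printf 'x is %i\n' "$x"
      x=$((x+1))
      if test "$x" -eq 2; then
        break 2   # exit both loops, not just the inner one
      fi
    done
    y=$((y+1))    # never reached in this run, so y stays 0
  done
  printf 'left both loops with x=%i y=%i\n' "$x" "$y"
}

nested_demo
```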

greggyb commented 1 month ago

For SIGINT, yes, it seems like a super-break. For SIGTSTP, though, it does stop the currently running command in the loop body, and that command can be resumed afterward, so it is a combination of SIGTSTP for the command executing in the loop body and breaking the loop.

andychu commented 1 month ago

Oh geez :-/

Is this something you hit in normal usage, or just testing out how OSH behaves?

greggyb commented 1 month ago

Sometimes I forget that watch exists when I'm on Linux and do while true; do <command>; sleep 1; done. And on systems that don't have watch installed by default (FreeBSD and OpenBSD for me), I've never actually installed it, so I use a loop like above to get the same effect for myself.

I don't know that I have ever cared, nor would I ever care in the future, what happens to the commands in the loop body, but I do want SIGINT to break out of the loop for me. Not sure I care about SIGTSTP doing anything special. Honestly, I'd probably be just as happy with any keyboard-sendable signal that can break out of a running loop interactively. That is important. I don't think I care about the specifics.

As for others? As I'm sure you know better than most, just about everything in Bash is likely depended on by someone, somewhere.

greggyb commented 1 month ago

In the meantime, wrapping the loop in a subshell gets me kill-ability with SIGINT:

$ (x=1; while true; ....)
x is 1
x is 2
^C
$ 

So, for me in the short term this is not a huge issue. I think it is worth having similar SIGINT behavior to Bash, eventually.
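
The same idea can be packaged as a function whose body is a subshell, so the parentheses do not have to be typed each time. This is a rough sketch, assuming OSH accepts `( ... )` as a function body the way POSIX shells do; `repeat_n` and its `count` and `interval` parameters are hypothetical names, and the count only exists to keep the example bounded.

```shell
# Hypothetical helper: run a command `count` times, pausing `interval`
# seconds after each run. The body is wrapped in ( ... ) instead of { ... },
# so every call runs in its own subshell process, which can receive SIGINT
# or SIGTSTP as a whole.
repeat_n() (
  count=$1
  interval=$2
  shift 2
  i=0
  while test "$i" -lt "$count"; do
    "$@"
    i=$((i+1))
    sleep "$interval"
  done
)

i=outer                  # modified only inside the subshell, so it survives
repeat_n 3 0 echo tick
printf 'i is still %s\n' "$i"
```

For an endless watch-style loop, the count test could be replaced with `while true`. One trade-off of the subshell body is that variable assignments made inside the loop are not visible to the calling shell.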

greggyb commented 1 month ago

Sorry to spam with multiple posts in a row. I'm thinking about the subshell I just posted. I think you could make a strong correctness argument that that's how one should approach such things. In general, SIGINT and SIGTSTP do nothing to the running shell; they are only sent to the currently running command. I'm not sure, but are any shell builtins affected by SIGINT or SIGTSTP? If not, then there's a good argument to be made that loops shouldn't get special cases, and that the "correct" thing to do is to use a subshell if you want to send signals, because then you can properly send the signal to the process (the subshell) and have it work normally with the signals.

Specifically, a loop in a subshell (or any number of nested) can be terminated with SIGINT, just like any other process. And the whole loop can be suspended with SIGTSTP and resumed midway, rather than the funky behavior of suspending the currently running loop-body command, and making that one resumable, while breaking out of the loop.

Instead you get:

$ (for x in 1 2 3 4 5; do printf 'x is %i\n' $x; sleep 1; done)
x is 1
x is 2
x is 3
^Z
[PID 247893] Stopped with signal 20
$ fg
Continue PID 247893
x is 4
x is 5
$

And if you ask me what it means to suspend a loop, that looks a lot more reasonable.

If you decide not to change this behavior, I think you're on solid ground with regard to consistency, but it is definitely something worth documenting in the differences from Bash page.