ksh93 / ksh


Race condition in job notification #655

Open McDutchie opened 1 year ago

McDutchie commented 1 year ago

Originally posted by @JohnoKing in https://github.com/ksh93/ksh/issues/653#issuecomment-1582808981

FWIW, I just got the test failure below (on Arch Linux):

#### Regression-testing /home/johno/GitRepos/KornShell/ksh/arch/linux.i386-64/bin/ksh ####
test pty begins at 2023-06-08+08:18:03
test pty passed at 2023-06-08+08:18:42 [ 59 tests 0 errors ]
test pty(C.UTF-8) begins at 2023-06-08+08:18:42
    pty.sh[111]: FAIL: POSIX sh 026(C): line 127: expected "(Stopped|Suspended)", got EOF
test pty(C.UTF-8) failed at 2023-06-08+08:19:23 with exit code 1 [ 59 tests 1 error ]
Total errors: 1
CPU time       user:      system:
main:      0m00.003s    0m00.003s
tests:     0m00.590s    0m00.202s

I can only get this to happen while the CPU usage on all cores is around 100% during a heavy workload.

McDutchie commented 1 year ago

I have seen this before on various systems, especially slow ones, or Linux with musl libc.

There seems to be a race condition in job notifications (such as Stopped or Suspended): they are not always printed before the next prompt, as expected. When that happens, pressing Return to get another prompt causes the pending notification to be printed.

I can reproduce this quite easily on Alpine Linux arm64 (which uses musl libc):

$ echo ${.sh.version}; uname -a
Version AJM 93u+m/1.0.5 2023-06-07
Linux bergzicht.inlv.org 6.1.30-0-virt #1-Alpine SMP Fri, 26 May 2023 06:53:59 +0000 aarch64 GNU/Linux
$ sleep 60 &
[1] 23620
$ kill -s TSTP $!
[1] + Stopped                  sleep 60 &
$ sleep 60 &     
[2] 23621
$ kill -s TSTP $!
[2] + Stopped                  sleep 60 &
$ sleep 60 &     
[3] 23622
$ kill -s TSTP $!
$ # no Stopped prompt displayed here, pressing Return will show it
[3] + Stopped                  sleep 60 &
$ sleep 60 &                                                      
[4] 23623
$ kill -s TSTP $!                                                 
$ # and again
[4] + Stopped                  sleep 60 &
$ sleep 60 &     
[5] 23624
$ kill -s TSTP $!
[5] + Stopped                  sleep 60 &
$ sleep 60 &     
[6] 23625
$ kill -s TSTP $!
[6] + Stopped                  sleep 60 &
$ sleep 60 &     
[7] 23626
$ kill -s TSTP $!
$ # and again
[7] + Stopped                  sleep 60 &
$
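
To put a number on how often the notification arrives late, a saved copy of a session like the one above could be scanned with a small helper. This is only a sketch: `count_late_notifications` is a hypothetical name, not part of ksh or its regression tests, and it assumes the log has the same shape as the transcript above (a `$ ` prompt, `kill -s TSTP` as the command, and notification lines starting with `[N]`).

```shell
# Hypothetical helper (an assumption, not part of ksh or its test suite):
# count how many "kill -s TSTP" commands in a saved session log were NOT
# followed on the very next line by a "[N] + Stopped/Suspended" notification.
# Reads the named files, or standard input if no file is given.
count_late_notifications() {
    awk '
        # The previous line was a kill command: this line should be the notification.
        pending {
            if ($0 !~ /^\[[0-9]+\][ +-]*(Stopped|Suspended)/) late++
            pending = 0
        }
        # Remember that a kill command was just issued (assumes a "$ " prompt).
        /^\$ kill -s TSTP/ { pending = 1; total++ }
        END { printf "%d of %d notifications were late\n", late, total }
    ' "$@"
}

# Usage (session.log is a hypothetical file holding a captured session):
#   count_late_notifications session.log
```

For the Alpine session shown above, this would report the notifications for jobs 3, 4 and 7 as late. It only measures the symptom as seen in the log; it does not say anything about where in the shell the race occurs.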