ksh93 / ksh

ksh 93u+m: KornShell lives! | Latest release: https://github.com/ksh93/ksh/releases
Eclipse Public License 2.0
189 stars 31 forks source link

pty test random failure #653

Closed posguy99 closed 1 year ago

posguy99 commented 1 year ago

Just putting this out there. I know it's a race condition, you can run the test over and over and it'll fail fail fail, then it'll pass, then it'll fail again. This isn't new, it's just got enough inertia that I opened an issue for it.

ksh@504d645f on m1Pro macOS 12.6.6

...
test pty begins at 2023-06-04+18:41:50
test pty passed at 2023-06-04+18:42:28 [ 58 tests 0 errors ]
test pty(C.UTF-8) begins at 2023-06-04+18:42:28
    pty.sh[991]: FAIL: tab completion with space in string and -o noglob on: line 1001: expected "^:test-2: echo /private/var/folders/_h/w1zkv1tx5vq028wcyv5yhnnr0000gn/T/ksh93.shtests.85374.15052/pty.C.UTF-8/foo\\ bar \r\n$", got ":test-2: 000gn/T/ksh93.shtests.85374.15052/pty.C.UTF-8/foo\\ bar \r\n"
test pty(C.UTF-8) failed at 2023-06-04+18:43:06 with exit code 1 [ 58 tests 1 error ]
...
posguy99 commented 1 year ago

I should add that it doesn't happen on Debian Linux, but happens on both ASi and Intel macOS.

McDutchie commented 1 year ago

I'm also running macOS 12.6.6 on M1 Pro and I cannot reproduce this. I've been running the pty regression tests in a loop for half an hour and they all just keep passing.

posguy99 commented 1 year ago

Just now...

$ while true ; do bin/shtests pty && break ;  done         
#### Regression-testing /Users/mwilson/src/ksh/arch/darwin.arm64-64/bin/ksh ####
test pty begins at 2023-06-07+12:31:07
test pty passed at 2023-06-07+12:31:45 [ 59 tests 0 errors ]
test pty(C.UTF-8) begins at 2023-06-07+12:31:45
    pty.sh[999]: FAIL: tab completion with space in string and -o noglob on: line 1009: expected "^:test-2: echo /private/var/folders/_h/w1zkv1tx5vq028wcyv5yhnnr0000gn/T/ksh93.shtests.24869.12367/pty.C.UTF-8/foo\\ bar \r\n$", got ":test-2: 000gn/T/ksh93.shtests.24869.12367/pty.C.UTF-8/foo\\ bar \r\n"
test pty(C.UTF-8) failed at 2023-06-07+12:32:24 with exit code 1 [ 59 tests 1 error ]
Total errors: 1
CPU time       user:      system:
main:      0m00.003s    0m00.016s
tests:     0m01.079s    0m02.397s
#### Regression-testing /Users/mwilson/src/ksh/arch/darwin.arm64-64/bin/ksh ####
test pty begins at 2023-06-07+12:32:24
test pty passed at 2023-06-07+12:33:02 [ 59 tests 0 errors ]
test pty(C.UTF-8) begins at 2023-06-07+12:33:02
    pty.sh[999]: FAIL: tab completion with space in string and -o noglob on: line 1009: expected "^:test-2: echo /private/var/folders/_h/w1zkv1tx5vq028wcyv5yhnnr0000gn/T/ksh93.shtests.25812.31614/pty.C.UTF-8/foo\\ bar \r\n$", got ":test-2: 000gn/T/ksh93.shtests.25812.31614/pty.C.UTF-8/foo\\ bar \r\n"
test pty(C.UTF-8) failed at 2023-06-07+12:33:41 with exit code 1 [ 59 tests 1 error ]
Total errors: 1
CPU time       user:      system:
main:      0m00.004s    0m00.011s
tests:     0m01.059s    0m02.219s
#### Regression-testing /Users/mwilson/src/ksh/arch/darwin.arm64-64/bin/ksh ####
test pty begins at 2023-06-07+12:33:41
test pty passed at 2023-06-07+12:34:19 [ 59 tests 0 errors ]
test pty(C.UTF-8) begins at 2023-06-07+12:34:19
    pty.sh[999]: FAIL: tab completion with space in string and -o noglob on: line 1009: expected "^:test-2: echo /private/var/folders/_h/w1zkv1tx5vq028wcyv5yhnnr0000gn/T/ksh93.shtests.26740.15146/pty.C.UTF-8/foo\\ bar \r\n$", got ":test-2: 000gn/T/ksh93.shtests.26740.15146/pty.C.UTF-8/foo\\ bar \r\n"
test pty(C.UTF-8) failed at 2023-06-07+12:34:58 with exit code 1 [ 59 tests 1 error ]
Total errors: 1
CPU time       user:      system:
main:      0m00.003s    0m00.010s
tests:     0m01.081s    0m02.247s
#### Regression-testing /Users/mwilson/src/ksh/arch/darwin.arm64-64/bin/ksh ####
test pty begins at 2023-06-07+12:34:58
test pty passed at 2023-06-07+12:35:36 [ 59 tests 0 errors ]
test pty(C.UTF-8) begins at 2023-06-07+12:35:36
test pty(C.UTF-8) passed at 2023-06-07+12:36:14 [ 59 tests 0 errors ]
Total errors: 0
CPU time       user:      system:
main:      0m00.003s    0m00.010s
tests:     0m01.059s    0m02.218s
$ 
McDutchie commented 1 year ago

Do you compile ksh with any custom CCFLAGS?

posguy99 commented 1 year ago

No. And whenever I build, I start cold.

$ bin/package clean && bin/package make && bin/package test

posguy99 commented 1 year ago

Ok, and now, I just built @fa03647f. And all tests passed first time.

Then I did this, and got this:

$ while true ; do bin/shtests pty && break ;  done         
#### Regression-testing /Users/mwilson/src/ksh/arch/darwin.arm64-64/bin/ksh ####
test pty begins at 2023-06-07+19:27:38
test pty passed at 2023-06-07+19:28:16 [ 59 tests 0 errors ]
test pty(C.UTF-8) begins at 2023-06-07+19:28:16
    pty.sh[999]: FAIL: tab completion with space in string and -o noglob on: line 1009: expected "^:test-2: echo /private/var/folders/_h/w1zkv1tx5vq028wcyv5yhnnr0000gn/T/ksh93.shtests.39175.19229/pty.C.UTF-8/foo\\ bar \r\n$", got ":test-2: 000gn/T/ksh93.shtests.39175.19229/pty.C.UTF-8/foo\\ bar \r\n"
test pty(C.UTF-8) failed at 2023-06-07+19:28:56 with exit code 1 [ 59 tests 1 error ]
Total errors: 1
CPU time       user:      system:
main:      0m00.003s    0m00.013s
tests:     0m01.120s    0m02.356s
#### Regression-testing /Users/mwilson/src/ksh/arch/darwin.arm64-64/bin/ksh ####
test pty begins at 2023-06-07+19:28:57
    pty.sh[732]: FAIL: crash after switching from emacs to vi mode: line 745: read timeout
test pty failed at 2023-06-07+19:29:36 with exit code 1 [ 59 tests 1 error ]
test pty(C.UTF-8) begins at 2023-06-07+19:29:36
    pty.sh[783]: FAIL: Using b, B, w and W commands in vi mode: line 792: expected "^:test-2: echo asdf\r\n$", got ":test-1: set -o vi\r\n"
    pty.sh[783]: FAIL: Using b, B, w and W commands in vi mode: line 793: expected "^asdf\r\n$", got ":test-2: echo asdf^[bwBWa\r\n"
test pty(C.UTF-8) failed at 2023-06-07+19:30:16 with exit code 2 [ 59 tests 2 errors ]
Total errors: 3
CPU time       user:      system:
main:      0m00.004s    0m00.012s
tests:     0m01.323s    0m02.676s
#### Regression-testing /Users/mwilson/src/ksh/arch/darwin.arm64-64/bin/ksh ####
test pty begins at 2023-06-07+19:30:16
test pty passed at 2023-06-07+19:30:55 [ 59 tests 0 errors ]
test pty(C.UTF-8) begins at 2023-06-07+19:30:55
    pty.sh[999]: FAIL: tab completion with space in string and -o noglob on: line 1009: expected "^:test-2: echo /private/var/folders/_h/w1zkv1tx5vq028wcyv5yhnnr0000gn/T/ksh93.shtests.41263.14944/pty.C.UTF-8/foo\\ bar \r\n$", got ":test-2: 000gn/T/ksh93.shtests.41263.14944/pty.C.UTF-8/foo\\ bar \r\n"
test pty(C.UTF-8) failed at 2023-06-07+19:31:33 with exit code 1 [ 59 tests 1 error ]
Total errors: 1
CPU time       user:      system:
main:      0m00.004s    0m00.012s
tests:     0m01.424s    0m02.690s
#### Regression-testing /Users/mwilson/src/ksh/arch/darwin.arm64-64/bin/ksh ####
test pty begins at 2023-06-07+19:31:34
test pty passed at 2023-06-07+19:32:12 [ 59 tests 0 errors ]
test pty(C.UTF-8) begins at 2023-06-07+19:32:12
    pty.sh[999]: FAIL: tab completion with space in string and -o noglob on: line 1009: expected "^:test-2: echo /private/var/folders/_h/w1zkv1tx5vq028wcyv5yhnnr0000gn/T/ksh93.shtests.42259.10263/pty.C.UTF-8/foo\\ bar \r\n$", got ":test-2: 000gn/T/ksh93.shtests.42259.10263/pty.C.UTF-8/foo\\ bar \r\n"
test pty(C.UTF-8) failed at 2023-06-07+19:32:53 with exit code 1 [ 59 tests 1 error ]
Total errors: 1
CPU time       user:      system:
main:      0m00.004s    0m00.013s
tests:     0m01.380s    0m02.638s
#### Regression-testing /Users/mwilson/src/ksh/arch/darwin.arm64-64/bin/ksh ####
test pty begins at 2023-06-07+19:32:53
test pty passed at 2023-06-07+19:33:31 [ 59 tests 0 errors ]
test pty(C.UTF-8) begins at 2023-06-07+19:33:31
    pty.sh[611]: FAIL: race condition while launching external commands: line 621: expected "^/dev/null\r\n$", got ":test-1: printf '%s\\n' 1 2 3 4 5 | while read; do ls /dev/null; done\r\n"
    pty.sh[611]: FAIL: race condition while launching external commands: line 626: expected "^:test-2:", got "/dev/null\r\n"
    pty.sh[999]: FAIL: tab completion with space in string and -o noglob on: line 1009: expected "^:test-2: echo /private/var/folders/_h/w1zkv1tx5vq028wcyv5yhnnr0000gn/T/ksh93.shtests.43230.22799/pty.C.UTF-8/foo\\ bar \r\n$", got ":test-2: 000gn/T/ksh93.shtests.43230.22799/pty.C.UTF-8/foo\\ bar \r\n"
    pty.sh[1262]: FAIL: command completion after init and after TERM change: line 1269: expected "cmd_complete_me \r\n$", got ":test-1: cmd_complet\t\r\n"
test pty(C.UTF-8) failed at 2023-06-07+19:34:09 with exit code 4 [ 59 tests 4 errors ]
Total errors: 4
CPU time       user:      system:
main:      0m00.004s    0m00.013s
tests:     0m01.342s    0m02.823s
#### Regression-testing /Users/mwilson/src/ksh/arch/darwin.arm64-64/bin/ksh ####
test pty begins at 2023-06-07+19:34:09
test pty passed at 2023-06-07+19:34:50 [ 59 tests 0 errors ]
test pty(C.UTF-8) begins at 2023-06-07+19:34:50
    pty.sh[999]: FAIL: tab completion with space in string and -o noglob on: line 1009: expected "^:test-2: echo /private/var/folders/_h/w1zkv1tx5vq028wcyv5yhnnr0000gn/T/ksh93.shtests.44217.27777/pty.C.UTF-8/foo\\ bar \r\n$", got ":test-2: 000gn/T/ksh93.shtests.44217.27777/pty.C.UTF-8/foo\\ bar \r\n"
test pty(C.UTF-8) failed at 2023-06-07+19:35:29 with exit code 1 [ 59 tests 1 error ]
Total errors: 1
CPU time       user:      system:
main:      0m00.004s    0m00.012s
tests:     0m01.341s    0m02.593s
#### Regression-testing /Users/mwilson/src/ksh/arch/darwin.arm64-64/bin/ksh ####
test pty begins at 2023-06-07+19:35:29
test pty passed at 2023-06-07+19:36:07 [ 59 tests 0 errors ]
test pty(C.UTF-8) begins at 2023-06-07+19:36:07
    pty.sh[999]: FAIL: tab completion with space in string and -o noglob on: line 1009: expected "^:test-2: echo /private/var/folders/_h/w1zkv1tx5vq028wcyv5yhnnr0000gn/T/ksh93.shtests.45192.15947/pty.C.UTF-8/foo\\ bar \r\n$", got ":test-2: 000gn/T/ksh93.shtests.45192.15947/pty.C.UTF-8/foo\\ bar \r\n"
test pty(C.UTF-8) failed at 2023-06-07+19:36:46 with exit code 1 [ 59 tests 1 error ]
Total errors: 1
CPU time       user:      system:
main:      0m00.004s    0m00.013s
tests:     0m01.359s    0m02.629s
#### Regression-testing /Users/mwilson/src/ksh/arch/darwin.arm64-64/bin/ksh ####
test pty begins at 2023-06-07+19:36:46
test pty passed at 2023-06-07+19:37:25 [ 59 tests 0 errors ]
test pty(C.UTF-8) begins at 2023-06-07+19:37:25
test pty(C.UTF-8) passed at 2023-06-07+19:38:03 [ 59 tests 0 errors ]
Total errors: 0
CPU time       user:      system:
main:      0m00.004s    0m00.012s
tests:     0m01.369s    0m02.668s
[ 7:38 PM][ttys000 +1][~/src/ksh][dev@fa03647f]
$ 

All that other failure in the second test is new. Did the same thing again and it's back to the fail fail on line 1009 and finally after three tries it passes it again.

$ while true ; do bin/shtests pty && break ;  done
#### Regression-testing /Users/mwilson/src/ksh/arch/darwin.arm64-64/bin/ksh ####
test pty begins at 2023-06-07+20:09:46
test pty passed at 2023-06-07+20:10:24 [ 59 tests 0 errors ]
test pty(C.UTF-8) begins at 2023-06-07+20:10:24
    pty.sh[999]: FAIL: tab completion with space in string and -o noglob on: line 1009: expected "^:test-2: echo /private/var/folders/_h/w1zkv1tx5vq028wcyv5yhnnr0000gn/T/ksh93.shtests.48859.15174/pty.C.UTF-8/foo\\ bar \r\n$", got ":test-2: 000gn/T/ksh93.shtests.48859.15174/pty.C.UTF-8/foo\\ bar \r\n"
test pty(C.UTF-8) failed at 2023-06-07+20:11:03 with exit code 1 [ 59 tests 1 error ]
Total errors: 1
CPU time       user:      system:
main:      0m00.003s    0m00.011s
tests:     0m00.952s    0m01.804s
#### Regression-testing /Users/mwilson/src/ksh/arch/darwin.arm64-64/bin/ksh ####
test pty begins at 2023-06-07+20:11:03
test pty passed at 2023-06-07+20:11:41 [ 59 tests 0 errors ]
test pty(C.UTF-8) begins at 2023-06-07+20:11:41
    pty.sh[999]: FAIL: tab completion with space in string and -o noglob on: line 1009: expected "^:test-2: echo /private/var/folders/_h/w1zkv1tx5vq028wcyv5yhnnr0000gn/T/ksh93.shtests.49841.13850/pty.C.UTF-8/foo\\ bar \r\n$", got ":test-2: 000gn/T/ksh93.shtests.49841.13850/pty.C.UTF-8/foo\\ bar \r\n"
test pty(C.UTF-8) failed at 2023-06-07+20:12:20 with exit code 1 [ 59 tests 1 error ]
Total errors: 1
CPU time       user:      system:
main:      0m00.003s    0m00.009s
tests:     0m01.013s    0m01.755s
#### Regression-testing /Users/mwilson/src/ksh/arch/darwin.arm64-64/bin/ksh ####
test pty begins at 2023-06-07+20:12:20
test pty passed at 2023-06-07+20:12:57 [ 59 tests 0 errors ]
test pty(C.UTF-8) begins at 2023-06-07+20:12:57
test pty(C.UTF-8) passed at 2023-06-07+20:13:36 [ 59 tests 0 errors ]
Total errors: 0
CPU time       user:      system:
main:      0m00.003s    0m00.010s
tests:     0m01.145s    0m02.170s
[ 8:13 PM][ttys000 +1][~/src/ksh][dev@fa03647f]
$ 
JohnoKing commented 1 year ago

FWIW, I just got the test failure below (on Arch Linux):

#### Regression-testing /home/johno/GitRepos/KornShell/ksh/arch/linux.i386-64/bin/ksh ####
test pty begins at 2023-06-08+08:18:03
test pty passed at 2023-06-08+08:18:42 [ 59 tests 0 errors ]
test pty(C.UTF-8) begins at 2023-06-08+08:18:42
    pty.sh[111]: FAIL: POSIX sh 026(C): line 127: expected "(Stopped|Suspended)", got EOF
test pty(C.UTF-8) failed at 2023-06-08+08:19:23 with exit code 1 [ 59 tests 1 error ]
Total errors: 1
CPU time       user:      system:
main:      0m00.003s    0m00.003s
tests:     0m00.590s    0m00.202s

I can only get this to happen while the CPU usage on all cores is around 100% during a heavy workload.

McDutchie commented 1 year ago

@JohnoKing's failure at least makes some sense: there does seem to be a race condition in job notifications that may stop them from being printed before the next prompt. I have seen that failure before.

I'm completely stumped about @posguy99's report though. I cannot reproduce it, though I'm running an identical OS and hardware and compile with the same default flags. I don't even know where to begin looking.

Since these do appear to be two different problems it would be best to discuss them in separate issues to avoid confusing the discussion.

posguy99 commented 1 year ago

That last one I was doing classwork while it ran, several Chrome windows and some other stuff, load average was hovering around 4.something, but I didn't think to look in Activity Monitor.

posguy99 commented 1 year ago

I just did it again, load was around 1.5 and in Activity Monitor the CPU graph had barely moved off the bottom.

McDutchie commented 1 year ago

The pty regression tests are now running in a loop with my CPU completely maxed out (8 parallel 100% CPU infinite loops). No failures. Still stumped.

posguy99 commented 1 year ago

I'm REALLY reaching here, but Terminal.app or iTerm2? Nah, that couldn't influence the test, could it?

McDutchie commented 1 year ago

I'm REALLY reaching here, but Terminal.app or iTerm2? Nah, that couldn't influence the test, could it?

Unbelievably, that does appear to be it.

Terminal.app: no failures.

iTerm2: the test on line 999 fails consistently.

McDutchie commented 1 year ago

I think I found the real cause of the problem.

My Terminal.app window is 132 columns wide. My iTerm window, the default 80.

When I narrow my Terminal.app window to 80 columns, the test fails equally.

In spite of pty being a pseudoterminal, ksh gets the window size from the real terminal. I don't know how to fix that.

By default, the pty tests are run with TERM=dumb, on which multiline editing is not possible. On macOS, the $TMPDIR directory path is long, causing a temporary file path that won't fit on a single line. This explains the regression test failure with the cut-off path.

The fix: run the test with multiline enabled. Please try this:

diff --git a/src/cmd/ksh93/tests/pty.sh b/src/cmd/ksh93/tests/pty.sh
index 0484944ca..786197ac3 100755
--- a/src/cmd/ksh93/tests/pty.sh
+++ b/src/cmd/ksh93/tests/pty.sh
@@ -996,7 +996,7 @@ w exit
 !

 touch "$tmp/foo bar"
-((SHOPT_VSH || SHOPT_ESH)) && tst $LINENO <<!
+((multiline && (SHOPT_VSH || SHOPT_ESH))) && TERM=vt100 tst $LINENO <<!
 L tab completion with space in string and -o noglob on
 # https://github.com/ksh93/ksh/pull/413
 # Amended to test that completion keeps working after -o noglob
posguy99 commented 1 year ago

Confirmed that with the patch in https://github.com/ksh93/ksh/issues/653#issuecomment-1583708086, I no longer get the pty test failure.

I use Terminal.app, not iTerm2, and my window width is 120 columns on both the MBP and the iMac. So apparently I am right against that length limit, since it would fail, then pass, then fail again? I widened the window to 160 columns and the original pty test no longer fails here.

posguy99 commented 1 year ago

Have you ever seen this one? I have the pty test in a continuous loop, and this went by once.

test pty(C.UTF-8) begins at 2023-06-08+19:07:56
    pty.sh[858]: FAIL: tab completion executes command substitutions: line 866: expected "^:test-1: \$\(echo true\)\r\n$", got ":test-1: $(echo true)\t\r\n"
test pty(C.UTF-8) failed at 2023-06-08+19:08:35 with exit code 1 [ 59 tests 1 error ]
McDutchie commented 1 year ago

So apparently I am right against that length limit, since it would fail, then pass, then fail again?

Yes. The length of the $TMPDIR prefix is constant but the temporary subdirectory name (e.g., ksh93.shtests.85374.15052) varies in length because it contains a process ID and a random number, so that would explain that.

McDutchie commented 1 year ago

Have you ever seen this one? I have the pty test in a continuous loop, and this went by once.

test pty(C.UTF-8) begins at 2023-06-08+19:07:56
  pty.sh[858]: FAIL: tab completion executes command substitutions: line 866: expected "^:test-1: \$\(echo true\)\r\n$", got ":test-1: $(echo true)\t\r\n"
test pty(C.UTF-8) failed at 2023-06-08+19:08:35 with exit code 1 [ 59 tests 1 error ]

Never seen that one.