ksh93 / ksh

ksh 93u+m: KornShell lives! | Latest release: https://github.com/ksh93/ksh/releases
Eclipse Public License 2.0
189 stars 31 forks source link

Some bugs and crashes when using the DEBUG trap #155

Closed ko1nksm closed 3 years ago

ko1nksm commented 3 years ago

Tested with ksh Version AJMv 93u+m/1.0.0-alpha+ccd98fe7 2021-01-08 on Ubuntu 20.04

1. Redirecting disables the DEBUG trap

edit: fixed in 2a835a2d8a7979d37820705087ad43bcdadc5201, e664b78f980fa31bd5e4c5755ce025dd5dae5618

trap 'echo LINENO: $LINENO >&1' DEBUG # 1 
echo foo                              # 2
var=$(echo)                           # 3
echo bar                              # 4
echo baz                              # 5

Expected (This is what happens when you remove >&1).

LINENO: 2
foo
LINENO: 3
LINENO: 4
bar
LINENO: 5
baz

Actual

LINENO: 2
foo
bar
baz

2. Crashes when re-trapping inside a subshell

edit: fixed in 2a835a2d8a7979d37820705087ad43bcdadc5201

set -e
trap ':' DEBUG
( trap ':' DEBUG )
echo ok
ksh.sh[22]: ^P?`?^X?%?d: not found [No such file or directory]

3. The split word will not work

edit: fixed in 70368c57d67c6d3f327a52de39697e8c6647ee7a

v=''
trap ': $v' DEBUG
A="a b c"
set -- $A
printf '%s\n' "$@"

Expected

a
b
c

Actual

a b c

4. Side effects to exit status

edit: fixed in d00b4b39f66e2912797195f2aac9913218826350

trap ':' DEBUG

(exit 123)
echo $? # => 123 (Correct)

r=$(exit 123)
echo $? # => Same as above, expected 123, but actually 0.
McDutchie commented 3 years ago
  1. Confirmed also on ksh 93u+ (2012-08-01), and ksh2020 (which was based on 93v- beta; both are abandoned).
  2. Confirmed also on ksh 93u+, but not ksh2020. There may be a fix to backport. edit: in fact, ksh2020 has the bug, the crash is just intermittent. If it doesn't crash, it leaks. See: https://github.com/ksh93/ksh/issues/155#issuecomment-766216838
  3. Confirmed also on ksh 93u+ and ksh2020.
  4. Confirmed also on ksh 93u+ and ksh2020.

So, at least this 93u+m fork didn't introduce any of these bugs ;-). Downside of that is that these longstanding bugs will probably be difficult to track down.

ko1nksm commented 3 years ago

Thank you, I should have reported it. These are not bugs introduced in ksh93u+m, but bugs that I suffered from ksh93u+ and ksh2020.

McDutchie commented 3 years ago

No problem. Note that 93u+m is currently the only actively developed ksh branch (that I know of). I rebooted development based on 93u+ in May 2020 after ksh2020 was abandoned and the ksh-community project went nowhere. A first beta is due out soon. I hope you'll test it with shellspec.

McDutchie commented 3 years ago

Bug 4 (side effects to exit status) is an easy fix:

diff --git a/src/cmd/ksh93/sh/fault.c b/src/cmd/ksh93/sh/fault.c
index 440e761..72b3eea 100644
--- a/src/cmd/ksh93/sh/fault.c
+++ b/src/cmd/ksh93/sh/fault.c
@@ -489,7 +489,7 @@ int sh_trap(const char *trap, int mode)
    sh_popcontext(shp,&buff);
    shp->intrap--;
    sfsync(shp->outpool);
-   if(!shp->indebug && jmpval!=SH_JMPEXIT && jmpval!=SH_JMPFUN)
+   if(jmpval!=SH_JMPEXIT && jmpval!=SH_JMPFUN)
        shp->exitval=savxit;
    stakset(savptr,staktop);
    fcrestore(&savefc);
ko1nksm commented 3 years ago

It's still experimental, but I've started testing 93u+m (https://github.com/shellspec/shellspec/commit/aed1c899c23e4a9f89115b85e7a3bb15b5a46662). It seems to be working fine.

I'm glad to see that the following code now works correctly with 93u+m. It was not working with ksh2020 on Ubuntu 20.04 either.

# I'm working around this issue with using `alias`.
# shellspec_redefinable() { eval "alias $1='shellspec_redefinable_ $1'"; }
# shellspec_redefinable_() { "$@"; }
#
# shellspec_redefinable f

f() { echo foo; }
(
  f() { echo bar; }
  f # The output is "foo" for ksh93u+ and ksh2020.
)

Bug 4 (side effects to exit status) is an easy fix:

That's impressive! I could not find a workaround on the shell script side of it.

Thank you for rebooting the development of ksh!

McDutchie commented 3 years ago

Cool, glad to see that. :) The fix for redefining a running function was part of 047cb3303c7ea80a9cfde74c517b0496093abb65.

McDutchie commented 3 years ago

Re bug 2 (Crashes when re-trapping inside a subshell), it looks like the problem is that the trap action is not correctly restored after exiting a virtual subshell. It is persistently corrupted.

set -x
trap ':' DEBUG
( trap ':' DEBUG )
echo one
echo two
echo three

Output on my system (macOS 10.14.6):

$ arch/*/bin/ksh test.sh
+ trap : DEBUG
+ trap : DEBUG
+ :
+ echo one
+ $'*\x1d\x06\x01'
test.sh[4]: *: not found [No such file or directory]
one
+ echo two
+ $'\xa8o\r\x06\x01'
: not found [No such file or directory]
two
+ echo three
+ $'\xa8o\r\x06\x01'
: not found [No such file or directory]
three
McDutchie commented 3 years ago

The crash caused by bug 2 is intermittent for me, but when it doesn't crash, it leaks:

for sig in ERR KEYBD DEBUG
do  trap ': main' $sig
    ( trap ": LEAK : $sig" $sig )
    trap
    trap - "$sig"
done

Output (occasionally):

trap -- ': main' ERR
trap -- ': main' KEYBD
trap -- ': LEAK' DEBUG

The problem seems to be an off-by-one error when resetting/restoring the traps. This fix works for me:

diff --git a/src/cmd/ksh93/sh/fault.c b/src/cmd/ksh93/sh/fault.c
index 72b3eea..b518d97 100644
--- a/src/cmd/ksh93/sh/fault.c
+++ b/src/cmd/ksh93/sh/fault.c
@@ -344,7 +344,7 @@ void        sh_sigreset(register int mode)
                        sh.sigflag[sig] = flag;
                }
        }
-       for(sig=SH_DEBUGTRAP-1;sig>=0;sig--)
+       for(sig=SH_DEBUGTRAP;sig>=0;sig--)
        {
                if(trap=sh.st.trap[sig])
                {

Output after fix:

trap -- ': main' ERR
trap -- ': main' KEYBD
trap -- ': main' DEBUG

...and your own test case passes as well.

McDutchie commented 3 years ago

Well, look at that: the same fix fixed bug 1 as well.

ko1nksm commented 3 years ago

Thanks for the fix! But another problem arose. It seems that the DEBUG trap is not being called in the subshell.

ShellSpec uses the DEBUG trap to measure coverage, and this problem has caused ShellSpec's self coverage to drop by about 10%.

echo ${.sh.version}

trap 'echo "[LINENO] $LINENO"' DEBUG

echo foo:5
( echo bar:6 )
echo baz:7
Version AJMv 93u+m/1.0.0-alpha+2a835a2d 2021-01-23
[LINENO] 5
foo:5
bar:6
[LINENO] 7
baz:7
Version AJM 93u+ 2012-08-01
[LINENO] 5
foo:5
[LINENO] 6
bar:6
[LINENO] 7
baz:7
Version A 2020.0.0
[LINENO] 5
foo:5
[LINENO] 6
bar:6
[LINENO] 7
baz:7
ko1nksm commented 3 years ago

Interestingly, I had the same problem with ksh93s+. Of course, the cause may not be the same.

Version M 93s+ 2008-01-31
[LINENO] 5
foo:5
bar:6
[LINENO] 7
baz:7

Version JM 93u+ 2012-02-29
[LINENO] 5
foo:5
[LINENO] 6
bar:6
[LINENO] 7
baz:7
McDutchie commented 3 years ago

Thanks for the fix! But another problem arose. It seems that the DEBUG trap is not being called in the subshell.

I believe that behaviour is correct. By definition, a subshell is initialised as a complete copy of the parent environment, except for traps. All traps are reset upon entering a subshell.

It is similar for ksh-style functions (defined using the function keyword). The KornShell book (M.I. Bolsky & D.G. Korn, 1995) documents that DEBUG traps are not inherited by functions. This is also the behaviour of 93s+ (1993). But on 93u+ (and ksh2020), they are inherited by functions -- presumably due to that same off-by-one error. On 93u+m, they are now again not inherited by functions, as documented.

Note that bash does the same thing (after changing ${.sh.version} to $BASH_VERSION):

$ bash test.sh
5.1.0(30)-rc2
[LINENO] 5
foo:5
bar:6
[LINENO] 7
baz:7

Hmmm... but bash has a -T option that causes the DEBUG trap to be inherited by functions and subshells. I could add a similar option to ksh, once I figure out how to do it properly. Perhaps this was the intention behind that off-by-one error – but if it was, it's clearly not the right way to do it. And the change was not documented, either.

McDutchie commented 3 years ago

Meanwhile, I think I found a fix for bug 3 (field splitting breakage). It would not be the first time that removing some code fixes something in ksh...

diff --git a/src/cmd/ksh93/sh/fault.c b/src/cmd/ksh93/sh/fault.c
index 1873b63..91378ed 100644
--- a/src/cmd/ksh93/sh/fault.c
+++ b/src/cmd/ksh93/sh/fault.c
@@ -451,11 +451,9 @@ int sh_trap(const char *trap, int mode)
    int was_verbose = sh_isstate(SH_VERBOSE);
    int staktop = staktell();
    char    *savptr = stakfreeze(0);
-   char    ifstable[256];
    struct  checkpt buff;
    Fcin_t  savefc;
    fcsave(&savefc);
-   memcpy(ifstable,shp->ifstable,sizeof(ifstable));
    sh_offstate(SH_HISTORY);
    sh_offstate(SH_VERBOSE);
    shp->intrap++;
@@ -493,7 +491,6 @@ int sh_trap(const char *trap, int mode)
        shp->exitval=savxit;
    stakset(savptr,staktop);
    fcrestore(&savefc);
-   memcpy(shp->ifstable,ifstable,sizeof(ifstable));
    if(was_history)
        sh_onstate(SH_HISTORY);
    if(was_verbose)

Analysis: shp->ifstable (a.k.a. sh.ifstable) is an internal state table for word splitting. As an expansion is split into fields, this table is modified. If that happens within a DEBUG trap, any modifications in ifstable are undone by the restoring memcpy, leaving an inconsistent state.

I've tried hard to break ksh after applying this diff, but as far as I can tell, everything works correctly. All the regression tests pass, so do the modernish regression tests (which do a lot of funky stuff with IFS and traps), and my own experiments with various combinations of IFS and splitting inside and outside of DEBUG traps couldn't break it either.

ko1nksm commented 3 years ago

I believe that behaviour is correct. By definition, a subshell is initialised as a complete copy of the parent environment, except for traps. All traps are reset upon entering a subshell.

I respect your decision. I forgot about set -T (set -o functrace) in bash.

I've tried hard to break ksh after applying this diff, but as far as I can tell, everything works correctly.

I also applied the same diff and ran ShellSpec without any problems.

(which do a lot of funky stuff with IFS and traps)

I'll look at the code later :)

McDutchie commented 3 years ago

The very handy multishell repo allows us to use 'git blame' to trace the origin of these bugs.

The off-by-one error causing various bugs was introduced in ksh 93t 2008-07-25:
https://github.com/multishell/ksh93/commit/8e947ccf8ce2bf1fda57f04e59f4a9a4782cab99 (fault.c, line 321)

The incorrect check causing the exit status bug was introduced in ksh 93t 2008-11-04:
https://github.com/multishell/ksh93/commit/b1ade26843968b5606753eba1e0afa099daab528 (fault.c, line 459)

The ifstable save/restore causing the field splitting bug was introduced in ksh 93t+ 2009-11-30:
https://github.com/multishell/ksh93/commit/53d9f0095adece1dfe82a9f2b878fc6fc811a8bc (fault.c, lines 440, 444, 482)

So all the bugs reported in this issue were fixed by simply reverting these specific changes. I think that they are some experiments that the developers simply forgot to remove. I've suspected such a thing multiple times before. ksh93 was developed by researchers who were genius innovators, but incredibly sloppy maintainers.

avih commented 3 years ago

Did you consider pinging @dgkorn ? He doesn't seem active on github, but perhaps he'll get a notification and might have some insights...

McDutchie commented 3 years ago

I've heard from several others that Dr. Korn is inactive and doesn't reply to emails, but since you just did ping him, I'd more than welcome a comment from him if he's so inclined...

McDutchie commented 3 years ago

Re bug 3 (field splitting breakage) again: a possible workaround for pre-93u+m ksh versions is to execute a dummy split before running any trap. This initialises sh.ifstable before trapping, so that the faulty restore code doesn't render it inconsistent. Perhaps @ko1nksm would find this useful.

v=''
: $dummy
trap ': $v' DEBUG
A="a b c"
set -- $A
printf '%s\n' "$@"

(correct output on ksh 93u+)

ko1nksm commented 3 years ago

Thank you. However, unfortunately it did not work in my actual use case. It might be because I use a lot of subshells or enable/disable the DEBUG trap many times, but I'm not sure why.

I am using the following workaround. This avoids bug 1 and bug 3. I am guessing that the internal state was reset by reassigning the IFS.

trap '(echo "kcov@${.sh.file}@${LINENO}@" >/dev/fd/$KCOV_BASH_XTRACEFD); IFS=$IFS' DEBUG