Closed McDutchie closed 2 years ago
Perhaps setjmp
can be used to fix the error handling segfault. Solaris patch 330-30841535.patch claims to fix a coredump caused by improper error handling. Repurposing the idea from that patch for .sh.tilde
seems to fix the memory faults from inserting trap -Q
in the discipline function (although it does nothing for SIGINT
). Patch for testing (causes no regression test fails on my end):
--- a/src/cmd/ksh93/sh/macro.c
+++ b/src/cmd/ksh93/sh/macro.c
@@ -2638,6 +2638,7 @@ static void tilde_expand2(Shell_t *shp, register int offset)
{
stakfreeze(1); /* terminate current stack object to avoid data corruption */
block++;
+ setjmp(*shp->jmplist);
nv_putval(SH_TILDENOD, &stakp[offset], 0);
cp = nv_getval(SH_TILDENOD);
block--;
# Results after applying the patch above
$ arch/*/bin/ksh
$ echo ~
arch/linux.i386-64/bin/ksh[44]: trap: -Q: unknown option
Usage: trap [-p] [action condition ...]
Help: trap [ --help | --man ] 2>&1
/home/johno
ksh $ echo ~ # Press tab
arch/linux.i386-64/bin/ksh[44]: trap: -Q: unknown option
Usage: trap [-p] [action condition ...]
Help: trap [ --help | --man ] 2>&1
/home/johno
/home/johno
$ echo ~ksh
arch/linux.i386-64/bin/ksh[44]: trap: -Q: unknown option
Usage: trap [-p] [action condition ...]
Help: trap [ --help | --man ] 2>&1
~ksh
Edit: See also https://github.com/ksh93/ksh/issues/347#issuecomment-979456879, which demonstrates this bug likely applies to all discipline functions.
As expected, the discipline function for PS1
is also affected:
$ PS1.get() { trap -Q; }
/usr/bin/ksh: trap: -Q: unknown option
Usage: trap [-p] [action condition ...]
Help: trap [ --help | --man ] 2>&1
/usr/bin/ksh: trap: -Q: unknown option
Usage: trap [-p] [action condition ...]
Help: trap [ --help | --man ] 2>&1
# Freezes
Also as expected, setjmp
avoids that lockup (although setjmp
before PS1.get
is handled causes a segfault when the shell receives SIGINT
).
After yet more experimentation I've come up with this patch (edit: patch updated to avoid using shp
):
--- a/src/cmd/ksh93/sh/nvdisc.c
+++ b/src/cmd/ksh93/sh/nvdisc.c
@@ -283,11 +283,16 @@ static void assign(Namval_t *np,const char* val,int flags,Namfun_t *handle)
nq = vp->disc[type=UNASSIGN];
if(nq && !isblocked(bp,type))
{
- int bflag=0, savexit=sh.savexit;
+ int bflag=0, savexit=sh.savexit, jmpval=0;
+ struct checkpt buff;
block(bp,type);
if (type==APPEND && (bflag= !isblocked(bp,LOOKUPS)))
block(bp,LOOKUPS);
- sh_fun(nq,np,(char**)0);
+ sh_pushcontext(&sh,&buff,1);
+ jmpval = sigsetjmp(buff.buff,0);
+ if(!jmpval)
+ sh_fun(nq,np,(char**)0);
+ sh_popcontext(&sh,&buff);
unblock(bp,type);
if(bflag)
unblock(bp,LOOKUPS);
@@ -376,7 +381,8 @@ static char* lookup(Namval_t *np, int type, Sfdouble_t *dp,Namfun_t *handle)
union Value *up = np->nvalue.up;
if(nq && !isblocked(bp,type))
{
- int savexit = sh.savexit;
+ int savexit = sh.savexit, jmpval = 0;
+ struct checkpt buff;
node = *SH_VALNOD;
if(!nv_isnull(SH_VALNOD))
{
@@ -389,7 +395,11 @@ static char* lookup(Namval_t *np, int type, Sfdouble_t *dp,Namfun_t *handle)
nv_setsize(SH_VALNOD,10);
}
block(bp,type);
- sh_fun(nq,np,(char**)0);
+ sh_pushcontext(&sh,&buff,1);
+ jmpval = sigsetjmp(buff.buff,0);
+ if(!jmpval)
+ sh_fun(nq,np,(char**)0);
+ sh_popcontext(&sh,&buff);
unblock(bp,type);
if(!vp->disc[type])
chktfree(np,vp);
It makes ksh do a longjmp back to the correct context after a discipline function exits with an error. The .sh.tilde
reproducer now works correctly. Despite that the PS2.get
discipline function can still crash, although it now crashes in the dcl_dehacktivate
function added in commit 5d353210:
$ test \
/home/johno/GitRepos/KornShell/ksh/arch/linux.i386-64/bin/ksh[138]: trap: -Q: unknown option
Usage: trap [-p] [action condition ...]
Help: trap [ --help | --man ] 2>&1
> true
Program received signal SIGABRT, Aborted.
0x00007ffff7cabd22 in raise () from /usr/lib/libc.so.6
(gdb) bt
#0 0x00007ffff7cabd22 in raise () from /usr/lib/libc.so.6
#1 0x00007ffff7c95862 in abort () from /usr/lib/libc.so.6
#2 0x00005555555b20e3 in dcl_dehacktivate () at /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/parse.c:278
#3 0x00005555555b2ebb in sh_cmd (lexp=0x5555556fdba0, sym=10, flag=132) at /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/parse.c:591
#4 0x00005555555b2950 in sh_parse (shp=0x5555556f5580 <sh>, iop=0x5555556f3a80 <_Sfstdin>, flag=0) at /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/parse.c:440
#5 0x00005555555684f0 in exfile (shp=0x2, iop=0x7fffffffdd40, fno=0) at /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/main.c:571
#6 0x0000555555567aee in sh_main (ac=1, av=0x7fffffffe468, userinit=0x0) at /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/main.c:361
#7 0x0000555555566e1e in main (argc=1, argv=0x7fffffffe468) at /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/pmain.c:46
(gdb) frame 2
#2 0x00005555555b20e3 in dcl_dehacktivate () at /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/parse.c:278
278 abort();
(gdb) frame 3
#3 0x00005555555b2ebb in sh_cmd (lexp=0x5555556fdba0, sym=10, flag=132) at /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/parse.c:591
591 dcl_dehacktivate();
(gdb) frame 4
#4 0x00005555555b2950 in sh_parse (shp=0x5555556f5580 <sh>, iop=0x5555556f3a80 <_Sfstdin>, flag=0) at /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/parse.c:440
440 t = sh_cmd(lexp,(flag&SH_EOF)?EOFSYM:'\n',SH_SEMI|SH_EMPTY|(flag&SH_NL))
Does this fix the crash in dcl_dehacktivate()
?
--- a/src/cmd/ksh93/sh/parse.c
+++ b/src/cmd/ksh93/sh/parse.c
@@ -284,6 +284,7 @@ static void dcl_dehacktivate(void)
}
static noreturn void dcl_exit(int e)
{
+ dcl_recursion = 1;
dcl_dehacktivate();
(*error_info.exit)(e);
UNREACHABLE();
BTW, side note: let's not bother with introducing more pointless shp
pointers as I'm going to remove them all at some point anyway. No matter what overly clever thing AT&T might have been planning, we're never going to have multiple shell states at the same time. Where something demands to be passed a pointer to the shell state, you can use &sh
instead. (Until recently that threw a syntax error in the sh_pushcontext
and sh_popcontext
macros but I fixed that in 802e8db75206d79ce82a376a9edd6274a556cb23.)
Does this fix the crash in
dcl_dehacktivate()
?
Nope, it still crashes.
Program received signal SIGABRT, Aborted.
0x00007ffff7cabd22 in raise () from /usr/lib/libc.so.6
(gdb) p dcl_recursion
$1 = 0
(gdb) p dcl_tree
$2 = (Dt_t *) 0x55555571edc0
(gdb) bt
#0 0x00007ffff7cabd22 in raise () from /usr/lib/libc.so.6
#1 0x00007ffff7c95862 in abort () from /usr/lib/libc.so.6
#2 0x00005555555b20e3 in dcl_dehacktivate () at /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/parse.c:278
#3 0x00005555555b2ec5 in sh_cmd (lexp=0x5555556fdb40, sym=10, flag=132) at /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/parse.c:592
#4 0x00005555555b295a in sh_parse (shp=0x5555556f5580 <sh>, iop=0x5555556f3a80 <_Sfstdin>, flag=0) at /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/parse.c:441
#5 0x00005555555684f0 in exfile (shp=0x2, iop=0x7fffffffdda0, fno=0) at /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/main.c:571
#6 0x0000555555567aee in sh_main (ac=1, av=0x7fffffffe4c8, userinit=0x0) at /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/main.c:361
#7 0x0000555555566e1e in main (argc=1, argv=0x7fffffffe4c8) at /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/pmain.c:46
BTW, side note: let's not bother with introducing more pointless shp pointers as I'm going to remove them all at some point anyway.
I've updated the patch to avoid using shp
pointers.
I think I've made sense of why that crash occurs. The PS2 discipline is run while trying to parse a command. So it's another context where arbitrary code might be executed at parse time, which messes up that hack. Aaaargh. Apparently at AT&T they'd never heard about the idea of having a clean separation between distinct subsystems! Anyway, the backtrace confirms that hypothesis, as it includes sh_parse()
. It'll need a fix similar to this one.
edit: But that'll have to be for later, I simply don't have time for that right now. I've reverted that commit until it can be fixed.
Defining a .sh.tilde.get or .sh.tilde.set discipline function to extend tilde expansion works well as long as the discipline function doesn't get interrupted (e.g. with Crtl+C) or produce an error message. Either of those will cause the shell to become unstable and crash.
To reproduce: To trigger an interrupt, adding 'sleep 1' to the discipline function makes it easy to press Ctrl+C. To trigger an error, give an invalid option to a special built-in, e.g.
trap -Q
.I have experimented with two different techniques to fix this, with no success at all so far.
sh_pushcontext
is getting me nowhere and running the command viash_trap
doesn't seem to help either.A workaround for the interruption crash is to make it uninterruptible with
sigblock(SIGINT)
intilde_expand2()
before running the discipline andsigrelease(SIGINT)
after. But that does nothing to fix the crash on error.This feature should probably be removed from the 1.0 branch as it is not ready for prime time. It can return to a release branch if/when we manage to fix it on the master branch.