ksh93 / ksh

ksh 93u+m: KornShell lives! | Latest release: https://github.com/ksh93/ksh/releases
Eclipse Public License 2.0
189 stars 31 forks source link

`${array[index]=value}` doesn't work, but `${array[index]:=value}` does #383

Closed oguz-ismail closed 2 years ago

oguz-ismail commented 2 years ago

Compare:

$ unset foo
$ echo ${foo[42]=bar}

$ typeset -p foo
typeset -a foo=([0]=)
$ unset foo
$ echo ${foo[42]:=bar}
bar
$ typeset -p foo
typeset -a foo=([42]=bar)

This doesn't make any sense.

McDutchie commented 2 years ago

Indeed, it's a bug. The second reproducer is correct, the first one should be like the second.

@hyenias, you've fixed similar bugs before – do you feel like taking this one on?

hyenias commented 2 years ago

I have never worked on variable expansions before but sure I can take this on.

hyenias commented 2 years ago

Well, I have brushed up on my ${varname} expansions which by the KornShell book are called parameter expansions. I have hand tested using all the various parameter expansions using multidimensional arrays and compared those behavior to scalars and found other issues. I know what is wrong for the most part as I see it. Now to research as to why things are the way they are.

There are some multidimensional array assignment issues when using the ${parameter=word}/${parameter:=word} that are being addressed by other ongoing issues.

McDutchie commented 2 years ago

Related, probably the same bug:

$ unset foo
$ : ${foo[42]?error}
$ : ${foo[42]:?error}
ksh: foo: error

The first variant fails to throw the expected error.

JohnoKing commented 2 years ago

This bug was introduced in ksh93t 2008-10-09. Both of the reproducers work in 93t 2008-10-01.

hyenias commented 2 years ago

Thanks. I look forward to utilizing either ksh93-history repo or ast-open-archive repo to gain insights.

McDutchie commented 2 years ago

I couldn't find anything obviously relevant in the diff between 2008-10-01 and 2008-10-09.

I'm sure the problem is in varsub() in macro.c somewhere. That function is a bit of a nightmare, unfortunately.

So far I've come up with:

diff --git a/src/cmd/ksh93/sh/macro.c b/src/cmd/ksh93/sh/macro.c
index ffc2a9a68..7666257f0 100644
--- a/src/cmd/ksh93/sh/macro.c
+++ b/src/cmd/ksh93/sh/macro.c
@@ -1446,8 +1446,10 @@ retry1:
                v = nv_getvtree(np,(Namfun_t*)0);
            else
            {
-               if(type && fcpeek(0)=='+')
+               unsigned char   p;
+               if(type && ((p = fcpeek(0))=='+' || p=='=' || p=='?'))
                {
+                   /* Set/unset check for ${foo+v}, ${foo=v}, ${foo?v} */
                    if(ap)
                        v = nv_arrayisset(np,ap)?(char*)"x":0;
                    else

It looks like the check for = or ? was missing there. However, it's broken with this patch:

$ arch/*/bin/ksh -c 'foo=origval; echo ${foo=newval}'
x

The dummy literal "x" from the code is used for the value. So this is clearly not correct. But I'm struggling to come up with anything else that doesn't cause massive regressions.

JohnoKing commented 2 years ago

I found the change in 2008-10-09 that causes this bug. It's this change in array.c:

--- b/src/cmd/ksh93/sh/array.c
+++ a/src/cmd/ksh93/sh/array.c
@@ -1216,7 +1216,7 @@ Namval_t *nv_putsub(Namval_t *np,register char *sp,register long mode)
        ap->cur = size;
        if((mode&ARRAY_SCAN) && (ap->cur--,!nv_nextsub(np)))
            np = 0;
-       if(mode&ARRAY_FILL)
+       if(mode&(ARRAY_FILL|ARRAY_ADD))
        {
            if(!(mode&ARRAY_ADD))
            {

I've no clue why this patch causes the bug. Reverting it introduces a few regression test failures:

test arrays2 begins at 2022-01-29+06:58:20
    arrays2.sh[157]: FAIL: ${#ar[*]} is '1', should be 9
test arrays2 failed at 2022-01-29+06:58:20 with exit code 1 [ 56 tests 1 error ]
--- cut ---
test pointtype begins at 2022-01-29+07:00:09
/home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/tests/pointtype.sh[118]: eval: line 3: s.s.b[1]: no parent
    pointtype.sh[119]: FAIL: expansion of type containing indexed array of types is incorrect
/home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/tests/pointtype.sh: line 124: 1.z: is not an element of r
test pointtype failed at 2022-01-29+07:00:10 with exit code 1 [ 34 tests 1 error ]
--- cut ---
test treemove begins at 2022-01-29+07:02:00
    treemove.sh[151]: FAIL: typeset -m "c.board[1][i]=el" core dumps
test treemove failed at 2022-01-29+07:02:00 with exit code 1 [ 9 tests 1 error ]
McDutchie commented 2 years ago

With the failing reproducer, this code block is executed with mode==ARRAY_ADD, so just the one bit set: https://github.com/ksh93/ksh/blob/e77e61d3924316e13984e3e374500b3c1425816a/src/cmd/ksh93/sh/array.c#L1210-L1255

So, lines 1212-1232 are irrelevant.

From the rest, I've experimentally determined that lines 1250-1251 (ap->val[size].cp = Empty) cause the bug. If we delete those, the bug disappears. Of course, we get a lot of array-related regressions instead.

McDutchie commented 2 years ago

If the value of the variable/element is set to the empty string as a side effect of this array processing, then that does make some sort of sense of why it makes exactly these expansions fail.

So, with still no deeper understanding of what's actually going on, the only thing I can think of is find a way to special-case out that assignment while processing ${var=val} or ${var?err}.

Unfortunately the call stack is pretty deep. If we replace line 1251 with an abort(), we get:

0   libsystem_kernel.dylib              0x00007fff6014f2c2 __pthread_kill + 10
1   libsystem_pthread.dylib             0x00007fff6020abf1 pthread_kill + 284
2   libsystem_c.dylib                   0x00007fff600b96a6 abort + 127
3   ksh                                 0x000000010be329f3 nv_putsub + 1411 (array.c:1255)
4   ksh                                 0x000000010be3461c nv_endsubscript + 940 (array.c:1547)
5   ksh                                 0x000000010be704fc nv_create + 4732 (name.c:1066)
6   ksh                                 0x000000010be6dacf nv_open + 1951 (name.c:1425)
7   ksh                                 0x000000010be64706 varsub + 4934 (macro.c:1322)
8   ksh                                 0x000000010be602dc copyto + 3308 (macro.c:625)
9   ksh                                 0x000000010be615b1 sh_macexpand + 801 (macro.c:240)
10  ksh                                 0x000000010be30d12 arg_expand + 338 (args.c:813)
11  ksh                                 0x000000010be309d3 sh_argbuild + 259 (args.c:660)
12  ksh                                 0x000000010be95372 sh_exec + 946 (xec.c:976)
13  ksh                                 0x000000010be9998e sh_exec + 18894 (xec.c:2077)
14  ksh                                 0x000000010be2066f exfile + 3247 (main.c:607)
15  ksh                                 0x000000010be1f66f sh_main + 3551 (main.c:368)
16  ksh                                 0x000000010be04c46 main + 38 (pmain.c:45)
17  libdyld.dylib                       0x00007fff600143d5 start + 1

The special-casing needs to be done on line 1250 of array.c, but flagged in varsub() which processes these expansions. So, varsub() calls nv_open() calls nv(create() calls nv_endsubscript() calls nv_putsub(). Passing a flag as an argument does not seem very feasible.

So the only thing I can think of is introduce yet another global flag for a very special case. It's ugly, but it does seem to work. An elegant fix would probably involve a fairly comprehensive redesign, which is simply not going to happen.

Please try this patch. ```diff diff --git a/src/cmd/ksh93/include/shell.h b/src/cmd/ksh93/include/shell.h index ca7cf8265..3de9a7209 100644 --- a/src/cmd/ksh93/include/shell.h +++ b/src/cmd/ksh93/include/shell.h @@ -354,6 +354,7 @@ struct Shell_s void *defpathlist; void *cdpathlist; char **argaddr; + char cond_expan; /* set while processing ${var=val}, ${var:=val}, ${var?err}, ${var:?err} */ void *optlist; struct sh_scoped global; struct checkpt checkbase; diff --git a/src/cmd/ksh93/sh/array.c b/src/cmd/ksh93/sh/array.c index a848be72c..4281c1eba 100644 --- a/src/cmd/ksh93/sh/array.c +++ b/src/cmd/ksh93/sh/array.c @@ -1247,7 +1247,7 @@ Namval_t *nv_putsub(Namval_t *np,register char *sp,register long mode) nv_arraychild(np,mp,0); nv_setvtree(mp); } - else + else if(!sh.cond_expan) ap->val[size].cp = Empty; if(!sp && !array_covered(np,ap)) ap->header.nelem++; diff --git a/src/cmd/ksh93/sh/macro.c b/src/cmd/ksh93/sh/macro.c index 3f04e415b..69c5bad82 100644 --- a/src/cmd/ksh93/sh/macro.c +++ b/src/cmd/ksh93/sh/macro.c @@ -1301,10 +1301,13 @@ retry1: flag |= NV_NOASSIGN|NV_VARNAME|NV_NOADD; if(c=='=' || c=='?' || (c==':' && ((d=fcpeek(0))=='=' || d=='?'))) { + sh.cond_expan = 1; if(c=='=' || (c==':' && d=='=')) flag |= NV_ASSIGN; flag &= ~NV_NOADD; } + else + sh.cond_expan = 0; #if SHOPT_FILESCAN if(sh.cur_line && *id=='R' && strcmp(id,"REPLY")==0) { ```