ksh93 / ksh

ksh 93u+m: KornShell lives! | Latest release: https://github.com/ksh93/ksh/releases
Eclipse Public License 2.0
188 stars 31 forks source link

Alias expansion loop not detected in command substitution #458

Open ormaaj opened 2 years ago

ormaaj commented 2 years ago

Just a silly quirk.

$ ksh -c $'alias @=\'echo test >&2; )$( @ \'\n$( @ )' 2>&1 | head -n3
test
test
test

Arguably, it's not a bug, but only ksh does this. I found this ages ago while looking at similar issues in bash. I noticed the recent alias expansion discussion on bug-bash and thought I'd mention this.

McDutchie commented 2 years ago

Warning: this leaves a ksh process using 100% CPU even after interrupting it.

More readable reproducer:

alias foo='echo test >&2; )$( foo '
$( foo )

That's pretty clever. It's basically like a fork bomb, except with non-forking subshells.

Aliases should not expand recursively, so I think this is a bug. But it's low priority.

I think this happens because command substitutions are stored in the parse tree as unparsed source code, then parsed at runtime in a whole new context. That's something that may change in a distant future version, if I manage to figure out how. But that would be why it cannot detect this recursion.

McDutchie commented 2 years ago

A simpler reproducer:

alias foo='echo $(foo)'
foo

Running that script causes an immediate memory fault. The cause is a stack overflow due to infinite recursion, which should not be happening.

The stack trace shows the recursion (ignore #0, that's just where the stack happened to overflow that time):

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   ksh                             0x000000010e4f8b3b nv_getval + 571 (name.c:2780)
1   ksh                             0x000000010e4e49f8 sh_macexpand + 152 (macro.c:202)
2   ksh                             0x000000010e4b3b62 arg_expand + 338 (args.c:813)
3   ksh                             0x000000010e4b3823 sh_argbuild + 259 (args.c:660)
4   ksh                             0x000000010e519262 sh_exec + 946 (xec.c:976)
5   ksh                             0x000000010e514f5d sh_subshell + 2845 (subshell.c:663)
6   ksh                             0x000000010e4e6486 comsubst + 2678 (macro.c:2188)
7   ksh                             0x000000010e4e6feb varsub + 1371 (macro.c:1180)
8   ksh                             0x000000010e4e39ac copyto + 3308 (macro.c:621)
9   ksh                             0x000000010e4e4c81 sh_macexpand + 801 (macro.c:236)
10  ksh                             0x000000010e4b3b62 arg_expand + 338 (args.c:813)
11  ksh                             0x000000010e4b3823 sh_argbuild + 259 (args.c:660)
12  ksh                             0x000000010e519262 sh_exec + 946 (xec.c:976)
13  ksh                             0x000000010e514f5d sh_subshell + 2845 (subshell.c:663)
14  ksh                             0x000000010e4e6486 comsubst + 2678 (macro.c:2188)
15  ksh                             0x000000010e4e6feb varsub + 1371 (macro.c:1180)
16  ksh                             0x000000010e4e39ac copyto + 3308 (macro.c:621)
17  ksh                             0x000000010e4e4c81 sh_macexpand + 801 (macro.c:236)
18  ksh                             0x000000010e4b3b62 arg_expand + 338 (args.c:813)
19  ksh                             0x000000010e4b3823 sh_argbuild + 259 (args.c:660)
20  ksh                             0x000000010e519262 sh_exec + 946 (xec.c:976)
21  ksh                             0x000000010e514f5d sh_subshell + 2845 (subshell.c:663)
22  ksh                             0x000000010e4e6486 comsubst + 2678 (macro.c:2188)
23  ksh                             0x000000010e4e6feb varsub + 1371 (macro.c:1180)
24  ksh                             0x000000010e4e39ac copyto + 3308 (macro.c:621)
25  ksh                             0x000000010e4e4c81 sh_macexpand + 801 (macro.c:236)
[...many repetitions snipped...]
498 ksh                             0x000000010e4b3b62 arg_expand + 338 (args.c:813)
499 ksh                             0x000000010e4b3823 sh_argbuild + 259 (args.c:660)
500 ksh                             0x000000010e519262 sh_exec + 946 (xec.c:976)
501 ksh                             0x000000010e514f5d sh_subshell + 2845 (subshell.c:663)
502 ksh                             0x000000010e4e6486 comsubst + 2678 (macro.c:2188)
503 ksh                             0x000000010e4e6feb varsub + 1371 (macro.c:1180)
504 ksh                             0x000000010e4e39ac copyto + 3308 (macro.c:621)
505 ksh                             0x000000010e4e4c81 sh_macexpand + 801 (macro.c:236)
506 ksh                             0x000000010e4b3b62 arg_expand + 338 (args.c:813)
507 ksh                             0x000000010e4b3823 sh_argbuild + 259 (args.c:660)
508 ksh                             0x000000010e519262 sh_exec + 946 (xec.c:976)
509 ksh                             0x000000010e514f5d sh_subshell + 2845 (subshell.c:663)
510 ksh                             0x000000010e4e6486 comsubst + 2678 (macro.c:2188)
511 ksh                             0x000000010e4e6feb varsub + 1371 (macro.c:1180)
ormaaj commented 2 years ago

Aliases should not expand recursively, so I think this is a bug.

They should in normal cases. These examples are valid recursive aliases:

But it's low priority.

Oh yeah I was gonna suggest the same.

Frankly I'm quite happy aliases are as loosely defined as they are. Imagine trying to implement runtime alias expansion in a JIT or even ahead-of-time compiled implementation. It would require heavy reflection use and make many optimizations impossible. As a macro mechanism it's pretty crap and should eventually be replaced with something better.

Edit: yep that's a better title. :)

Er... I guess I forgot what I was trying to do with that second example. That might be unfinished. The first makes more sense and doesn't really even use the eval. It just exploits an alias with trailing space to build a word list. Not quite the same as this.

McDutchie commented 2 years ago

You call those normal cases? Wow.

I'm really not smart enough to understand those. But you say at least the first one makes sense, so I've tried to decipher it and reduce it to its essence. I've come up with:

n=0 max=${1:-500}
alias \
  @n='test "$((n += 1))" -ge "$max" &&' \
  @count='printf "$n "; @n alias @n="" @count="printf \$(((n -= 1) + 1))\ ;"; eval @count @count'
eval @count echo

On all shells this counts to 500 and back (though bash needs shopt -s expand_aliases).

My impression is that you found a very clever way to cheat using eval. I still don't understand exactly how this works, but in my view, this is about as far from a normal case of alias expansion as it's possible to get.

A normal case of attempted recursive alias expansion would be:

alias echo='echo '
echo hi

This simply outputs 'hi' as expected on all shells, ksh93 included. So clearly it does loop detection or there would be an infinite recursive alias expansion loop as with the command substitution reproducer earlier.

POSIX confirms it:

To prevent infinite loops in recursive aliasing, if the shell is not currently processing an alias of the same name, the word shall be replaced by the value of the alias; otherwise, it shall not be replaced.

The bug is that, due to the strange way ksh parses command substitutions separately at runtime, it is in fact not processing an alias, so the loop detection cannot work.

ormaaj commented 2 years ago

My impression is that you found a very clever way to cheat using eval. I still don't understand exactly how this works, but in my view, this is about as far from a normal case of alias expansion as it's possible to get.

The eval in that case only functions to force evaluation of an alias that was defined on the same line. The recursion mechanism is entirely due to alias expansion.

A normal case of attempted recursive alias expansion would be:

alias echo='echo '
echo hi

This simply outputs 'hi' as expected on all shells, ksh93 included. So clearly it does loop detection or there would be an infinite recursive alias expansion loop as with the command substitution reproducer earlier.

Direct recursion in that manner is prevented. To understand the example, consider aliases that generate additional alias words. One instance of an alias is never expanded twice but an alias that expands the following word to another alias can be recursively expanded indirectly.

In fact the counter example is just the simplest case of a flat word list traversed from start to finish. An alias expanding multiple words can effectively build a hierarchical tree of word lists that is implicitly traversed through expansion of subsequent alias words and potentially re-interpolation of preceding alias words higher up in the hierarchy. The details of how that works is where most of the implementation-specific quirks exist and the fact that it exists is implicit due to alias expansion occurring as the AST is being built alongside tokenization. Some shells flatten the tree as the expansion progresses and some do it in stages. That becomes apparent when you look at what happens when an alias expands to a quote or escape character that affects other words in their context. Every implementation does that a bit differently. It's all down to how the tokenizer is implemented and the way it interacts with the alias expansion step.

It's an ugly mechanism. I've used it to implement shell functions with custom behavior from scratch using a trampoline that evaluates purely through recursive alias expansion.

POSIX confirms it:

To prevent infinite loops in recursive aliasing, if the shell is not currently processing an alias of the same name, the word shall be replaced by the value of the alias; otherwise, it shall not be replaced.

Yeah POSIX doesn't describe any of that. https://www.austingroupbugs.net/view.php?id=953

The bug is that, due to the strange way ksh parses command substitutions separately at runtime, it is in fact not processing an alias, so the loop detection cannot work.

Right. All of the above is unrelated.

phidebian commented 2 years ago

What is the conclusion, does the recursion considered legit so the example starting with

n=0 max=${1:-500}

I am asking because the example above with high upper limit would core dump either.

 n=0 max=${1:-10000} 
...
 eval @count echo 
....
2686 2687 2688 Segmentation fault (core dumped)

So is this problem simply a stack limit detection ?

ormaaj commented 2 years ago

@phidebian That's totally a side-discussion, nothing to do with this bug. That part of the expression is just plain eval recursion. Shells aren't smart enough to flatten out commands like a tail call, and for some reason it's required to trigger the alias expansion in this case. You can tell that's what's happening because the PS4's are piling up in the xtrace output.

The actual alias expansion part is "flattened" when it expands the long chain of aliases, but the eval's were used to build that chain and you're deep into the eval stack when it occurs.

A no-eval variation on this is possible, but in that case you really do need to generate alias names to dodge the recursive alias expansion fun ruiner. Fun is disallowed by POSIX.

phidebian commented 2 years ago

I lost sight of the discution. I was not talking at trying to solve (or flatten) a tail recursion, I was more on the core dump side. Do we prefere a nice error and recovery allow scripting to continue or avoid loosing work on interactive, or are we satisfied with core dump. In the latter the fix is simple, I.e do nothing, on the former, some strategy can be use to throw an error and terminate the recursion, I.e something along the lines of shell function recursion that decide to stop the recursion after some deep level reached.

ormaaj commented 2 years ago

Nah every shell just crashes on that. The proper "fix" for performing an error recovery really is to do some sort of TCO, whether or not to build in an artificial restriction is a separate issue.

ksh has some theoretical advantages in that regard because it's a bytecode evaluator. In principle either an optiization or error handling / recovery could be done equally at that level.

The pure interpreters can theoretically solve this too, but have to go further out of their way to squash the interpreter's internal call stack.

One bodge is to build an interpreter with an alternative stack scheme such as gcc's -fsplit-stack. That isn't a real solution it just means you can grow the stack until you run out of memory (arguably worse). Another method is to incorporate something like libsigsegv (GPL, but whatever. There are probably alternatives. I don't know of any projects using it.)

EDIT: yeah you could do it the lame way too and build in a counter to make it fail like bash's FUNCNEST does with function calls. Not worth it IMO. You could even try hacking around it with the ksh debug features (.sh.fun / .sh.level) and an eval wrapper function, though calling a function produces a few side-effects.

phidebian commented 2 years ago

Ha this is an interesting discussion :-)

On Mon, Oct 17, 2022 at 10:05 PM Daniel Douglas @.***> wrote:

Nah every shell just crashes on that. The proper "fix" for performing an error recovery really is to do some sort of TCO, whether or not to build in an artificial restriction is a separate issue.

That was my point, :-), and I agry I may have open another discussion thread instead of cockoo nest in this one. My be I should open a discution and cut/paste this discussion there, let me know.

Every shell just crash you said, but sometime they don't for instance the shell function recursion don't and RE long input does, as well as this alias/eval trick and we can find much more.

ksh has some theoretical advantages in that regard because it's a bytecode

evaluator. In principle either an optiization or error handling / recovery could be done equally at that level.

Right, but the amount of comment in the commit log will be consequent :)

The pure interpreters can theoretically solve this too, but have to go further out of their way to squash the interpreter's internal call stack.

One bodge is to build an interpreter with an alternative stack scheme such as gcc's -fsplit-stack. That isn't a real solution it just means you can grow the stack until you run out of memory. Another method is to incorporate something like libsigsegv https://www.gnu.org/software/libsigsegv/ (GPL, but whatever. There are probably alternatives. I don't know of any projects using it.)

The shells, most of them, already have a 'split stack' the C call stack, and the stk kind (possibly malloc'ed and splitted again)

May be you talk about gcc split-stack (may be clang segmented stack), those are perf hog, and don't catch runaway recursion, the crash occurs way later at VAS exaustion (compared to stack exhaustion). If I could suggest, a shell should never be compiled with split stack, this this is more for multi-threaded need.

I let the modern architecture of dual stack (security) on the side, it is not relevant here.

So stack recursion exhaution (split stack VAS exhaustion, or stack exhaution both come to a limit, then what is the desired behavior for the shell? Some code path nicely handle it (shell function call recursion) and some core dump, or we try to catch them all and handle them all canonically. This was my simple question.

If the answer is let it crash, then there is no point to fix alias/eval tricks, nor long input string on RE, and more generally, close up front any issue with ledgitimate recursion. If the answer is, error recovery (ala function call) is desirable then a stack deep check can be done, not hard to do, many strategy, the easiest one being the fixed deep, like the one used in sh_funscope()

path.h 45 #define MAXDEPTH (sizeof(char *)==2?64:1024) xec.c sh_funscope 3097 if(sh.fn_depth >= MAXDEPTH)

Other solutions exist based on some assumtion about the stack knowledge, for instance src/cmd/ksh93/include/ulimit.h

include "FEATURE/rlimits"

if defined(_sys_resource) && defined(_lib_getrlimit)

ulimit.h 33 #define RLIMIT_STACK (LIM_STACK-1)

Other 'feature' include /proc probing at init() time, this is costless on OS having it, and pretty accurate regarding stack size, os not having it, can use rlimit with heuristic, and the one with nothing has no resort but core dump, all this is doable.

In fact the libsigsegv(), or worst OS assisted VA addr chk are the last to considereh

Message ID: @.***>