wryun / es-shell

es: a shell with higher-order functions
http://wryun.github.io/es-shell/
Other
307 stars 25 forks source link

`es -cfoo` is memory-buggy #78

Closed jpco closed 6 months ago

jpco commented 8 months ago

It seems that esopt() parses es -cfoo the same as es -c foo, but without the space, the foo doesn't get Ref()d, so if a GC gets triggered early enough, it gets mangled. This first manifested on my machine as a weird 487: No such file or directory, but with -DGCDEBUG=1, it crashes with a segfault. In my case this was triggered during initenv() -- I assume it requires a sufficiently hairy environment to trigger a GC there.

I assumed this was something I screwed up with #61, but actually the old getopt() version is even worse, running into this with both es -cfoo and es -c foo!

This is definitely the wrong behavior, but I'm not 100% confident what the right behavior should be. We want, for example, either

  1. es -cx 'echo ok' is the same as es -xc 'echo ok', running echo ok with the -x option, or
  2. es -cx 'echo ok' tries to run the command x with $1 set to 'echo ok'.

For what it's worth, option (1) gets my vote.

memreflect commented 8 months ago

There are actually three options, not two.

(1) Modern POSIX sh treats -c as a flag with no argument and the first non-flag argument acts as a command and stops argument parsing, with the next argument being used as the script name/path (i.e. $0) and subsequent arguments being used as the argument list; this is the only form of invocation that allows you to set $0, but it's worth mentioning for completeness:

sh -cx command script-name arg1 arg2 ...
  => trace_on, execute 'command' with $0='script-name' and $@=(arg1 arg2 ...)
sh -c command -x arg1 arg2
  =>           execute 'command' with $0='-x' and $@=(arg1 arg2 ...)

Elvish does the same with the exception that you cannot set its equivalent of $0 this way — all arguments after the command are arguments to that command, including any flags.

This corresponds to your option 1.

(2) The sh on at least FreeBSD and OpenBSD as well as fish, es (bugs aside) and the rc that comes with plan9port all treat -c as a flag with an argument specifying a command to execute, meaning subsequent arguments after the command including flags like -x are accepted, with the first non-flag (or the first argument after --) being treated as the start of the argument list; there is no way to set the script name/path (i.e. $0):

9 rc -cx arg1 arg2 ...
  =>           execute 'x' with $*=(arg1 arg2 ...)
9 rc -c command -x arg1 arg2
  => trace_on, execute 'command' with $*=(arg1 arg2 ...)

This corresponds to your option 2, i think?

(3) Then you have the port of rc originally authored by Byron Rakitzis, which treats the argument to -c as a command and any subsequent arguments after that as script arguments ($*) without even looking for additional flags:

rc -cx arg1 arg2 ...
  =>           execute 'x' with $*=(arg1 arg2 ...) unused
rc -c command -x arg2
  =>           execute 'command' with $*=(-x arg2 ...)

I like options (1) and (2) described here, with a slight preference for (2) because a command is required with the use of -c anyway, meaning es -c by itself has no purpose, so you might as well allow the option parser to handle a missing argument to -c. The behavior of (3) seems wrong to me, but it has precedent in commands like xterm -e cmd --foo arg1 arg2 ..., so i felt it was at least worth acknowledging.

jpco commented 6 months ago

That's a good point -- I guess there are actually a few behaviors going on.

  1. Does an argv containing -ca b look for a as the command, or b?
  2. Does the -c flag let option parsing continue or not?
  3. Does -c set $0 in addition to $1 $2 ... or not?

My option (1) was "b, yes, no". Option 2 was "a" but didn't answer (2) or (3).

I think your options are, in order, "b, yes, yes", "a, yes, no", and "a, no, no". (I might be wrong, but I think it's largely immaterial).

Current es behavior (modulo bugs) is "a, yes, no". What I prefer is still "b, yes, no", and when I filed this issue I believed that both were equally valid given es -ca was broken. However, there's already a valid use of esoptarg() as it is in the shell -- in the $&access primitive! So there's at least one working esoptarg() behavior that would change if we adopted "b, yes, no".

I would rather not impose such a change on everyone (or make es getopt behavior more inconsistent than it currently is). So instead I'll just make the current "a, yes, no" behavior work in main() without crashing.