docopt / docopts

Shell interpreter for docopt, the command-line interface description language.
Other
494 stars 53 forks source link

Do not try to mangle double dash #52

Closed DanaLacoste closed 2 years ago

DanaLacoste commented 2 years ago

TL;DR

This is a really minor fix to get around an issue we found when switching from python to go docopts in the handling of double-dash. It just removes the double-dash from the publish-to-bash list.

Description

When you have a double dash in your options (i.e. I have a terraform wrapper that looks like terraform --platform=<platform> <command> [--] [<terraform_options>...]) then the [--] gets run through Name_mangle and you end up with this error:

docopts:error: Print_bash_global:not supported

This is not necessarily pretty, but it just removes the Name_mangle call from Print_bash_global when the key is -- so as to avoid this entirely. It still handles things properly (terraform_options still contains the right result, in my example) it just doesn't do anything with that one command line flag (as it was intended for docopts only anyways, not for the following bash script)

Sylvain303 commented 2 years ago

hi @DanaLacoste

Thanks for this contribution. I will look at it. Not immediately though. :wink:

If I'm remembering well, I left Name_mangle algorithm intact in order to be compatible with older python version of docopts. Which was back in 2015, so this compatibility could now probably be removed. I've to go deeper in your changes and run all test to validate this change. Plus change the documentation accordingly.

The work around was to use another output mechanism that don't mangle name at all, like -A.

Could you drop some shell example, please? So I can test your use case and add a unittest for that change.

Regards, Sylvain.

DanaLacoste commented 2 years ago

No problem! This isn't a trivial case, so it is good to have a clear test/fix!

Here are my tests:

Setup: I did this all in a docker container to verify no system issues were involved in the test. Container spec:

FROM debian:buster

RUN apt-get update \
    && apt-get install -y --no-install-recommends \
    unzip \
    wget \
    python3-setuptools \
    python3-pip \
    vim

CMD sleep infinity

Test script:

#!/bin/bash

docopts --debug -h - : "$@" <<EOF
Test for docopts double-dash handling

Usage:
  $0 --platform=<platform> [options] [--] [<unparsed_options>...]
  $0 -h | --help

Options:
  -p --platform=<platform> Platform to configure
  --trace                  Full trace output from bash
EOF

Test case 1 : Use 0.6.3rc2 linux binary:

# wget https://github.com/docopt/docopts/releases/download/v0.6.3-rc2/docopts_linux_amd64
# mv docopts_linux_amd64 /usr/bin/docopts ; chmod a+x /usr/bin/docopts
# ./test-docopts.sh -p platform
################## golang ##################
             --debug : true
              --help : -
        --no-declare : false
           --no-help : false
         --no-mangle : false
     --options-first : false
         --separator : ----
           --version : <nil>
                  -A : <nil>
                  -G : <nil>
                   : : true
              <argv> : [-p platform]
                 doc : Test for docopts double-dash handling

Usage:
  ./test-docopts.sh --platform=<platform> [options] [--] [<unparsed_options>...]
  ./test-docopts.sh -h | --help

Options:
  -p --platform=<platform> Platform to configure
  --trace                  Full trace output from bash
        bash_version :
################## bash ##################
                  -- : false
              --help : false
          --platform : platform
             --trace : false
                  -h : false
  <unparsed_options> : []
----------------------------------------
docopts:error: Print_bash_global:not supported

Test case 2: Use same docopts with my PR (ok, I cheated and ran this on my desktop, but it's go! Same thing everywhere! :) )

% ./test-docopts.sh -p test -- -more
################## golang ##################
             --debug : true
              --help : -
        --no-declare : false
           --no-help : false
         --no-mangle : false
     --options-first : false
         --separator : ----
           --version : <nil>
                  -A : <nil>
                  -G : <nil>
                   : : true
              <argv> : [-p test -- -more]
                 doc : Test for docopts double-dash handling

Usage:
  ./test-docopts.sh --platform=<platform> [options] [--] [<unparsed_options>...]
  ./test-docopts.sh -h | --help

Options:
  -p --platform=<platform> Platform to configure
  --trace                  Full trace output from bash
        bash_version :
################## bash ##################
                  -- : true
              --help : false
          --platform : test
             --trace : false
                  -h : false
  <unparsed_options> : [-more]
----------------------------------------
unparsed_options=('-more')
h=false
help=false
platform='test'
trace=false

Test case 3: use python docopts (whatever it is that pip3 install will use. only change was had to remove the --debug flag from the test script as it is not supported)

# pip3 install docopts
Collecting docopts
  Downloading https://files.pythonhosted.org/packages/ba/b3/e96eebbc507ce1db0d90a139ae9af8479a1d8f87da39330a83ab204e842f/docopts-0.6.1.tar.gz

<pip output deleted for clarity>

# ./test-docopts.sh -p test
platform='test'
trace=false
unparsed_options=()
h=false
help=false
# ./test-docopts.sh -p test -- -unused
platform='test'
trace=false
unparsed_options=('-unused')
h=false
help=false
DanaLacoste commented 2 years ago

How I wrote the PR change:

The implementation of the PR is, well, a "if !=" wrapped around the body of a big for loop is not elegant. But.... it is the least intrusive change I could think of for short notice (and, AFAICT, it is correct)

NOTE: Oh, but what about - then? Why did this only cover the -- on line 241 and not the - case? Because, from Bash's perspective, the bash script does need to know that the caller is requesting that input be read from stdin and not a file. A single - is not a directive to docopts, but to the resulting bash script. So while that would remain an open issue for docopts to figure out a permanent solution, well, I am not using - as a parameter right now and so I didn't have a good feeling for "what would I, as a user, like to see happen for this use case?" (vs. the -- which I really need to work right :) )

Thanks!

Sylvain303 commented 2 years ago

Very complete thanks. :+1: To be more complete about [--] I also link this issue: https://github.com/docopt/docopts/issues/49

Sylvain303 commented 2 years ago

Hi @DanaLacoste

I was busy on an other FreeSoftware project. Sorry to take so much time to answer. My head was busy coding something else and I've to refresh my go on this project to be efficient.

So, I made a new branch with the content you gave me in this PR pr-52-double-dash-no-mangle.

https://github.com/docopt/docopts/tree/pr-52-double-dash-no-mangle/bug/double_dash_mangle

So we could work on the exact same code. It's really basic and just build and run the docker container actually.

I modified the Dockerfile, to embed everything you did + a golang development env too. :smiley:

I will continue to study and work on your PR to see what it can be and how it should be integrated.

Actually it doesn't compile, though.

DanaLacoste commented 2 years ago

Howdy! I apparently totally messed up what I committed from what I built: The small change 83d281f fixes that. (I believe I made a mistake in transcribing from my work system (which doesn't talk to github.com) and my home system (where I uploaded the change so I could open the PR)

Sylvain303 commented 2 years ago

Howdy! I apparently totally messed up what I committed from what I built: The small change 83d281f fixes that.

That compiles, thanks.

(I believe I made a mistake in transcribing from my work system (which doesn't talk to github.com) and my home system (where I uploaded the change so I could open the PR)

My advise is to always copy the whole file and let git diff show you the result. :wink:

I made some integration, to figure out what's going on with your PR proposal. Which breaks --no-mangle option behavior, by eating --.

By the way --no-mangle allows user to parse / mangle the parsed option as they want to. This avoid long time patching the mangling routine. :wink:

hum, I though I used --no-mangle by some shell scripts to parse the parsed result, but I can't find it in the repos. May I used -A and get the keys of the associative array.

I will reply your points list above, the next coding session. :wink:

Sylvain303 commented 2 years ago

Here's some explanation on what you've found:

  • Print_bash_global is walking through all of the params to figure out which (if any) need to be name mangled at docopts.go#L208

True, but not fully exact. As the function's name suggests, the goal of this to Print_bash_global print the output as bash's global variable. (I also reuse this function for --no-mangle as the algorithm is almost the same loop.) This function goals is not to figure out which needs to be mangle but to mangle blindly in order to output a valid evaluable bash code for bash global variables.

  • That calls Name_mangle, which knows not to mangle - and -- as neither of those would be valid bash variable names, no matter what you do, so it tags them with an error at docopts.go#L241

Name_mangle was took almost verbatim from python's original's code:

def name_mangle(elem)

Python Go
```python def name_mangle(elem): if elem == '-' or elem == '--': return None elif re.match(r'^<.*>$', elem): var = elem[1:-1] elif re.match(r'^-[^-]$', elem): var = elem[1] elif re.match(r'^--.+$', elem): var = elem[2:] else: var = elem var = var.replace('-', '_') if not isbashidentifier(var): raise ValueError(elem) else: return var ``` ```go func (d *Docopts) Name_mangle(elem string) (string, error) { var v string if elem == "-" || elem == "--" { return "", fmt.Errorf("not supported") } if Match(`^<.*>$`, elem) { v = elem[1:len(elem)-1] } else if Match(`^-[^-]$`, elem) { v = fmt.Sprintf("%c", elem[1]) } else if Match(`^--.+$`, elem) { v = elem[2:] } else { v = elem } // alter output if we have a prefix key_fmt := "%s" if d.Global_prefix != "" { key_fmt = fmt.Sprintf("%s_%%s", d.Global_prefix) } v = fmt.Sprintf(key_fmt, strings.Replace(v, "-", "_", -1)) if ! IsBashIdentifier(v) { return "", fmt.Errorf("cannot transform into a bash identifier: '%s' => '%s'", elem, v) } return v, nil } ```

a trick that the python's code did was: else block in try catch statement

    try:
        variables = dict(zip(map(name_mangle, args.keys()),
                             map(to_bash, args.values())))
    except ValueError as e:
        sys.exit("%s: name could not be mangled into a valid Bash "
                 "identifier: %s" % (sys.argv[0], e))
    else:
        variables.pop(None, None)
        args.pop('-', None)
        args.pop('--', None)

not simple to understand what's done here, some error may fails ValueError only, some (others) goes to the else: (sick, write once code :wink:)

So I would say Yes, it does the trick by removing - and --

  • Which makes sense, in the scope of Name_mangle: "you asked me to mangle something which has no valid equivalent", but the Print_bash_global treats any error returned from Name_mangle as a fatal error.

Which makes sense to fail for me, in the meaning of « printing a valid bash global » as mentioned. So fatal error :smiley: But the error message could have been more obvious:

Cannot mangle name suitable for bash eval: '--' 
Hint: -A as an alternative. 
  • But, logically, docopts double-dash is not a name: it is a directive to docopts and not to bash: it says "hey, don't try to parse the rest of the line as docopts params, instead pass them to a later script/tool/whatever". So... as a bash script, I don't actually need or want the -- : it has no meaning to bash, but only to docopts

Could be true, but not from actual perspective on docopt parser / language (please note docopt without S). See previous discussion in #49.

docopt lib says it doesn't handle double-dash -- specifically. As docopts uses the golang lib port of docopt lib actually, so it reproduces this behavior.

I've plan to rewrite the parser totally, but it's a huge job. Especially my goal, was to report better docopt language parsing handling and or human debugging and to fix some inconsistencies in options parsing.

  • So therefore let's just not pass that one, specific thing on to bash: just bypass it completely, if present

That's what we could expect effectively. (definitely? :wink: )

NOTE: Oh, but what about - then? Why did this only cover the -- on line 241 and not the - case? Because, from Bash's perspective, the bash script does need to know that the caller is requesting that input be read from stdin and not a file. A single - is not a directive to docopts, but to the resulting bash script. So while that would remain an open issue for docopts to figure out a permanent solution, well, I am not using - as a parameter right now and so I didn't have a good feeling for "what would I, as a user, like to see happen for this use case?" (vs. the -- which I really need to work right :) )

you mean in -G (global) mode right? (or without -A which is -G without prefix :thinking:)

https://github.com/docopt/docopts/blob/651ff2c07e6430da9a843ed7252ccd046f058d60/docopts.go#L69

So...

with -A it works

https://github.com/docopt/docopts/blob/651ff2c07e6430da9a843ed7252ccd046f058d60/examples/docopts_auto_example.sh

examples$ ./docopts_auto_example.sh cmd fileame_here -
FILE,# = 2
FILE,1 = -
FILE,0 = fileame_here
cmd = true

Works too in global mode (what I call -G):

./docopts  -h 'Usage: prog parse FILENAME' : parse -
parse=true
FILENAME='-'
Sylvain303 commented 2 years ago

Ho :open_mouth: , @DanaLacoste

I think I found what you were talking about handling -

# go local version I modified (code in the branch)
$ ./docopts -h 'Usage: dump [-]' : -
docopts:error: Print_bash_global:Mangling not supported for: '-'

$ docker run -ti debian-docpots 
# python original pip version installed in the container 
$ docopts --version
docopts 0.6.1
Copyright (C) 2013 Vladimir Keleshev, Lari Rasku.
License MIT <http://opensource.org/licenses/MIT>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

$ docopts -h 'Usage: dump [-]' : -
# no output filtered as no option (useless?)

# associative mode
$ docopts -A ARGS -h 'Usage: dump [-]' : -
declare -A ARGS
ARGS['-']=true

As I'm in favor of introducing more helpful error message, I think I gonna kept this incompatibility for - so docopts wont be 100% compatible. :disappointed:

docopt 2015 doc is saying:

[-]

A single dash "-", when not part of an option, is often used by convention to signify that a program should process stdin, as opposed to a file. If you want to follow this convention add "[-]" to your pattern. "-" by itself is just a normal command, which you can use with any meaning.

Which is parsed by docopt lib but can't be mangled without introducing special case.

For me it'd better handled as FILENAME parameter and documenting it in the help message.

So I modified the PR code:

$ ./docopts -G ARGS -h 'Usage: dump [-]' : -
ARGS__=true

That also means than double-dash could be mangled and not ignored in -G <prefix>...

which would output:

$ ./docopts -G ARGS -h 'Usage: dump [--]' : --
ARGS___=true

# but still skip it without prefix
```bash
$ ./docopts -h 'Usage: dump [--]' : --
# empty

What do you feel about the version of code?

DanaLacoste commented 2 years ago

OK, I read the diffs and it looks ok, but I would like to clarify the expectation, if that is OK, so that I can make sure I am understanding the intended behavior :)

Option 1 : Using bash v4 array format (-A) : This will work, because the limitation that name_mangle solves is not impacted. The -- will be passed as a param as array['--'] so everything is fine.

Option 2 : Using a prefix and global mode (-G) : This will work, because the code will skip this parameter and therefore everything will be fine (it won't complain about the -- format)

Option 3 : If you don't specify, you will get global mode by default, and the prefix will be empty, and everything will still be fine as this code will skip the -- (same as option 2)

If this is correct (I think it is, but I am only 75% sure :) ) then I am 👍 for the change: it will do the right thing and I won't have to retroactively update some scripts (we have some scripts that are, through coincidence, using the python version instead of the go version and while I want to update the environment to use the go version, I don't want to update the scripts, if I can at all avoid it :) )

Thanks very much! I know this wasn't a trivial one to describe, but I think we ended up at a good place, code-wise :)

Sylvain303 commented 2 years ago

Hi @DanaLacoste,

Option 1 : Using bash v4 array format (-A) : This will work, because the limitation that name_mangle solves is not impacted. The -- will be passed as a param as array['--'] so everything is fine.

Yes.

Option 2 : Using a prefix and global mode (-G) : This will work, because the code will skip this parameter and therefore everything will be fine (it won't complain about the -- format)

Not exactly, it doesn't skip it, it will mangle it with prefix: I modified -G it now mangles - or --:

./docopts -G ARGS -h "Usage: prog [-] [--]" : 
ARGS__=false
ARGS___=false

Which is ugly but it works. :wink:

Option 3 : If you don't specify, you will get global mode by default, and the prefix will be empty, and everything will still be fine as this code will skip the -- (same as option 2)

Yes, only -- is skipped in that case. - still raises an error. It was skipped by python's version but without giving the information that is was passed as argument. I improved error message too.

$ ./docopts -h "Usage: prog [-] [--]" : 
docopts:error: Print_bash_global:Mangling not supported for: '-'
$ ./docopts -h "Usage: prog [--] FILENAME" : - 
FILENAME='-'

it will do the right thing and I won't have to retroactively update some scripts (we have some scripts that are, through coincidence, using the python version instead of the go version and while I want to update the environment to use the go version, I don't want to update the scripts, if I can at all avoid it :) )

That's my goal with this version, to be as most compatible with old python's version behavior. I dropped compatibility for - because I think it removes information. But for -- it's OK to skip it entirely as the rest is parsed and outputted as expected.

Thanks very much! I know this wasn't a trivial one to describe, but I think we ended up at a good place, code-wise :)

If you think the README or the --help doesn't reflect or describe well enough the 3 modes behavior you've learned, tell me how I can improve it. I've made some rewrite of the README.md for this release.

I'm glad to fix those issues that was coming from corner case allowed in python's version that I missed.

Regards, Sylvain.