python / cpython

The Python programming language
https://www.python.org
Other
63.16k stars 30.25k forks source link

[argparse] Bad error message formatting when using custom usage text #86463

Open c5a210b0-9bc4-41d4-88b1-57b4b853ea6e opened 3 years ago

c5a210b0-9bc4-41d4-88b1-57b4b853ea6e commented 3 years ago
BPO 42297
Nosy @rhettinger, @TurboTurtle, @bobblanchett

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields: ```python assignee = None closed_at = None created_at = labels = ['3.7', '3.8', 'type-bug', 'library'] title = '[argparse] Bad error message formatting when using custom usage text' updated_at = user = 'https://github.com/TurboTurtle' ``` bugs.python.org fields: ```python activity = actor = 'bobblanchett' assignee = 'none' closed = False closed_date = None closer = None components = ['Library (Lib)'] creation = creator = 'TurboTurtle' dependencies = [] files = [] hgrepos = [] issue_num = 42297 keywords = [] message_count = 7.0 messages = ['380598', '380611', '380612', '380615', '380616', '380619', '380701'] nosy_count = 4.0 nosy_names = ['rhettinger', 'paul.j3', 'TurboTurtle', 'bobblanchett'] pr_nums = [] priority = 'normal' resolution = None stage = None status = 'open' superseder = None type = 'behavior' url = 'https://bugs.python.org/issue42297' versions = ['Python 3.6', 'Python 3.7', 'Python 3.8'] ```

Linked PRs

c5a210b0-9bc4-41d4-88b1-57b4b853ea6e commented 3 years ago

In the sos project, we build a custom usage string for our argparser parser, and have noticed that doing so causes error messages from argparse to be badly formatted.

For example if a bad option value is given, the error message is mangled into the last line of our usage string:

# python3 bin/sos report --all-logs=on
usage: sos report [options]
sos <component> [options]

[..snip...]
    collect, collector            Collect an sos report from multiple nodes simultaneously report: error: argument --all-logs: ignored explicit argument 'on'

This is especially strange since we build the usage string with a trailing newline character:

        for com in self._components:
            aliases = self._components[com][1]
            aliases.insert(0, com)
            _com = ', '.join(aliases)
            desc = self._components[com][0].desc
            _com_string += (
                "\t{com:<30}{desc}\n".format(com=_com, desc=desc)
            )
        usage_string = ("%(prog)s <component> [options]\n\n"
                        "Available components:\n")
        usage_string = usage_string + _com_string
        epilog = ("See `sos <component> --help` for more information")
        self.parser = ArgumentParser(usage=usage_string, epilog=epilog)

So it appears the trailing newlines are being stripped (in our case, unintentionally?). As expected, removing the trailing newline when passing usage_string to our parse does not change this behavior.

However, if we don't set the usage string at all when instantiating our parser, the error message is properly formatted beginning on a new line. Slightly interesting is that without the usage_string being passed, the error message is prefixed with "sos: report:" as expected for %(prog)s expansion, but when the error message is mangled %(prog)s is left out as well.

A little more context is available here: https://github.com/sosreport/sos/issues/2285

7a064fe6-c535-4d80-a11f-a04ed39056c5 commented 3 years ago

Provide a minimal reproducible example. I can't reproduce that run on error message.

Also test with arguments like '--all-logs on', which issues an 'unrecognizeable argument' error (with a different error reporting path).

Stripping excess newlines is normal, both in the full help and error. That's done at the end of help formatting.

c5a210b0-9bc4-41d4-88b1-57b4b853ea6e commented 3 years ago

I'll try and get a simple reproducer made shortly, however as a quick note I've found that using '--all-logs on' results in a properly formatted error message.

c5a210b0-9bc4-41d4-88b1-57b4b853ea6e commented 3 years ago

Ah, ok - so I neglected to mention we're using subparsers which appears to be relevant here. My apologies.

Here's a minimal reproducer that shows the behavior when using './arg_test.py foo --bar=on'

#! /bin/python3

import argparse

usage_string = 'test usage string ending in newlines\n\n'
sub_cmd_usage = ''

for i in range(0, 3):
    sub_cmd_usage += '\tfoo  --bar\n'

usage_string += sub_cmd_usage

parser = argparse.ArgumentParser(usage=usage_string)
subparser = parser.add_subparsers(dest='subcmd', metavar='subcmd')
subcmd_parser = subparser.add_parser('foo')
subcmd_parser.add_argument('--bar', action="store_true", default=False)

if __name__ == '__main__':
    args = parser.parse_args()
7a064fe6-c535-4d80-a11f-a04ed39056c5 commented 3 years ago

It's the subparser that's producing this error, specifically its 'prog' attribute.

If I use a custom usage with a simple parser:

1129:~/mypy$ python3 bpo-42297.py --foo=on
usage: bpo-42297.py
one
two
three
bpo-42297.py: error: argument --foo: ignored explicit argument 
'on'

Notice that the error line includes the 'prog'.

With subparsers, the main usage is included in the subcommand prog:

    print(subcmd_parser.prog)

produces:

test usage string ending in newlines

foo  --bar
foo  --bar
foo  --bar foo

That's the usage plus the subcommand name, 'foo'.

Generating the explicit error in the subcommand:

1244:~/mypy$ python3 bpo-42297.py foo --bar=on
test usage string ending in newlines

foo  --bar
foo  --bar
foo  --bar foo: error: argument --bar: ignored explicit 
argument 'on'

'bpo-42297.py: ' has been replaced by the usage+'foo', and no newline.

We don't see this in the 'unrecognized' case because that error issued by the main parser.

bpo-42297.py: error: unrecognized arguments: on

If I explicitly set the prog of the subcommand:

    subcmd_parser = subparser.add_parser('foo', prog='myscript foo')

The error becomes:

1256:~/mypy$ python3 bpo-42297.py foo --bar=on
usage: myscript foo [-h] [--bar]
myscript foo: error: argument --bar: ignored explicit argument 'on'

I can also add 'usage=usage_string' to the add_parser. For the most part add_parser takes the same parameters as ArgumentParser.

Alternatively we can specify prog in

    subparser = parser.add_subparsers(dest='subcmd', metavar='subcmd', prog='myscript')

resulting in:

myscript foo: error: argument --bar: ignored explicit argument 'on'

I recently explored how 'prog' is set with subparsers in

https://bugs.python.org/issue41980

I don't think anything needs to be corrected in argparse. There are enough options for setting prog and usage in subcommands to get around this issue.

In the worse case, you might want to create an alternative

_SubParsersAction

Action subclass that defines the prog/usage differently.

c5a210b0-9bc4-41d4-88b1-57b4b853ea6e commented 3 years ago

Ok, yeah there seem to be several paths to avoid this behavior then. We should be fine exploring those options.

Thanks for the pointer!

7a064fe6-c535-4d80-a11f-a04ed39056c5 commented 3 years ago

We could look into using a different more compact 'prog' for these error messages.

Currently the subparser 'prog' uses the main prog plus positionals plus the subparser name. The goal is to construct a subparser usage that reflects required input.

This is fine for the usage line, but could be too verbose for some errors. In an error, the 'prog' just helps identify which argument has problems, and doesn't need the extra usage information. Most of the time that isn't an issue, since we don't use positional much in the main parser (and when used can't have variable nargs).

But I don't have immediate ideas as to what can be conveniently (and safely) changed.

serhiy-storchaka commented 4 weeks ago
        # prog defaults to the usage message of this parser, skipping
        # optional arguments and with no "usage:" prefix
        if kwargs.get('prog') is None:
            formatter = self._get_formatter()
            positionals = self._get_positional_actions()
            groups = self._mutually_exclusive_groups
            formatter.add_usage(self.usage, positionals, groups, '')
            kwargs['prog'] = formatter.format_help().strip()

This code only makes sense if the custom usage is None. Then automatically generated message includes prog and names of positional arguments preceding subparsers. add_parser() creates prog for the subparser by concatenating that value with the name of the subparser. This is exactly the command used to invoke the subparser (ignoring options).

The custom usage replaces the full usage line. If it is close to the generated usage, it includes options, a set of all subparser names, "..." for subparser arguments. E.g. PROG [-h] [-v] {a,b,c,d,e} .... If we add the name of the concrete parser we will get something very from the command used to invoke the subparser.

I think that using full custom usage instead of specially formatted partial usage string is wrong. What should we use instead? I think that falling back to the generated usage would be wrong too -- there is a reason why the custom usage was specified at first place. Then we can just set _prog_prefix to None and use subparser's name as its prog. It will produce less ugly result.

serhiy-storchaka commented 4 weeks ago

124532 makes '...' to be used as the default prog prefix if custom usage is used.

serhiy-storchaka commented 14 hours ago

On other hand, using the generated prog prefix as if the custom usage was not specified looks not bad.

125891 ignores custom usage when generating the default prog prefix.