python / cpython


argparse.REMAINDER fails to parse remainder correctly #58382

Open ea3a5790-8aea-4291-a565-f273a1182dd3 opened 12 years ago

ea3a5790-8aea-4291-a565-f273a1182dd3 commented 12 years ago
BPO 14174
Nosy @jaraco, @merwok, @cjerdonek
Files
  • bug_argparse.py: Reproduction case
  • test.py
  • issue14174_1.patch
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:
    ```python
    assignee = None
    closed_at = None
    created_at =
    labels = ['type-bug', 'library']
    title = 'argparse.REMAINDER fails to parse remainder correctly'
    updated_at =
    user = 'https://bugs.python.org/rr2do2'
    ```
    bugs.python.org fields:
    ```python
    activity =
    actor = 'paul.j3'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation =
    creator = 'rr2do2'
    dependencies = []
    files = ['24705', '28165', '35824']
    hgrepos = []
    issue_num = 14174
    keywords = ['patch']
    message_count = 10.0
    messages = ['154761', '154929', '170921', '172232', '172235', '176691', '180753', '187204', '187206', '222074']
    nosy_count = 7.0
    nosy_names = ['jaraco', 'eric.araujo', 'chris.jerdonek', 'idank', 'paul.j3', 'rr2do2', 'Michael.Edwards']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = 'needs patch'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue14174'
    versions = ['Python 2.7', 'Python 3.2', 'Python 3.3']
    ```

    ea3a5790-8aea-4291-a565-f273a1182dd3 commented 12 years ago

    A reproduction case is attached and should speak for itself. In brief, the behaviour of argparse.REMAINDER is inconsistent: it depends on which (potentially) defined argument handlers come before it.

    Tested this on Python 2.7 on OS X, but also grabbed the latest argparse.py from hg to verify against this.

    merwok commented 12 years ago

    Thanks for the report. Could you edit your script to add the expected results, for example Namespace(foo=..., command=[...])?

    698a88e8-ff26-4994-9f86-79f7dc89f86c commented 11 years ago

    I just ran into this issue myself and worked around it by using parse_known_args.

    jaraco commented 11 years ago

    I also ran into this problem. I put together this script to reproduce the issue:

    import argparse
    
    parser = argparse.ArgumentParser()
    parser.add_argument('app')
    parser.add_argument('--config')
    parser.add_argument('app_args', nargs=argparse.REMAINDER)
    args = parser.parse_args(['app', '--config', 'bar'])
    print(vars(args))
    # actual: {'app': 'app', 'app_args': ['--config', 'bar'], 'config': None}
    # expected: {'app': 'app', 'app_args': [], 'config': 'bar'}

    I'll try using parse_known_args instead.

    698a88e8-ff26-4994-9f86-79f7dc89f86c commented 11 years ago

    Unfortunately parse_known_args is buggy too: http://bugs.python.org/issue16142

    b44778d9-a8e4-4f18-9ca0-8eab04ca3765 commented 11 years ago

    I'm attaching my own bug repro script for Eric. Is this sufficient? I can demonstrate the entire resulting Namespace, but the problem is that argparse doesn't even produce a Namespace. The cases I show simply fail.

    cjerdonek commented 11 years ago

    See also bpo-17050 for a reduced/simple case where argparse.REMAINDER doesn't work (the case of the first argument being argparse.REMAINDER).

    7a064fe6-c535-4d80-a11f-a04ed39056c5 commented 11 years ago

    An alternative to Jason's example:

    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument('app')
    parser.add_argument('--config')
    parser.add_argument('app_args', nargs=argparse.REMAINDER)
    args = parser.parse_args(['--config', 'bar', 'app'])
    print(vars(args))
    # as expected: {'app': 'app', 'app_args': [], 'config': 'bar'}

    When you have several positionals, one or more of which may have 0 arguments (*,?,...), it is best to put all of the optional arguments first.

    With 'app --config bar', parse_args identifies an 'AOA' pattern (argument, optional, argument). It then checks which positional arguments match. 'app' claims 1 string, 'app_args' claims 2 (REMAINDER means match everything that follows). That leaves nothing for '--config'.

    What you expected was that 'app' would match with the 1st string, '--config' would match the next 2, leaving nothing for 'app_args'.
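
    The matching described above can be modelled with plain regular expressions. This is only a rough sketch: the three regexes below are simplified stand-ins chosen for illustration, not argparse's actual internal patterns, and `claimed` is a hypothetical helper. It shows why REMAINDER swallows the 'O' while a '*' positional stops in front of it.

```python
import re

# One letter per command-line string: 'A' for a plain argument,
# 'O' for a string that looks like an option flag.
pattern = "AOA"  # models ['app', '--config', 'bar']

# Assumed, simplified per-positional regexes (not argparse's exact internals).
APP = "(A)"            # exactly one plain argument
REMAINDER = "([AO]*)"  # everything that follows, flags included
STAR = "(A*)"          # zero or more plain arguments, stops at a flag


def claimed(positional_regexes, pattern):
    """Return how many strings each positional claims from the pattern."""
    match = re.match("".join(positional_regexes), pattern)
    return [len(group) for group in match.groups()]


remainder_counts = claimed([APP, REMAINDER], pattern)
star_counts = claimed([APP, STAR], pattern)
print(remainder_counts)  # [1, 2]: 'app_args' takes '--config' and 'bar'
print(star_counts)       # [1, 0]: '--config' is left for the optional
```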

    In http://bugs.python.org/issue14191 I wrote a patch that would give the results you want if 'app_args' uses '*'. That is, it makes it possible to interleave positional and optional argument strings. But it does not change the behavior of REMAINDER.

    parser.add_argument('app_args', nargs='*')

    Maybe the documentation example for REMAINDER needs to be modified to show just how 'greedy' REMAINDER is. Adding a:

    parser.add_argument('--arg1',action='store_true')

    does not change the outcome. REMAINDER still grabs '--arg1' even though it is a defined argument.

    Namespace(arg1=False, args=['--arg1', 'XX', 'ZZ'], command='cmd', foo='B')
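
    For reference, a complete script along the lines of that documentation example might look like the following. It is a sketch that assumes the parser from the REMAINDER documentation example ('--foo', 'command', 'args') with the extra '--arg1' flag added; the input string is chosen to match the Namespace shown above.

```python
import argparse

parser = argparse.ArgumentParser(prog="PROG")
parser.add_argument("--foo")
parser.add_argument("--arg1", action="store_true")  # a defined flag
parser.add_argument("command")
parser.add_argument("args", nargs=argparse.REMAINDER)

# '--arg1' appears after the 'command' positional, so REMAINDER
# collects it as a plain string instead of treating it as a flag.
ns = parser.parse_args("--foo B cmd --arg1 XX ZZ".split())
print(ns)
```

    Here '--foo B' is still parsed as an optional because it comes before the first positional; only the strings after 'cmd' are swallowed.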

    7a064fe6-c535-4d80-a11f-a04ed39056c5 commented 11 years ago

    By the way, parser.parse_args() uses parse_known_args(). parse_known_args returns a Namespace and a list of unknown arguments. If that list is empty, parse_args returns the Namespace. If the list is not empty, parse_args raises an error.

    So parse_known_args does not change how arguments are parsed. It just changes how the unknowns are handled.
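
    A small sketch of that relationship, using a made-up parser with a single '--config' option and an undefined '--unknown' flag (both chosen only for illustration):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--config")

argv = ["--config", "bar", "--unknown"]

# parse_known_args hands back the leftovers instead of failing.
known, extras = parser.parse_known_args(argv)
print(known)
print(extras)

# parse_args parses the same way, but treats leftovers as an error:
# it prints a usage message to stderr and raises SystemExit.
try:
    parser.parse_args(argv)
    exited = False
except SystemExit:
    exited = True
print(exited)
```

    Both calls parse '--config bar' identically; the only difference is what happens to '--unknown'.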

    7a064fe6-c535-4d80-a11f-a04ed39056c5 commented 10 years ago

    Here's a possible solution to the problem (assuming there really is one):

    I included a patch from bpo-15112, which delays the consumption of a positional that matches with 0 strings.

    In the sample case for this issue, results with this patch are:

        args = parser.parse_args(['app', '--config', 'bar'])
        # Namespace(app='app', app_args=[], config='bar')
    
        args = parser.parse_args(['--config', 'bar', 'app'])
        # Namespace(app='app', app_args=[], config='bar')
    
        args = parser.parse_args(['app', 'args', '--config', 'bar'])
        # Namespace(app='app', app_args=['args', '--config', 'bar'], config=None)

    In the last case, 'app_args' gets the rest of the strings because the first is a plain 'args'. I believe this is consistent with the intuition expressed in this issue.

    I've added one test case to test_argparse.TestNargsRemainder. This is a TestCase that is similar to the above example.

        argument_signatures = [Sig('x'), Sig('y', nargs='...'), Sig('-z')]
        failures = ['', '-z', '-z Z']
        successes = [
            ('X', NS(x='X', y=[], z=None)),
            ('-z Z X', NS(x='X', y=[], z='Z')),
            ('X A B -z Z', NS(x='X', y=['A', 'B', '-z', 'Z'], z=None)),
            ('X Y --foo', NS(x='X', y=['Y', '--foo'], z=None)),
            ('X -z Z A B', NS(x='X', y=['A', 'B'], z='Z')), # new case
        ]

    test_argparse passes with this patch. But there is a slight possibility that this patch will cause backward compatibility problems. Some users might expect y=['-z','Z',...]. But that expectation has not been enshrined in test_argparse.

    It may require a slight change to the documentation as well.