python / cpython

The Python programming language
https://www.python.org

argparse has problem parsing option files containing empty rows #54732

Open 2ef421cc-3fc5-4ebb-ab82-eb03f0352820 opened 13 years ago

2ef421cc-3fc5-4ebb-ab82-eb03f0352820 commented 13 years ago
BPO 10523
Files
  • argparse_example.py
  • argparse_blanklines.patch
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:
    ```python
    assignee = None
    closed_at = None
    created_at =
    labels = ['type-bug', 'library']
    title = 'argparse has problem parsing option files containing empty rows'
    updated_at =
    user = 'https://bugs.python.org/MichalPomorski'
    ```

    bugs.python.org fields:
    ```python
    activity =
    actor = 'serhiy.storchaka'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation =
    creator = 'Michal.Pomorski'
    dependencies = []
    files = ['34860', '34875']
    hgrepos = []
    issue_num = 10523
    keywords = ['patch']
    message_count = 4.0
    messages = ['122314', '128297', '216249', '216306']
    nosy_count = 4.0
    nosy_names = ['bethard', 'Michal.Pomorski', 'paul.j3', 'math_foo']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = None
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue10523'
    versions = ['Python 2.7']
    ```

    2ef421cc-3fc5-4ebb-ab82-eb03f0352820 commented 13 years ago

    When using the argument file option, i.e. @file_with_arguments, the following problems arise:

    1. argparse crashes when the file contains an empty line (even if it is the last line): arg_string[0] is evaluated when arg_string is empty, which raises an IndexError. This is caused by the use of splitlines() instead of strip().split() in the function _read_args_from_files(self, arg_strings)

    2. options separated by spaces in one row are passed as a single long option, meaning each option has to be on its own line. This is caused by the new function

        def convert_arg_line_to_args(self, arg_line):
            return [arg_line]

    which should be

            return arg_line.split()
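
The change proposed in item 2 does not require patching argparse itself, since convert_arg_line_to_args is meant to be overridden. A minimal sketch, assuming a subclass is acceptable; the names WhitespaceSplittingParser and parse_from_text are hypothetical and exist only for this illustration:

```python
import argparse
import os
import tempfile


class WhitespaceSplittingParser(argparse.ArgumentParser):
    """Hypothetical subclass: every whitespace-separated word is one argument."""

    def convert_arg_line_to_args(self, arg_line):
        # str.split() with no separator never returns empty strings,
        # so a blank line contributes no arguments at all.
        return arg_line.split()


def parse_from_text(parser, text):
    """Hypothetical helper: write text to a temp file and parse it via '@'."""
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
        handle.write(text)
        path = handle.name
    try:
        return parser.parse_args(["@" + path])
    finally:
        os.remove(path)


parser = WhitespaceSplittingParser(fromfile_prefix_chars="@")
parser.add_argument("-foo")
parser.add_argument("-baz")

# Two options share the first line; a blank line separates them from the rest.
namespace = parse_from_text(parser, "-foo bar\n\n-baz quux\n")
print(namespace.foo, namespace.baz)
```

The trade-off is that a value containing spaces can no longer be written on a single line of the file, because it would be split into several arguments.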

    Both problems are caused by a modification in

        def _read_args_from_files(self, arg_strings)

    The version from argparse 1.0.1 worked well and was correct, so it should be sufficient to revert the changes made between 1.0.1 and 1.1.

    Here is the old implementation:

        def _read_args_from_files(self, arg_strings):
            # expand arguments referencing files
            new_arg_strings = []
            for arg_string in arg_strings:
    
                # for regular arguments, just add them back into the list
                if arg_string[0] not in self.fromfile_prefix_chars:
                    new_arg_strings.append(arg_string)
    
                # replace arguments referencing files with the file content
                else:
                    try:
                        args_file = open(arg_string[1:])
                        try:
                            arg_strings = args_file.read().strip().split()
                            arg_strings = self._read_args_from_files(arg_strings)
    
                            new_arg_strings.extend(arg_strings)
                        finally:
                            args_file.close()
                    except IOError:
                        err = _sys.exc_info()[1]
                        self.error(str(err))
    
            # return the modified argument list
            return new_arg_strings

    8955c213-fd54-471c-9758-9cc5f49074db commented 13 years ago

    Crashing on an empty line is definitely a bug.

    Each line being a single option is documented behavior:

    http://docs.python.org/dev/library/argparse.html#fromfile-prefix-chars

    ec221e61-8620-43e0-baee-a53eb720b4dc commented 10 years ago

    The current behaviour takes empty lines and interprets them as empty strings.
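
The difference between the two ways of reading the file can be seen with plain string operations, independent of argparse. A small sketch; the file content below is made up for illustration and is not the attached argparse_example.py:

```python
# Hypothetical argument-file content ending in one blank line.
file_text = "-foo\nbar\n-baz\nquux\n\n"

# One entry per line: the blank line survives as an empty string.
per_line = file_text.splitlines()

# One entry per whitespace-separated word: blank lines vanish.
per_word = file_text.strip().split()

print(per_line)  # ['-foo', 'bar', '-baz', 'quux', '']
print(per_word)  # ['-foo', 'bar', '-baz', 'quux']
```

Note that a single trailing newline does not produce an empty entry with splitlines(); only an actual blank line does.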

    The attached demonstration script shows the error occurring. The first case is a simple example to illustrate what happens in the general case. The second case shows empty lines being interpreted as empty strings and assigned to arguments.

    The third case, despite being very similar to the first, results in argparse exiting with an error message. Internally, after taking the 'foo' and 'baz' arguments and assigning them 'bar' and 'quux' respectively, it reads in an argument "", which it does not recognize, and produces the following error message:

    "argparse_example.py: error: unrecognized arguments:"

    The error message, in its current form, is kind of opaque.

    For the third case, if we move the blank line to between 'bar' and '-baz', the same error results, because it again tries to interpret the blank line as an argument. If we move the blank line to the start of the file, the same thing happens.

    If we move the blank line between '-foo' and 'bar', instead the error reads: "argparse_example.py: error: unrecognized arguments: bar" - which is at least somewhat comprehensible.

    The question is, how should blank lines be handled?

    Should they be accepted as possible values for arguments?

    If they fall into spaces where arguments (versus values for arguments) are expected, should we skip them?
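
One way to get the "skip them" behaviour without giving up the documented one-argument-per-line rule is to drop only blank lines in convert_arg_line_to_args. This is a sketch of that option, not a proposed patch; the class name BlankLineSkippingParser is hypothetical:

```python
import argparse
import os
import tempfile


class BlankLineSkippingParser(argparse.ArgumentParser):
    """Hypothetical subclass: one argument per line, blank lines ignored."""

    def convert_arg_line_to_args(self, arg_line):
        # Returning an empty list means this line adds no arguments.
        if not arg_line.strip():
            return []
        # Non-blank lines are kept whole, so values may contain spaces.
        return [arg_line]


parser = BlankLineSkippingParser(fromfile_prefix_chars="@")
parser.add_argument("-foo")
parser.add_argument("-baz")

# Blank lines at the start, in the middle, and at the end of the file.
content = "\n-foo\nbar with spaces\n\n-baz\nquux\n\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write(content)
    args_path = handle.name
try:
    namespace = parser.parse_args(["@" + args_path])
finally:
    os.remove(args_path)

print(namespace.foo, namespace.baz)
```

With this variant a blank line can no longer be used to pass an empty string as a value, which is exactly the case the first question above asks about.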

    If the current handling is fine, I would propose updating the documentation to add the following after the paragraph that begins "Arguments read from a file ...":

    "By default, blank lines are interpreted as empty strings. An empty string is not an acceptable argument; but it is an acceptable value for an argument."

    I would also propose changing the way that the error from argparse is displayed, so that it is more obvious what "argparse_example.py: error: unrecognized arguments:" means.

    ec221e61-8620-43e0-baee-a53eb720b4dc commented 10 years ago

    I've attached a patch making the changes I suggested, assuming that the current behaviour is desirable. It documents the behaviour of argparse on files with blank lines and changes how argparse formats the error message it generates when it encounters unrecognized arguments.

    When a blank line is included at the end of a file, the resulting error message is now: "argparse_example.py: error: unrecognized arguments: ''". This also makes it obvious when the problem is whitespace, e.g. when an argument has trailing spaces.
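
The effect of quoting can be sketched without the patch itself. The helper below is hypothetical and only assumes that each leftover argument is rendered with repr(); the attached patch may format the message differently:

```python
def format_unrecognized(extras):
    """Hypothetical helper: quote each leftover argument so whitespace shows."""
    return "unrecognized arguments: " + " ".join(repr(arg) for arg in extras)


# An empty string from a blank line becomes visible as ''.
print(format_unrecognized([""]))
# Trailing spaces are visible inside the quotes.
print(format_unrecognized(["bar  "]))
```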