python / cpython

The Python programming language
https://www.python.org
Other
62.43k stars 29.97k forks source link

argparse exits on error even when exit_on_error=False #103498

Closed sfermigier closed 2 months ago

sfermigier commented 1 year ago

Bug report

When ArgumentParser encounters certain errors, it calls self.error(...) which exits, even when exit_on_error is set to False.

This prevents catching the exception and writing a custom error message.

import argparse

parser = argparse.ArgumentParser(exit_on_error=False)
parser.add_argument('filename')

try:
    args = parser.parse_args([])
except Exception:
    print("Bad argument, here's what you should do instead...")

Test on Python 3.11.3:

python argparse-bug/repro.py
usage: repro.py [-h] filename
repro.py: error: the following arguments are required: filename

Linked PRs

sunmy2019 commented 1 year ago

The feature was added at https://github.com/python/cpython/pull/15362 It looks like the author forgot about this case. A direct modification is acceptable here.

JustBeYou commented 1 year ago

Hi, new contributor here 👋

I written a quick fix which should be generic (instead of guarding all calls to self.error, I added the guard in the method itself). But this way, throwing custom exceptions is not possible. (for example in the PR implementing the feature there was a test expecting a argparse.ArgumentError exception). Is this acceptable or is throwing custom exceptions desired?

If it is desired, we could do the following without breaking the existing API:

  1. def error(self, exception_or_message: Union[str, Exception]) - allow passing both messages and exceptions
  2. def error(self, message: str, exception_class=None) - allow passing exception class information in a separate argument

Personally, I find the first one the simplest and requires little changes to the code. I would be glad if any maintainer could give an opinion.

gaogaotiantian commented 1 year ago

As far as I can tell, all the self.error() usage is reporting an issue with argument parsing(well, its the argparse library of course). In _parse_known_args there are mixed usage of ArgumentError and self.error() - which is pretty bad.

Doing all the error handling in self.error() seems reasonable, but I don't think we need to change the signature in self.error(). Just raise ArgumentError with the message if self.exit_on_error is False - it's easier for user to catch it anyway.

We also need to replace all the raise ArgumentError with self.error() to keep the code clean.

gaogaotiantian commented 1 year ago

Hmm, I did not realize that ArgumentError actually takes an action, that was my fault. I still think the current implementation is better, but there might be a backward compatibility issue involved. It's clearly stated that the user can override error() function. If you added an extra argument and used it, the user's current code might break. I think we need some opinions from the component owner.

JustBeYou commented 1 year ago

No, I did not reply to the email directly, I just wanted to edit the comment because I noticed the compatibility issue after I written it, but I deleted it instead. :sweat_smile:

Anyway, I followed your advice, but stumbled upon this action parameter that breaks the API for classes that inherit ArgumentParser. We could avoid this by using **kwargs, but it is not the cleanest way to solve it.

gaogaotiantian commented 1 year ago

Taking at the look at the code, when this feature was intruduced, it was not very thoroughly tested/reviewed. Basically all the self.error() usage need to be vetted - by nature it contradicts this feature. So this feature either has to live in self.error(), or all the current self.error() usage need to be protected by checking this option.

jacobtylerwalls commented 1 year ago

For triage's sake, I think this is a duplicate of #85427, which has a PR waiting for review at #30832.

But this way, throwing custom exceptions is not possible. (for example in the PR implementing the feature there was a test expecting a argparse.ArgumentError exception). Is this acceptable or is throwing custom exceptions desired?

There's an even older duplicate with some discussion on this point that shaped the choices I made in #30832, see https://github.com/python/cpython/pull/30832#issuecomment-1024819272.

baolsen commented 1 year ago

+1 from my side. Some context below on what I am trying to do - it could help to show a use case.

I'm writing unit tests for an application, roughly as follows:

import unittest
from common import my_cli_parser

from argparse import ArgumentError

class ParserTest(unittest.TestCase):
    def setUp(self):
        argparser_kwargs={'exit_on_error': False}
        self.parser = my_cli_parser.create_parser(**argparser_kwargs)

    # First test, this passes
    def test_invalid_subcommand(self):
        args = ['an_invalid_subcommand']
        with self.assertRaises(ArgumentError):
            self.parser.parse_args(args)

    # Second test, this fails
    def test_valid_subcommand_but_not_all_arguments(self):
        # More arguments are required here for the subcommand
        # but intentionally I don't provide them
        args = ['a_valid_subcommand']
        with self.assertRaises(ArgumentError):
            self.parser.parse_args(args)

Currently, my first test passes, showing that argparse is honouring the exit_on_error=True.

However, the second test fails, with "the following arguments are required" (correct) and a system exit (in my opinion, not correct).

Reason: The my_cli_parser requires some arguments for the subcommand, which my test is intentionally not passing. I want to validate this test case. So I set exit_on_error=False - it describes the behaviour I want.

I expect ArgumentError to be raised, and system exit is disabled, like the first case.

But current behaviour is that the argparse.exit is called anyway.

I know I can workaround by subclassing (or mocking + patching) but I think the behaviour here is a little inconsistent with the documentation for exit_on_error and what a user might reasonably assume that parameter does.

hpaulj commented 1 year ago

I commented on this an another current issue, https://github.com/python/cpython/issues/103558#issuecomment-1513657233

exit_on_error was added without a thorough review. Catching the ArgumentError is a clean implementation, but doesn't catch all calls to self.error. ArgumentError requires an an Action, so can't be used for required arguments or groups, or unrecognized ones.

We need to think carefully about capturing these other self.error calls. They aren't Exception type errors.

olgarithms commented 1 year ago

I also came across the same issue where creating an ArgumentParser with exit_on_error=False and passing the wrong type of argument would raise an argparse.ArgumentError, but passing an unknown argument would exit the task.

import argparse

parser = argparse.ArgumentParser(exit_on_error=False)
parser.add_argument("--integers", type=int)
parser.parse_args(["--integers", "a"]).  
> argparse.ArgumentError: argument --integers: invalid int value: 'a'
parser.parse_args(["--unknown", "a"])
> usage: olga.py [-h] [--integers INTEGERS]
> olga.py: error: unrecognized arguments: --unknown a
hpaulj commented 1 year ago

Yes, as currently implemented exit_on_error only controls errors that raise a argparse.ArgumentError. The unrecognized arguments exit does not use ArgumentError (or an error Exception class) , and so cannot be caught in the same way.

A more comprehensive way of changing error handling is to customize either the exit or `error methods:

https://docs.python.org/3/library/argparse.html#exiting-methods

gaogaotiantian commented 1 year ago

I think we are basically at a dead loop here - we want to keep the backward compatibility so we don't breaking anyone's existing code, but the current behavior is simply wrong. It's impossible to "fix" the behavior without breaking existing code.

The introduction of exit_on_error does not live well with all the sys.exit() usage in argparse. There could be a lot of similar traps unless we patch all the sys.exit()s (self.error()).

Anyone who had to make a decision about fixing this would carry plenty of pressure for breaking people's code and I believe according to @sobolevn none of the core devs are the experts of argparse now. I'm curious if/how we could proceed on this.

JustBeYou commented 1 year ago

I took a deeper dive into the problem and I may have a solution. I think the main goals for the fix are:

First, to make the error handling consistent, I propose that we convert all calls to self.error() to raising exceptions (in case ArgumentError does not fit properly in some cases, we can think of a new public exception type, but I think we won't need that).

Second, we make all public methods of ArgumentParser (there are few) to handle exit_on_error in an obvious way. If exit_on_error=False, we allow exception to pass through, otherwise we pass their string representation to self.error which will do its thing and finally exit.

I think that if we use self.error only in simple try-catch blocks in the public methods, it would be easy enough to make the right behaviour happen and avoid future mistakes.

I will update the pull request to reflect what I was talking about and I'm looking for suggestions/opinions. Maybe we succeed to move this forward.

Below is a list of places where self.exit/self.error are used.

Calls to ArgumentParser.exit which are justified and probably do not need to be changed:

_HelpAction.__call__
_VersionAction.__call__

Calls to ArgumentParser.error that should be converted to raising exceptions:

ArgumentParser.add_subparsers
ArgumentParser.parse_args
ArgumentParser.parse_known_args
ArgumentParser.parse_intermixed_args

ArgumentParser._parse_known_args
ArgumentParser._read_args_from_files
ArgumentParser._parse_optional
ArgumentParser._get_option_tuples
gaogaotiantian commented 1 year ago

I took a deeper dive into the problem and I may have a solution. I think the main goals for the fix are:

  • raise exceptions for all kinds of errors ArgumentParser could encounter
  • prevent future bugs related to exit_on_error=False by calling .exit() indirectly by accident
  • stay as backward compatible as possible

I'm sure there are ways to make the code better, but I believe the major issue here is not how, it is backward compatibility. For example, what if the user are subclassing ArgumentParser and using self.error() in their code? Your change would break the behavior.

Like I said above, the real issue here is the library was used too much by the users and all the implementation details are considered "documented feature". We either keep the old "wrong" behavior, or break users existing code.

The hard part is to make the decision - and that's where we are stuck I believe. We need core devs to support a reform, even if that means breaking some of the user code. Without that, I don't believe there would be an implemetation feasible.

alonbl commented 10 months ago

Hi,

The exit_on_error=False is also used to ignore undefined arguments and return args collection of what argparse succeeded to parse. It should not raise exception as then the parse will not be able to return its value.

Regards,

hpaulj commented 10 months ago

parse_known_args can be used to return the unknowns along with the parsed ones. This alternative was available long before this exit_on_error parameter was introduced.

The original, default behavior was exit_on_error=True. I doubt if there is much code that depends on the current incomplete implementation of the False alternative. The question is whether we can make the False coverage more complete without too much work, and without compromising the True case.

On Sun, Nov 5, 2023, 12:50 PM Alon Bar-Lev @.***> wrote:

Hi,

The exit_on_error=False is also used to ignore undefined arguments and return args collection of what argparse succeeded to parse. It should not raise exception as then the parse will not be able to return its value.

Regards,

— Reply to this email directly, view it on GitHub https://github.com/python/cpython/issues/103498#issuecomment-1793841938, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAITB6ADV42JLQ32ZLWITZDYC73YVAVCNFSM6AAAAAAW4SRXYCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTOOJTHA2DCOJTHA . You are receiving this because you commented.Message ID: @.***>

hpaulj commented 10 months ago

I think before someone proposes a push, we need a comprehensive list of errors that still exit.

`exit_on_error=False currently redirects the ArgumentError cases, which are tied to a specific argument. By default that error exits with a usage and argument specific message.

unknown arguments exit is produced by parse_args, and isn't tied to any one argument. parse_known_args has, and still is available for bypassing that. I'm not sure that needs any further change.

Mutually_exclusive_groups can produce errors; I don't know if exit_on_error addresses those

There's also a test for required_arguments. That can affect multiple arguments, so it isn't an ArgumentError.

I'm working here from memory, so there may be others.

On Sun, Nov 5, 2023 at 2:07 PM paulj @.***> wrote:

parse_known_args can be used to return the unknowns along with the parsed ones. This alternative was available long before this exit_on_error parameter was introduced.

The original, default behavior was exit_on_error=True. I doubt if there is much code that depends on the current incomplete implementation of the False alternative. The question is whether we can make the False coverage more complete without too much work, and without compromising the True case.

On Sun, Nov 5, 2023, 12:50 PM Alon Bar-Lev @.***> wrote:

Hi,

The exit_on_error=False is also used to ignore undefined arguments and return args collection of what argparse succeeded to parse. It should not raise exception as then the parse will not be able to return its value.

Regards,

— Reply to this email directly, view it on GitHub https://github.com/python/cpython/issues/103498#issuecomment-1793841938, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAITB6ADV42JLQ32ZLWITZDYC73YVAVCNFSM6AAAAAAW4SRXYCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTOOJTHA2DCOJTHA . You are receiving this because you commented.Message ID: @.***>

androidkh commented 9 months ago

Hi all,

the issue still exists for me in version 1.4.0. Error stack trace: ...

argparse.py:1829: in parse_args
    self.error(msg % ' '.join(argv))
argparse.py:2587: in error
    self.exit(2, _('%(prog)s: error: %(message)s\n') % args)
argparse.py:2574: in exit
    _sys.exit(status)

As I can see, on line 1857 it actually tries to verify 'if self.exit_on_error' and then use try..catch but in fact it doesn't reach that block as at line 1827 it goes to self.error that later leads to exit

hpaulj commented 9 months ago

The preceding line is

 msg = _('unrecognized arguments: %s')

This is the unrecognized arguments error that has been discussed in this thread. Contrary to what the docs imply, the 'don;t exit' parameter does not catch every kind of error. At the moment it just catches ArgumentError ones that are tied to one specific Argument.

Broadening its coverage has been discussed, but, as far as I know, not been implemented.

On Mon, Dec 11, 2023 at 9:50 AM Andrii @.***> wrote:

Hi all,

the issue still exists for me in version 1.3.0. Error stack trace: ...

argparse.py:1829: in parseargs self.error(msg % ' '.join(argv)) argparse.py:2587: in error self.exit(2, ('%(prog)s: error: %(message)s\n') % args) argparse.py:2574: in exit _sys.exit(status)

As I can see, on line 1857 it actually tries to verify 'if self.exit_on_error' and then use try..catch but in fact it doesn't reach that block as at line 1827 it goes to self.error that later leads to exit

— Reply to this email directly, view it on GitHub https://github.com/python/cpython/issues/103498#issuecomment-1850579307, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAITB6FL6VKD6IJCNZWKVP3YI5BVHAVCNFSM6AAAAAAW4SRXYCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQNJQGU3TSMZQG4 . You are receiving this because you commented.Message ID: @.***>

olgarithms commented 4 months ago

@ericvsmith looking into this!

serhiy-storchaka commented 2 months ago

I wrote #121056 before discovering this issue. It fixes the issue in the way opposite to #103519: internal error() calls are replaced with raising ArgumentError. It is then caught and passed to error() at the higher level. Basically, it implements the idea proposed in https://github.com/python/cpython/issues/103498#issuecomment-1535118409.

We now have 3 duplicate issues with 4 open PRs (and one incomplete PR was already merged). It is time to finish this.