python / cpython

The Python programming language
https://www.python.org/
Other
61.16k stars 29.52k forks source link

Unhelpful UnboundLocalError due to del'ing of exception target #61992

Closed warsaw closed 8 months ago

warsaw commented 11 years ago
BPO 17792
Nosy @warsaw, @ncoghlan, @tiran, @benjaminp, @ezio-melotti, @bitdancer, @pmp-p, @ericsnowcurrently, @serhiy-storchaka, @phmc, @MojoVampire, @ilevkivskyi, @iritkatriel, @aaronwsmith
PRs
  • python/cpython#24976
  • Files
  • issue17792.diff
  • issue17792-2.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = None closed_at = None created_at = labels = ['expert-tkinter', 'type-crash', '3.11'] title = "Unhelpful UnboundLocalError due to del'ing of exception target" updated_at = user = 'https://github.com/warsaw' ``` bugs.python.org fields: ```python activity = actor = 'sppraveenkumarinfo' assignee = 'none' closed = False closed_date = None closer = None components = ['Tkinter'] creation = creator = 'barry' dependencies = [] files = ['29936', '29989'] hgrepos = [] issue_num = 17792 keywords = ['patch'] message_count = 21.0 messages = ['187313', '187315', '187319', '187320', '187322', '187326', '187327', '187328', '187339', '187342', '187382', '187433', '187627', '187639', '187640', '187645', '389264', '394902', '394903', '402980', '402992'] nosy_count = 15.0 nosy_names = ['barry', 'ncoghlan', 'christian.heimes', 'benjamin.peterson', 'ezio.melotti', 'Arfrever', 'r.david.murray', 'pmpp', 'eric.snow', 'serhiy.storchaka', 'pconnell', 'josh.r', 'levkivskyi', 'iritkatriel', 'aaronwsmith'] pr_nums = ['24976'] priority = 'normal' resolution = None stage = 'patch review' status = 'open' superseder = None type = 'crash' url = 'https://bugs.python.org/issue17792' versions = ['Python 3.11'] ```

    warsaw commented 11 years ago

    As described here:

    http://www.wefearchange.org/2013/04/python-3-language-gotcha-and-short.html

    the following code will produce an UnboundLocalError when the exception is triggered:

    def bad():
        e = None
        try:
            do_something()
        except KeyError as e:
            print('ke')
        except ValueError as e:
            print('ve')
        print(e)

    The reason is that the exception handling machinery del's e from the local namespace in order to break circular references caused by traceback. The ULE is pretty mysterious and certainly not helpful for figuring out what's going wrong (the blog post above describes how long it took me to find it ;).

    Can we do better? What if instead of del'ing the target, we set it to None? We wouldn't get a ULE but the fact that print(e) would print None might be just as mysterious. Any other ideas?

    (BTW, there's likely nothing to be done for Python \< 3.4.)

    ezio-melotti commented 11 years ago

    Maybe we could raise a warning when the deleted name already exists in the local namespace?

    (FWIW I knew about the fact that the name gets deleted, and still managed to get bitten by it a couple of times.)

    warsaw commented 11 years ago

    On Apr 19, 2013, at 12:01 AM, Ezio Melotti wrote:

    Maybe we could raise a warning when the deleted name already exists in the local namespace?

    Ideally, I think a SyntaxError if you could detect a previously bound name in the namespace being used as an exception target.

    -Barry

    bitdancer commented 11 years ago

    And what if it weren't a print statement? An error is better than a "randomly" changed value, I think. I'm really not sure there is anything we can do here, beyond Ezio's warning suggestion.

    bitdancer commented 11 years ago

    This used to raise a SyntaxError. See bpo-4617.

    warsaw commented 11 years ago

    On Apr 19, 2013, at 12:37 AM, R. David Murray wrote:

    And what if it weren't a print statement? An error is better than a "randomly" changed value, I think. I'm really not sure there is anything we can do here, beyond Ezio's warning suggestion.

    Right. The point is, is it more helpful to have an unexpected value in the local (e.g. None, which in fact may not be *totally* unexpected) or an mystical exception? ;)

    A warning would be fine. I just want to give programmers some help in trying to figure out why their code is wrong.

    ncoghlan commented 11 years ago

    It seems to me that this kind of dubious-but-legal code is what http://docs.python.org/3/library/exceptions.html#SyntaxWarning is designed to handle.

    Something like "SyntaxWarning: implicitly deleted exception variable referenced after end of except clause".

    bitdancer commented 11 years ago

    It occurs to me that one of the problems here is that the UnboundLocalError only occurs when the except clause is executed. Do you think it would be helpful and acceptable (in 3.4, as it is a behavior change) to have the 'as' variable *always* deleted at the end of the try/except block? Then you would always get the UnboundLocalError, instead of it mysteriously appearing only when the except was executed.

    Granted it is a bit weird, but then so is the variable getting deleted in the first place.

    ncoghlan commented 11 years ago

    I don't think we want to mess with the control flow, as that would be even weirder and may interact strangely with else and finally clauses.

    A SyntaxWarning would be unconditional (so it doesn't matter whether or not the exception is triggered at run time), happens at compile time (so developers will see it), but won't be triggered when loading from a cached bytecode file (so end users typically won't see it).

    ezio-melotti commented 11 years ago

    Attached a sketchy proof of concept. It only checks for locals, but it should probably check for globals too. Does the approach make sense?

    ericsnowcurrently commented 11 years ago

    FWIW this has come up before:

    http://mail.python.org/pipermail/python-dev/2012-October/122504.html

    and relatedly:

    bpo-16429

    warsaw commented 11 years ago

    Ezio, the problem with your patch is that it also gives a warning on this code, which is totally safe:

    def good():
        exc = None
        try:
            bar(int(sys.argv[1]))
        except KeyError as e:
            print('ke')
            exc = e
        except ValueError as e:
            print('ve')
            exc = e
        print(exc)
    ezio-melotti commented 11 years ago

    Attached a new patch that improves the following things:

    There are still several problems though:

    While compiling I got this warning: ./setup.py:330: SyntaxWarning: "name 'why' is already defined but implicitly deleted after end of except clause" except ImportError as why:

    This comes from a code like: try: foo() except Exception as why: pass try: bar() except Exception as why: pass

    and in this case there shouldn't be any warning. A possible way to fix this is to keep a "whitelist" of locals that are first defined as an except target, but then it will still break if there's a why = ... between the two try/except and it's starting to get too complicated...

    serhiy-storchaka commented 11 years ago
    • Using PyUnicode_FromFormat() works, but then I don't know how to convert the result to const char*
    PyUnicode_AsUTF8()
    ezio-melotti commented 11 years ago

    I considered that, but is it guaranteed that the output encoding is UTF-8? What if the warning is printed on a Windows console that uses cp1252? I also considered encoding it and then use PyBytes_AsString, but I was hoping in something simpler (also I'm not sure where to get the right encoding).

    serhiy-storchaka commented 11 years ago

    I considered that, but is it guaranteed that the output encoding is UTF-8?

    PyErr_WarnExplicit() calls PyUnicode_FromString() which made an inverse decoding from UTF-8.

    iritkatriel commented 3 years ago

    I think the issue is that the error message for UnboundLocalError is wrong, see this example:

    >>> def g():
    ...    x = 42
    ...    del x
    ...    print(x)
    ...
    >>> g()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 4, in g
    UnboundLocalError: local variable 'x' referenced before assignment
    >>>

    How about we change it to "local variable 'x' referenced before assignment or after deletion"?

    iritkatriel commented 3 years ago

    New changeset 7b1f527d5b37dc3aa085ebbe11a1a2dd29ef210f by Irit Katriel in branch 'main': bpo-17792: more accurate error message for unbound variable access exceptions (GH-24976) https://github.com/python/cpython/commit/7b1f527d5b37dc3aa085ebbe11a1a2dd29ef210f

    iritkatriel commented 3 years ago

    Following a discussion on PR24976 we opted for

    "cannot access local variable 'x' where it is not associated with a value"

    This is because binding it not always related to assignment.

    I don't know whether this completely resolves this issue, or whether there is still something to change around the None-setting of except targets.

    8f246f51-6026-4475-811b-97ce21ff8ede commented 2 years ago

    I encountered the similar behavior unexpectedly when dealing with LEGB scope of names. Take the following example run under Python 3.9.2:

    def doSomething():
        x = 10
        del x
        print(x)
    
    x = 5
    doSomething()

    This produces a UnboundLocalError at print(x) even though "x" can still be found in the global scope. Indeed if your add print(globals()) before the print(x) line, you can see "x" listed.

    By contrast, LEGB scope behavior works as expected in this example:

    def doSomething():
        print(x)
    
    x = 5
    doSomething()

    The former example yielding the UnboundLocalError when dealing with name scope feels like a bug that lines up with the original behavior described in this enhancement request, as I believe "x" is still a bounded name in the global scope, but was explicitly deleted from the local scope.

    99ffcaa5-b43b-4e8e-a35e-9c890007b9cd commented 2 years ago

    Aaron: Your understanding of how LEGB works in Python is a little off.

    Locals are locals for the *entire scope of the function, bound or unbound; deleting them means they hold nothing (they're unbound) but del can't actually stop them from being locals. The choice of whether to look something up in the L, E or GB portions of LEGB scoping rules is a *static choice made when the function is defined, and is solely about whether they are assigned to anywhere in the function (without an explicit nonlocal/global statement to prevent them becoming locals as a result).

    Your second example can be made to fail just by adding a line after the print:

    def doSomething():
        print(x)
        x = 1

    and it fails for the same reason:

    def doSomething():
        x = 10
        del x
        print(x)

    fails; a local is a local from entry to exit in a function. Failure to assign to it for a while doesn't change that; it's a local because you assigned to it at least once, along at least one code path. del-ing it after assigning doesn't change that, because del doesn't get rid of locals, it just empties them. Imagine how complex the LOAD_FAST instruction would get if it needed to handle not just loading a local, but when the local wasn't bound, had to choose *dynamically* between:

    1. Raising UnboundLocalError (if the value is local, but was never assigned)
    2. Returning a closure scoped variable (if the value was local, but got del-ed, and a closure scope exists)
    3. Raising NameError (if the closure scope variable exists, but was never assigned)
    4. Returning a global/builtin variable (if there was no closure scope variable *or* the closure scope variable was created, but explicitly del-ed)
    5. Raising NameError (if no closure, global or builtin name exists)

    That's starting to stretch the definition of "fast" in LOAD_FAST. :-)

    serhiy-storchaka commented 8 months ago

    Is this issue already resolved?

    terryjreedy commented 8 months ago

    Yes, #24976 as described in https://github.com/python/cpython/issues/61992#issuecomment-1093614265 is the solution. Anyone wanting further change should open a new issue.