python / cpython

The Python programming language
https://www.python.org
Other
63.81k stars 30.55k forks source link

inlined comprehension implementation in symtable - missing test or redundant code #122560

Closed iritkatriel closed 4 months ago

iritkatriel commented 4 months ago

No test fails if I remove the symtable code that modifies the symtable for an inlined comprehension: https://github.com/python/cpython/pull/122557

I don't know whether this step is necessary. Would suggest we find a test covering it or remove it.

Linked PRs

iritkatriel commented 4 months ago

CC @carljm @JelleZijlstra .

devdanzin commented 4 months ago

It seems the PR causes a change in behavior, if I'm testing it right: test.py:

import _symtable
s = _symtable.symtable("[z for z in [x for x in range(10)]]", "", "exec")
print(s.children)

With PR applied:

> .\PCbuild\amd64\python.exe .\test.py
[<symtable entry listcomp(1601623904304), line 1>, <symtable entry listcomp(1601623905712), line 1>]

Without:

> .\PCbuild\amd64\python.exe .\test.py
[]
iritkatriel commented 4 months ago

@devdanzin I don't think the symtable content counts as "behaviour", it's an implementation detail.

JelleZijlstra commented 4 months ago

The symtable module is public, so I do think we should preserve this behavior. Your proposed change would make the structure of the symtable entries inconsistent with the actual behavior.

iritkatriel commented 4 months ago

Your proposed change would make the structure of the symtable entries inconsistent with the actual behavior.

You mean that it would imply that the list comprehension is in a different scope relative to the rest of the function?

JelleZijlstra commented 4 months ago

Yes, and identifiers are duplicated. With your PR:

>>> import symtable
>>> st=symtable.symtable("[x for x in [1]]", "x", "exec")
>>> st.get_identifiers()
dict_keys(['x'])
>>> st.get_children()[0].get_identifiers()
dict_keys(['.0', 'x'])
itamaro commented 4 months ago

for reference / convenience - the snippet of code in question was added in gh-101441 - the original implementation of PEP-709. seems like the issue here is missing test coverage.

iritkatriel commented 4 months ago

I agree that duplicating the identifiers in different scopes is wrong. So I will add a test to ensure this does not happen.

I don't think PEP states that the list comprehensions are in the enclosing scope (it actually says "In effect, comprehensions introduce a sub-scope where local variables are fully isolated, but without the performance cost or stack frame entry of a call.") so I think we should not add a test asserting that.

carljm commented 4 months ago

The meaning of "in effect... a sub-scope" leaves some clarity to be desired, since "sub-scope" is not a well-defined term in Python. In practice what it means is that the visible effect in terms of which values of variables are visible where is as if the comprehension were its own scope, but the implementation is actually all in a single scope. This implementation detail is clearly visible in various ways: at least via code objects, the symtable module, and tracebacks -- there may be others. I think the rest of the PEP text supports this reading (though unfortunately it doesn't comprehensively list every place where this is visible.)

In general, trying to hide the actual implementation and pretend that comprehensions are still a fully separate scope when people are poking around in "exposed internals" like code objects or the symbol table or frames is going to be a never-ending game of whackamole leading to more complexity, not less, so I don't recommend that we go down this path. (This was also the conclusion in the recent PEP 667 discussion around frames.)

All that is to say: I think it would be fine to write a test asserting outright that the comprehension variable appears as a local in the outer scope, because it is (for example, that's also how it appears in the code object). But I'm also fine with the test as written in the linked PR.

iritkatriel commented 4 months ago

I agree about not trying to pretend that the comprehensions are a separate scope. At the same time, I think we should not guarantee that they are not. Things like the structure of the symtable and the frame locals should not be seen as features we guarantee stability for (like bytecode).