Closed oscarbenjamin closed 2 years ago
@pablogsal this sounds like a release blocker. Also cc @markshannon.
Ouch. Looking at it now (@markshannon is at EuroPython and may not be 100% responsive).
I'm able to get a consistent reproducer without based on the original example without SymPy. I'm currently minimizing it further.
Minimal reproducer that fails consistently on main
:
class Metaclass(type):
@property
def attribute(self):
return True
class Class(metaclass=Metaclass):
attribute = False
for _ in range(8):
print(Class.attribute)
It prints:
True
True
True
True
True
True
True
False
This failure occurs immediately after specializing the LOAD_ATTR_ADAPTIVE
into LOAD_ATTR_CLASS
. That specialization is incorrect.
My gut says we should fail to perform this specialization in the presence of a metaclass as a quick fix. Not sure what makes the most sense long term.
The following patch fixes the reproducer:
diff --git a/Python/specialize.c b/Python/specialize.c
index 66cae44fa8..72667c4905 100644
--- a/Python/specialize.c
+++ b/Python/specialize.c
@@ -947,6 +947,10 @@ specialize_class_load_attr(PyObject *owner, _Py_CODEUNIT *instr,
_PyLoadMethodCache *cache = (_PyLoadMethodCache *)(instr + 1);
PyObject *descr = NULL;
DescriptorClassification kind = 0;
+ if (_PyType_Lookup(Py_TYPE(owner), name)) {
+ SPECIALIZATION_FAIL(LOAD_ATTR, SPEC_FAIL_ATTR_METACLASS_ATTRIBUTE);
+ return -1;
+ }
kind = analyze_descriptor((PyTypeObject *)owner, name, &descr, 0);
switch (kind) {
case METHOD:
@@ -957,12 +961,7 @@ specialize_class_load_attr(PyObject *owner, _Py_CODEUNIT *instr,
return 0;
#ifdef Py_STATS
case ABSENT:
- if (_PyType_Lookup(Py_TYPE(owner), name) != NULL) {
- SPECIALIZATION_FAIL(LOAD_ATTR, SPEC_FAIL_ATTR_METACLASS_ATTRIBUTE);
- }
- else {
- SPECIALIZATION_FAIL(LOAD_ATTR, SPEC_FAIL_EXPECTED_ERROR);
- }
+ SPECIALIZATION_FAIL(LOAD_ATTR, SPEC_FAIL_EXPECTED_ERROR);
return -1;
#endif
default:
I'll just need some time with this to make sure it's both correct and not overly-restrictive. I'd also appreciate a second (or third) pair of eyes on this, if anybody has the time.
That patch is still buggy in other ways. Adding or changing properties on the metaclass doesn't invalidate the caches:
class M(type):
pass
class C(metaclass=M):
a = False
@property
def a(self):
return True
for i in range(9):
if i == 8:
M.a = a
print(C.a)
That should print True
on the last iteration, but doesn't.
I'm thinking that, for 3.11 at least, we should probably just bail on caching class attributes if a metaclass is involved. I wonder if attribute assignment is affected as well.
The bigger issue is that changes to metaclasses don't invalidate tp_version_tag
s of classes that use them. So it seems that we have two choices for 3.12: either don't cache class attributes when a metaclass is involved, or also cache the metaclass tp_version_tag
(maybe in _PyLoadMethodCache.keys_version
?) and check that each time.
I'd love to hear others' thoughts. @Fidget-Spinner, I think you've worked on this code in the past?
Argh, it's not enough to ignore metaclasses. type
itself has descriptors that can trigger this; any class that has a conflicting attribute name is susceptible.
New minimal reproducer:
class C:
__name__ = "Spam"
for _ in range(8):
print(C.__name__)
@brandtbucher yeah I implemented this opcode I think. I'm at Europython too so I'll try to take a look tomorrow.
BTW, LOAD_ATTR_CLASS
is only in 3.12. In 3.11 it's LOAD_METHOD_CLASS
and attributes arent affected.
BTW,
LOAD_ATTR_CLASS
is only in 3.12. In 3.11 it'sLOAD_METHOD_CLASS
and attributes arent affected.
Dang, okay, that'll make the backport tricker. Thanks for letting me know. Do we not cache class attributes in 3.11? If so, how is it different from main
?
I have a branch that fixes main
by caching the metaclass version too (https://github.com/python/cpython/compare/main...brandtbucher:metaclasses-and-descriptors-are-my-favorite). However, I found another bug while working on it: __getattribute__
methods on metaclasses aren't respected either.
Dang, okay, that'll make the backport tricker. Thanks for letting me know. Do we not cache class attributes in 3.11? If so, how is it different from
main
?
Yes we don't specialize for class attributes in 3.11. Main merged method and attribute loading opcodes (ie LOAD_METHOD
is gone). Thats why on main it supports both attribute and method caching.
The fixes on your branch LGTM. What's stopping you from opening a PR? (Also I like the branch name :).
What's stopping you from opening a PR?
Just making sure today that there's a reasonable way to backport the metaclass versioning to 3.11. Also seeing if I can incorporate a fix for the __getattribute__
issue I found in the same PR. I'll have something up today either way.
https://github.com/python/cpython/pull/94980 landed on the 3.11
branch - it should be possible to remove the release blocker now?
I can confirm that SymPy CI is all green now with CPython 3.11.0b5.
I can confirm that SymPy CI is all green now with CPython 3.11.0b5.
Okay, closing. Feel free to reopen if required.
Bug report
This comes from a SymPy issue: https://github.com/sympy/sympy/issues/23774
It looks like there is a nondeterministic error in the optimisation introduced in 96346cb6d0593ef9ec122614347ccb053cd63433 from #27722 first included in CPython version 3.11.0a1. The optimisation speeds up calling methods but sometimes retrieves the wrong attribute from a class whose metaclass defines a property method.
The way that this arises is quite sensitive to small changes so I haven't been able to distil a standalone reproducer. I'll show how to reproduce this using SymPy below but first this is a simplified schematic of the situation:
Here
B
is a subclass ofA
but an instance ofMetaB
. Both definemethod
but theMetaB
method should be used when accessed from the classB
rather than an instanceB()
. The failure seen in SymPy is that sometimesB.method(1)
will execute asA.method(1)
which fails because of the missingself
argument.The following code reproduces the problem and fails something like 50% of the time with SymPy 1.10.1 and CPython 3.11.0a1-3.11.0b4:
The failure is not deterministic:
The failure can be reproduced deterministically by setting the hash seed:
In the debugger the same code that already failed succeeds:
The actual arrangement of SymPy classes is something like this:
Your environment