Open markshannon opened 2 years ago
Are you working on this?
Are you working on this?
No. If you want to do this, just assign yourself. It might be worth waiting for 3.11 to calm down a bit first.
I imagine that this will piggyback off of https://github.com/python/cpython/pull/93988, since EnumType
defines __getattr__
?
__getattr__
is no longer necessary, and will be removed in 3.12 (since I forgot to do it in 3.11).
If
Color.RED
is not a descriptor when specialized, it will not be a descriptor when executed. Therefore, we can specialize it in the same way as immutable, non-descriptor class attributes usingLOAD_ATTR_CLASS
.
Even if Color.__dict__['RED']
is a descriptor, its result will still always be the member Color.RED
when called on the class. Is that optimizable?
In case it matters, the exact descriptor would be enum.property
-- the big difference between it and the standard property
is the behavior when called on the class -- either the matching enum member, or an AttributeError
.
Even if
Color.__dict__['RED']
is a descriptor, its result will still always be the memberColor.RED
when called on the class. Is that optimizable?
Interesting. Looking at it now, I'm not sure why LOAD_ATTR_CLASS
currently only handles methods and non-descriptors. As you note, it should be able to handle anything (at least anything without a descriptor in the metaclass, which we already guard against), including descriptors in the class __dict__
.
@Fidget-Spinner?
@brandtbucher it's probably an oversight in how I implemented it. I was being conservative at the time (also I admit just using the direct results from the analyze_descriptor
function for simplicity).
Some complications with specialising enums:
We probably need a specialisation for metaclasses, like what Brandt suggested in his other PR.
Inheriting from enums uses metaclasses. Which we (just) disabled.
All classes have a metaclass. The issue is whether the metaclass is mutable, has overrides, or shadows class attributes.
This is much the same as for normal instances and their classes, which is why we have analyze_descriptor
The enum metaclass is a mutable class.
This is the only problem here, AFAICT. The easy fix is to make EnumType
immutable. It doesn't need to be mutable.
A more general, and much more complex fix is to update the version number of all classes that are instances of a metaclass, when that metaclass changes.
The enum property is a descriptor.
Properties are descriptors. Which property are you talking about?
Properties are descriptors. Which property are you talking about?
This part is wrong. I initially assumed enum members defined __get__
and __set__
. I don't think they do.
This part is wrong. I initially assumed enum members defined
__get__
and__set__
. I don't think they do.
No, I think it's just name
and value
that do that. Still something to consider, though.
I was being conservative at the time (also I admit just using the direct results from the
analyze_descriptor
function for simplicity).
We're probably going to want to revisit this soon. Looking at it with fresh eyes, things like the GETSET_OVERRIDDEN
checks are incorrect in this context too.
This is the only problem here, AFAICT. The easy fix is to make
EnumType
immutable. It doesn't need to be mutable. A more general, and much more complex fix is to update the version number of all classes that are instances of a metaclass, when that metaclass changes.
It seems to me that a much simpler solution would just be to version the metaclass too (https://github.com/python/cpython/compare/main...brandtbucher:cpython:load-attr-metaclass).
It seems to me that a much simpler solution would just be to version the metaclass too
Simpler, but slower. It is probably the best thing to do for now, we can always do the more complex thing later.
The enum metaclass is a mutable class.
This is the only problem here, AFAICT. The easy fix is to make
EnumType
immutable. It doesn't need to be mutable. A more general, and much more complex fix is to update the version number of all classes that are instances of a metaclass, when that metaclass changes.
Is there anything I can do to make EnumType
immutable?
As far as I know, the only way is to set the right flags on the type at the C level. So, short of reimplementing EnumType
in C, there probably isn't a simple way to do it without hacks (like exposing a C function somewhere that you can decorate your type with).
So let's just version the metaclass for now. As soon as EnumType
removes its __getattr__
function (and we iron out some of the quirks of class attribute specialization), we'll be able to specialize Enum
(and many others, like sympy
) attribute accesses without resorting to hacks.
I'll open a PR in a bit.
This part is wrong. I initially assumed enum members defined
__get__
and__set__
. I don't think they do.No, I think it's just
name
andvalue
that do that. Still something to consider, though.
In EnumClass.__dict__
the member names, along with value
and name
and any other enum.property
s assigned by the user, are descriptors. The members themselves are not.
Ah, thanks.
Also, I was wrong earlier: I totally forgot that descriptor __get__
methods still need to be called (without an instance) on class attribute accesses too. So it's not safe to cache any descriptors here, as I previously thought. :(
In
EnumClass.__dict__
the member names ... are descriptors.
At least they will be once the performance is improved (which I believe it is even now in 3.12) -- in 3.11 they are only descriptors if the enum class defines a member which matches a property in the inheritance chain:
class ConfusingEnumBase(Enum):
@enum.property
def attribute(self):
return 'example'
class ConfusingEnum(ConfusingEnumBase):
cls = 1
decorator = 2
attribute = 3
In the above, the cls
and decorator
members live directly in ConfusingEnum.__dict__
, while attribute
is a descriptor in ConfusingEnum.__dict__
.
Also, I was wrong earlier: I totally forgot that descriptor
__get__
methods still need to be called (without an instance) on class attribute accesses too. So it's not safe to cache any descriptors here, as I previously thought. :(
Even if the result of calling __get__
on the class never changes? Is there a C hack to mark such things as immutable so it only needs to be called once and then the result cached?
Even if the result of calling
__get__
on the class never changes? Is there a C hack to mark such things as immutable so it only needs to be called once and then the result cached?
We don't cache the results of function calls anywhere, since in general that's an extremely fragile semantic break. We do things like caching the descriptors themselves, but not the result of calling any Python code. It also puts us in a really nasty situation that we've so far avoided: storing strong references in the inline caches.
Realistically, the best we can probably do without adding caches is caching the descriptor and inlining the __get__
call. But I'm not sure if that will make enough difference to matter (and we still won't have __getattr__
handling until something like #93988 is in).
I'm not in an environment where I can research this so I'll just ask: is the enum specialization being discussed here limited only to stdlib enums? I ask because I've been thinking about resurrecting my flufl.enum
package (a precursor to the stdlib library). I'd promote it as a limited, simplified enum library. It might not get much use and I'm not sure whether it would even benefit from enum-specific specialization, but I'd like to know if this is a general specification or something that only benefits the stdlib package (not that that's a bad thing).
is the enum specialization being discussed here limited only to stdlib enums
Most likely not. The specializations work on things with the same "shape", "type" or belonging in the same "family" (behavior-wise). So anything specializing stdlib enums would also work for anything that has the same characteristics (ie, mutable metaclass, attribute belonging to a class).
If your enum implementation is simpler, we likely already specialize for it. If you want to check the specialization works, run your enums in a loopy function then pass that to dis.dis(f, adaptive=True)
in 3.12. If you see weird LOAD_ATTR
opcodes (other than LOAD_ATTR_ADAPTIVE
. It means it worked.
Is it currently possible in a satisfying way to make EnumType immutable? Unfortunately, in the current situation, implementing mixins which are inherited from mutable objects has no sense and implementing immutability on the user's side is very hard (if at all possible).
The above comment is based on this Stackoverflow question. As I mention there, the immutability of EnumType
has no bearing on enum members being mutable (and it's the mutability of the members that is the problem).
Currently we don't specialize loading of enum attributes.
We can take advantage of the fact that
type(Color.RED) == Color
when specializing. Since we check the version number ofColor
inLOAD_ATTR_CLASS
, we are implicitly checking the version oftype(Color.RED)
. IfColor.RED
is not a descriptor when specialized, it will not be a descriptor when executed. Therefore, we can specialize it in the same way as immutable, non-descriptor class attributes usingLOAD_ATTR_CLASS
.@Fidget-Spinner @brandtbucher