python / cpython

The Python programming language
https://www.python.org
Other
61.8k stars 29.71k forks source link

isinstance is calling ob.__getattribute__ as a fallback instead of object.__class__.__get__ #76864

Open df79943f-4aee-4531-a00d-c6b12816eb70 opened 6 years ago

df79943f-4aee-4531-a00d-c6b12816eb70 commented 6 years ago
BPO 32683
Nosy @rhettinger, @pfmoore, @stevendaprano, @bitdancer, @ethanfurman, @mr-nfamous, @P403n1x87

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields: ```python assignee = None closed_at = None created_at = labels = ['interpreter-core', '3.7', 'library'] title = 'isinstance is calling ob.__getattribute__ as a fallback instead of object.__class__.__get__' updated_at = user = 'https://github.com/mr-nfamous' ``` bugs.python.org fields: ```python activity = actor = 'steven.daprano' assignee = 'none' closed = False closed_date = None closer = None components = ['Interpreter Core', 'Library (Lib)'] creation = creator = 'bup' dependencies = [] files = [] hgrepos = [] issue_num = 32683 keywords = [] message_count = 16.0 messages = ['310793', '310814', '310823', '408122', '408128', '408131', '408133', '408135', '408136', '408141', '408165', '408170', '408171', '408172', '408188', '408192'] nosy_count = 7.0 nosy_names = ['rhettinger', 'paul.moore', 'steven.daprano', 'r.david.murray', 'ethan.furman', 'bup', 'Gabriele Tornetta'] pr_nums = [] priority = 'normal' resolution = None stage = None status = 'open' superseder = None type = None url = 'https://bugs.python.org/issue32683' versions = ['Python 3.7'] ```

df79943f-4aee-4531-a00d-c6b12816eb70 commented 6 years ago

It doesn't even care if the result it gets from ob.__getattribute("__class") isn't a type.

Surely isinstance is intended to do what it is named and be guaranteed to return True if an object ob is an instance of class cls. There are already ways to trick isinstance into thinking objects are instances of types not present in its mro, ie with abstract base classes. All this does is introduce the possibility of bugs.

inspect.isclass in particular is affected. If there's some obscure use case for breaking isinstance in this manner, surely it shouldn't apply to inspect.isclass which according to its documentation, "Return true if the object is a class."

from inspect import isclass
class Liar:
    def __getattribute__(self, attr):
        if attr == '__class__':
            return type
        return object.__getattribute__(self, attr)
>>> islcass(Liar())
True
bitdancer commented 6 years ago

I don't think this is a bug. There are many ways to lie in Python. If your object lies, it is on your head when things break :) On the flip side, the ability to lie is very handy in many circumstances, and is often a case of duck typing rather than lying. Unless I'm missing something, __getattribute__ is how attributes are accessed, so there would have to be a strong reason to deviate from that.

stevendaprano commented 6 years ago

Dan, I don't understand what you think the code snippet shows: you're calling isclass on an object which *actually is a class* and show that it returns True. What did you expect it to return? How does the code snippet you give demonstrate a problem?

You say:

"isinstance is calling ob.__getattribute as a fallback instead of object.__class.__get__"

but why do you think that is wrong, and what makes you think that object.__class.__get is the right way to do it? You seem to think it is self-evident, but it isn't (at least not to me).

"It doesn't even care if the result it gets from ob.__getattribute("__class") isn't a type."

Should it care?

c9f73e94-5cae-4a7c-81f2-f7364bbee5df commented 2 years ago

The following example shows isinstance causing a side effect

class Side:

    class Effect(Exception):
        pass

    def __getattribute__(self, name):
        raise Side.Effect()

isinstance(Side(), str)

I'd be inclined to see this as a bug as I wouldn't expect isinstance to cause any side effects.

stevendaprano commented 2 years ago

I'd be inclined to see this as a bug in your code, if you are causing side-effects from __getattribute__. If you don't want attribute access to cause side-effects, then don't put code in __getattribute__ that causes side-effects :-)

isinstance has to check the object's __class__, if it has one. To do that it has to look at obj.class, which your class intercepts using __getattribute__ and causes a side-effect.

Overloading __getattribute__ is perilous, because it intercepts all instance attribute lookups. In my opinion, one should (almost?) never overload the __getattribute__ method, it is safer to overload __getattr__.

In any case, I don't think there is anything to fix here. Dan has not responded since his initial bug report nearly four years ago, so I'm inclined to close this as "Not A Bug". Unless somebody gives a more convincing reason why the current behaviour is wrong, I think we should close it.

c9f73e94-5cae-4a7c-81f2-f7364bbee5df commented 2 years ago

I think the issue on display here is that isinstance could cause a side effect, which I dare say it's unexpected (and not documented AFAIK). Are there any reasons why __class cannot be retrieved with object.__getattribute instead? In fact, considering that __getattribute__ could be overridden, this would be the right thing to do imho, else isinstance would break spectacularly, like in my previous example.

stevendaprano commented 2 years ago

If you don't want to customise attribute access, don't overload __getattribute__. The documentation for __getattribute__ is clear about what it does:

"Called unconditionally to implement attribute accesses for instances of the class."

https://docs.python.org/3/reference/datamodel.html#object.\_\_getattribute__

That includes lookups for __class__, regardless of whether the function doing the lookup is isinstance or some other function.

You ask:

"Are there any reasons why __class cannot be retrieved with object.__getattribute instead?"

Yes, that would ignore overloaded attribute access, which would be a bug in my opinion.

Being able to dynamically generate computed attributes by overloading __getattr__ and __getattribute__ is not a bug. It is part of Python's data model. See the documentation above.

And that includes __class__.

I don't believe that it is an accident or a bug that isinstance uses ordinary attribute lookup, which goes through the standard mechanism including the class' own __getattribute__ method. But I will ask on the Python-Dev mailing list in case Guido or other core developers think that it is wrong to do so.

pfmoore commented 2 years ago

I tend to agree with Steven and David here. You define __getattribute and so that's the behaviour you get when an attribute of the class is requested (whether by the system or by your code). The documentation (here: https://docs.python.org/3/reference/datamodel.html#object.\_\_getattribute) seems to support this view as well.

Do you have a real-world example of code that is broken by this behaviour, or is this just a theoretical problem? Is it particularly hard to make the code work the way you want it to with the current behaviour? For example,

    # Don't intercept __class__
    if attr == "__class__":
        return object.__getattribute__(self, attr)
c9f73e94-5cae-4a7c-81f2-f7364bbee5df commented 2 years ago

I tend to agree with Steven and David here. You define __getattribute__ and so that's the behaviour you get when an attribute of the class is requested (whether by the system or by your code).

I would agree if it was obvious or explicitly stated that isinstance looks up __class using the object's __getattribute, and thus prone to cause side-effects. It wasn't obvious to me that isinstance would access any attributes from the object, but that it would rather get the object's type in a more direct way (at the C level for CPython).

Do you have a real-world example of code that is broken by this behaviour, or is this just a theoretical problem?

I was looking at some profiling data for a real-life project (observability) that I'm working on. I was using a simple Django application as a target and noticed calls to a __getattribute (implemented by a Django class) that I wasn't expecting. All my code was doing was to check that a certain object was not of "callable" type using isinstance. Whilst I believe no side effects were caused by this particular instance of __getattribute, it should be noted that "Django" is a variable here: any other target framework or library might have caused side effects in theory. The code that I want to execute must have guarantees of being side-effects-free, so using isinstance does not give me that. Currently, I have worked around this by using a custom, pure Python implementation of isinstance that gets __mro (via object.__getattribute) and checks that type(obj) is an element of it (plus caching, for performance reasons).

ethanfurman commented 2 years ago
$ python3
Python 3.8.10 (default, Sep 28 2021, 16:10:42) 
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> object
<class 'object'>

>>> import builtins
>>> builtins.object
<class 'object'>

>>> builtins.object = int
>>> object
<class 'int'>

Python is very much a language about responsibility. If Django is overriding __getattribute__ then it is their responsibility to ensure that everything still works properly. If something doesn't, we file a bug report and/or implement a work-around.

As for side-effect free -- I'm not sure than anything in Python is guaranteed to be side-effect free, except maybe is.

There is no bug here, everything is working as intended.

c9f73e94-5cae-4a7c-81f2-f7364bbee5df commented 2 years ago

Python is very much a language about responsibility. If Django is overriding __getattribute__ then it is their responsibility to ensure that everything still works properly.

Perhaps I didn't stress observability enough. A tool like a tracer or a profiler, in the ideal world, is not supposed to perturb the tracee in any way. The current implementation of isinstance, when used by an observability tool, may cause side effects as well as overheads, so it is not safe to use in such tools. Working around it may solve the side-effects issue, but leaves the problem of overheads. Changing the way isinstance works internally might prove beneficial for such tools.

En passant, I think it should be noted that the built-in "type" doesn't suffer from the same problem, i.e.

class Side(object):
    def \_\_getattribute__(self, name):
        ValueError(name)

```py
type(Side())


works as expected.
rhettinger commented 2 years ago

Changing the way isinstance works internally might prove beneficial for such tools.

ISTM you're advocating a design change rather than discussing a bug report. The python-ideas mail list would be a better forum than the tracker.

stevendaprano commented 2 years ago

I agree that the rules regarding type and __class__ and isinstance are not clear or well documented.

We have:

https://docs.python.org/3/library/stdtypes.html#instance.\_\_class__

https://docs.python.org/3/library/functions.html#isinstance

but neither discuss the interaction between a class' "real" type and it's "fake" type.

To be honest, I'm not even sure I fully understand the interaction myself, or how they combine with virtual subclasses. A lot of my information comes from this Stackoverflow post:

https://stackoverflow.com/questions/1060499/difference-between-typeobj-and-obj-class

and in particular this comment:

"Some code does this deliberately to lie about the type of objects, such as weakref.proxy. Some people think obj.__class__ should be preferred, because it believes the lie, while some people think type(obj) should be preferred because it ignores the lie. isinstance will return true if an object's real class or its lie class is an instance of the second argument."

So I think that the behaviour here is correct, but it is not documented well and we should fix that.

What I think happens:

And for the avoidance of doubt, a class is always considered a subclass of itself.

I'm really not sure about the interaction with virtual subclasses, I probably have that bit wrong. And it is not clear to me what happens if __class__ is a non-type object. It seems to be permitted.

Improving the documentation would be excellent.

stevendaprano commented 2 years ago

On Fri, Dec 10, 2021 at 12:03:15AM +0000, Gabriele N Tornetta wrote:

class Side(object): def __getattribute__(self, name): ValueError(name)

You forgot to actually *raise* the exception. That will just instantiate the ValueError instance, and then immediately garbage collect it.

In any case, type() and isinstance() do not work the same way. type() does not inspect the __class__ attribute.

c9f73e94-5cae-4a7c-81f2-f7364bbee5df commented 2 years ago

@rhettinger Apologies. You're totally right but I wasn't expecting this to become a lengthy conversation.

So, to get closer to the initial issue, I believe that this ticket could be closed provided that the documentation is improved by making developers aware of the potential side effects of isinstance. I was doing some more experiments and it looks like

def _isinstance(obj, classinfo):
    return issubclass(type(obj), classinfo)

might be considered a "side-effects-free" version of isinstance. So perhaps one thing to mention in the documentation is that isinstance(obj, classinfo) != issubclass(type(obj), classinfo) in general.

stevendaprano commented 2 years ago

The plot thickens.

I was wrong to say that type() always and only looks at the "real" underlying type of the instance. type() does look at __class__ as well. But only sometimes.

>>> class A:
...     __class__ = int
... 
>>> type(A())
<class '__main__.A'>
>>> a = A()
>>> a.__class__ = int
>>> type(a)
<class '__main__.A'>

So from A, we might think that type() ignores __class__

>>> class B:
...     pass
... 
>>> type(B())  # no __class__ to inspect
<class '__main__.B'>
>>> b = B()
>>> b.__class__ = int
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __class__ assignment only supported for mutable types or ModuleType subclasses

How very odd. But okay, let's try something else:

>>> b.__class__ = A  # lie that this is an A not a B
>>> type(b)
<class '__main__.A'>

So now type() *does* inspect __class__, and believes it.

If we generate the __class attribute dynamically with __getattr, the method doesn't even get called. But if we use __getattribute__:

>>> class C:
...     def __getattribute__(self, name):
...             print('C getattribute:', name)
...             if name == '__class__':
...                     return int
...             raise AttributeError
... 
>>> C().__class__
C getattribute: __class__
<class 'int'>
>>> type(C())
<class '__main__.C'>

type() ignores the dynamically generated __class__. But isinstance does not:

>>> isinstance(C(), int)
C getattribute: __class__
True

The same applies if we use property:

>>> class D:
...     @property
...     def __class__(self):
...             return int
... 
>>> type(D())
<class '__main__.D'>
>>> isinstance(D(), int)
True

So the rules appear to be: