mkdocstrings / griffe

Signatures for entire Python programs. Extract the structure, the frame, the skeleton of your project, to generate API documentation or find breaking changes in your API.
https://mkdocstrings.github.io/griffe
ISC License
303 stars 41 forks source link

bug: Descriptors are not handled properly #328

Open pekkaklarck opened 4 hours ago

pekkaklarck commented 4 hours ago

If I use a descriptor as at the end of this description, there are two problems with how the generated documentation for Example.count looks like:

  1. There's no type information. It would be possible to get that from the __get__ method.
  2. The descriptor gets labels class-atribute and instance-atribute which are both somewhat wrong. Technically it is a class attribute, but so are methods and propertys (that are also implemented as descriptors) and they don't get such a label either. Technically it's not an instance attribute, but from the usage perspective it acts like one. I believe a proper label would be data-descriptor, but property would probably be more familiar for normal users and the functionality is the same.

You can see how the example is rendered here in a demo site that I've used for MkDocs experiments.

A variation of this is using a descriptor as a method decorator like the setter we use extensively with the Robot Framework project to avoid using normal propertys with dummy getters. Such attributes are currently shown as methods without any type information. You can see an example here in a site where we are creating our new Manual that will also incorporate API docs.

from typing import overload, Self

class PositiveInteger[T]:

    def __set_name__(self, owner: T, name: str):
        self.name = '_' + name

    @overload
    def __get__(self, instance: None, owner: type[T]) -> Self:
        ...

    @overload
    def __get__(self, instance: T, owner: type[T]) -> int:
        ...

    def __get__(self, instance, owner) -> int | Self:
        if instance is None:
            return self
        return getattr(instance, self.name)

    def __set__(self, instance: T, value: int):
        if value <= 0:
            raise ValueError(f'{value} is not positive')
        setattr(instance, self.name, value)

class Example:
    """Example using descriptor."""

    count = PositiveInteger()
    """Count as a positive integer."""

    def __init__(self, count: int = 1):
        self.count = count

I'm not sure should this have been submitted to griffe instead or should I have submitted separate issues about different problems. Feel free to move or split as needed.

pawamoy commented 3 hours ago

(I'll move this to Griffe)

Thanks for the report!

There's no type information. It would be possible to get that from the get method.

Hmmm. I don't think I want to enter this territory. Griffe has some capabilities to follow annotations, but inferring them is currently a non-goal. Even looking at the code, it took me a few seconds to know what the type would be (int, IIUC? I have to admit I'm not familiar with descriptors :sweat_smile:).

If count is therefore supposed to be an int, then I'd recommend annotating it as such. I know it's not ideal, when type-checkers are able to infer the type, but as mentioned on Gitter/Matrix, Griffe is not a type-checker and doesn't try to be one, so it expects that users provide the relevant information. Do you think that makes sense?

The descriptor gets labels class-atribute and instance-atribute [...] from the usage perspective it acts like one.

Griffe indeed focuses on the usage perspective, though it's true that sometimes it can provide info that is not relevant to usage (like, a writable property just becomes a regular attribute, by usage, so why showing the labels, right). In that case, it considers count to be both a class-attribute and instance-attribute because it's defined both at the level class and in the __init__ method. Since I'm not familiar with descriptors: can it actually be used both with Example.count and example.count? In any case, the object should definitely have a data-descriptor label, for exploitation purposes.

To support your setter, you'll have to write an extension :slightly_smiling_face:

pekkaklarck commented 1 hour ago

I don't think it would be unrealistic to get the correct type from the __get__ method. It would then work with all descriptors, including property that is a descriptor but many tools (including Griffe I assume) just handle is as a special construct. I have no idea how easy this would be, but I think it would be a good goal.

You can access descriptors both via the class and via the instance, but you get different results. In the former case you get the descriptor itself, but when you access it from the instance you get the value returned but __get__. Here's an example using the custom PositiveInteger description as well as a property and a method that are also descriptors:

>>> class Example:
...     d = PositiveInteger()
...     def __init__(self):
...         self.d = 1
...     @property
...     def p(self):
...         return 2
...     def m(self):
...         return 3
...         
>>> e = Example()
>>> e.d
1
>>> e.p
2
>>> e.m
<bound method Example.m of <__main__.Example object at 0x7f2780d8a3c0>>
>>> Example.d
<PositiveInteger object at 0x7f2780e29bd0>
>>> Example.p
<property object at 0x7f2780c81a80>
>>> Example.m
<function Example.m at 0x7f2780c7d8a0>

If you want to learn more about descriptors, I recommend reading the excellent Descriptor Guide. Descriptors aren't used that widely in normal code, but Python itself uses them extensively. I'm sure their usage gets more widespread in the future, though. For example, I read somewhere that SQLAlchemy has started to used from in its models.

I would expect that our setter could be handled as a normal descriptor without a dedicated extension. It's easy to recognize that an attribute is a descriptor with hasattr(attr, '__get__') and from there it ought to be possible to handle all of them the same way. I'm interested to look at that myself, but unfortunately I don't have time for that in the near future.

pekkaklarck commented 1 hour ago

Few more examples using the Example class in the above comment:

>>> Example.d.__get__(e, type(e))     # Standard descriptor call requires `instance` and `type`.
1
>>> Example.d.__get__(None, type(e))  # `None` as `instance` is same as accessing from class.
<descriptor.PositiveInteger object at 0x7f2780e29bd0>
>>> Example.p.__get__(e)              # With property `type` is optional.
2
>>> Example.m.__get__(e)()            # Somewhat strange way to call a method...
3
>>> from typing import get_type_hints
>>> get_type_hints(Example.d.__get__)
{'return': typing.Union[int, typing.Self]}

As the end of the example demonstrates, getting the type from __get__ is easy. There are two problems, but both ought to be relatively easy to handle:

pawamoy commented 32 minutes ago

It would then work with all descriptors, including property that is a descriptor

I would expect that our setter could be handled as a normal descriptor without a dedicated extension. It's easy to recognize that an attribute is a descriptor with hasattr(attr, 'get')

You seem to assume dynamic analysis here, with access to runtime objects. But by default, Griffe uses static analysis. I'm sure you understand it's another story :smile: Griffe also supports dynamic analysis though, but I'm not sure we explicitly support data descriptors (we do support method descriptors).