python / cpython

The Python programming language
https://www.python.org

Extract args from `operator.attrgetter("xxx")` #113322

Open · Conchylicultor opened 9 months ago

Conchylicultor commented 9 months ago

Feature or enhancement

Proposal:

Given an operator.attrgetter instance, I would like to recover which attributes will be accessed. Something like:

import operator

op = operator.attrgetter('a')

assert op.__args__ == 'a'  # << How to do this ?

op = operator.attrgetter('a', 'b')

assert op.__args__ == ('a', 'b')  # << How to do this ?

After inspecting the object, it doesn't seem possible to recover the arguments passed.

Current hacks include parsing repr(op) or applying op(my_dummy_obj) to a custom object that records its __getattr__ accesses.
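
A minimal sketch of the second hack, for illustration only. The names `_AttrRecorder` and `recorded_attrs` are hypothetical and not part of any library; the sketch assumes the getter only ever performs plain attribute lookups and that no requested attribute name starts with an underscore.

```python
import operator


class _AttrRecorder:
    """Hypothetical dummy object that records attribute lookups made on it."""

    def __init__(self, paths, current=None):
        self._paths = paths      # shared list of recorded paths
        self._current = current  # path being extended, or None for the root

    def __getattr__(self, name):
        # Only called for attributes that are not found normally.
        # Refusing underscore names avoids recursion on our own fields,
        # at the cost of not recording private attribute names.
        if name.startswith("_"):
            raise AttributeError(name)
        if self._current is None:
            # Lookup on the root object: a new top-level attribute.
            path = [name]
            self._paths.append(path)
        else:
            # Lookup on a child: one more segment of a dotted name.
            path = self._current
            path.append(name)
        return _AttrRecorder(self._paths, path)


def recorded_attrs(getter):
    """Call the getter on a recorder and return the names it looked up."""
    paths = []
    getter(_AttrRecorder(paths))
    return tuple(".".join(path) for path in paths)


print(recorded_attrs(operator.attrgetter("a")))         # ('a',)
print(recorded_attrs(operator.attrgetter("a", "b.c")))  # ('a', 'b.c')
```

Unlike the proposed `op.__args__`, this sketch always returns a tuple, even for a single attribute.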

Has this already been discussed elsewhere?

This is a minor feature, which does not need previous discussion elsewhere

Links to previous discussion of this feature:

No response

brianschubert commented 9 months ago

FWIW, it's currently possible to recover the args from an operator.attrgetter via its __reduce__ method:

>>> operator.attrgetter('a').__reduce__()
(<class 'operator.attrgetter'>, ('a',))

>>> operator.attrgetter('a', 'b').__reduce__()
(<class 'operator.attrgetter'>, ('a', 'b'))

Conchylicultor commented 9 months ago

Oh, that's what I was looking for. Seems a little hacky but fine enough. Sorry for the noise

TeamSpen210 commented 9 months ago

That does work, but __reduce__()'s result really isn't public. It could change at any time, so it'd be better to have an actual public API for this.

Eclips4 commented 9 months ago

> That does work, but __reduce__()'s result really isn't public. It could change at any time, so it'd be better to have an actual public API for this.

You're right. Let's reopen this then.

serhiy-storchaka commented 9 months ago

Why do you need this?

Conchylicultor commented 9 months ago

> Why do you need this?

I'm working on a config system based on https://github.com/google/ml_collections. They have a system of lazy references (FieldReference) to connect various parts of the config together (e.g. changing one value changes another value somewhere else). This is done by storing the ops so they can be applied later on. Like: https://github.com/google/ml_collections/blob/20f226ef4e671e0567ca6155dd99af361a4ddfd2/ml_collections/config_dict/config_dict.py#L412

I would like to serialize the FieldReference to JSON, so I need to somehow extract the ops from the FieldReference instances so that I can deserialize them later. Something like:

f1 = ml_collections.FieldReference()
f2 = f1.some_attribute

assert f2._ops == operator.attrgetter('some_attribute')  # illustrative only

assert serialize_ref(f2) == {
    'id': 1234,
    'ops': {
        'op_name': 'operator.attrgetter',
        'arg': 'some_attribute',
    },
}

I'm sure that if this were designed from scratch, there would be a better way to handle this, but I don't really have time to redesign the FieldReference system, so I'm just working with what I already have.

serhiy-storchaka commented 9 months ago

When you implement a more or less general serialization protocol, it is common to use parts of the pickle protocol. __reduce__() returns the constructing function and its arguments, which is exactly what you need. You can use it right now without waiting for 3.13.
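
A rough sketch of that idea, assuming the `(callable, args)` shape of `__reduce__()` shown earlier in this thread. The helper names `serialize_op` and `deserialize_op`, the `ALLOWED_FACTORIES` table, and the JSON layout are hypothetical choices made for illustration, not an existing API.

```python
import json
import operator
from types import SimpleNamespace

# Hypothetical allowlist: only these constructors may be rebuilt from JSON.
ALLOWED_FACTORIES = {
    "operator.attrgetter": operator.attrgetter,
}


def serialize_op(op):
    """Describe an op as a JSON-friendly dict based on op.__reduce__()."""
    reduced = op.__reduce__()
    factory, args = reduced[0], reduced[1]
    name = f"{factory.__module__}.{factory.__qualname__}"
    if name not in ALLOWED_FACTORIES:
        raise TypeError(f"unsupported op: {name}")
    return {"op_name": name, "args": list(args)}


def deserialize_op(data):
    """Rebuild the op by calling the allowlisted constructor with its args."""
    factory = ALLOWED_FACTORIES[data["op_name"]]
    return factory(*data["args"])


# Round trip through JSON text.
text = json.dumps(serialize_op(operator.attrgetter("some_attribute")))
rebuilt = deserialize_op(json.loads(text))
print(rebuilt(SimpleNamespace(some_attribute=42)))  # 42
```

The allowlist keeps deserialization from calling arbitrary constructors named in untrusted input.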

Now, if in the future we add support for keyword arguments to attrgetter, the result of __reduce__() will change, and code that uses it will fail immediately (if it is strict enough). But if you use custom code that saves only the args attribute, you will miss the additional kwargs attribute (or whatever else may be added) and produce incorrect output that yields the wrong object when deserialized. And you may not notice it for a long time if it only occurs in rare circumstances.

So, while using __reduce__() can be a fragile solution, it can also be more reliable because it will fail faster.
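
One way to read "strict enough" is sketched below: the helper accepts only the exact two-element shape that `__reduce__()` returns today and raises on anything else. The name `attrgetter_args` and the specific checks are assumptions made for illustration.

```python
import operator


def attrgetter_args(op):
    """Return the attribute names of an attrgetter, failing loudly on surprises."""
    reduced = op.__reduce__()
    # Expect exactly (constructor, args); any extra element (state, kwargs, ...)
    # means the format changed and this code needs to be revisited.
    if not (isinstance(reduced, tuple) and len(reduced) == 2):
        raise TypeError(f"unexpected __reduce__() result: {reduced!r}")
    factory, args = reduced
    if factory is not operator.attrgetter:
        raise TypeError(f"not an attrgetter: {op!r}")
    if not (isinstance(args, tuple) and all(isinstance(a, str) for a in args)):
        raise TypeError(f"unexpected attrgetter arguments: {args!r}")
    return args


print(attrgetter_args(operator.attrgetter("a", "b")))  # ('a', 'b')
```

Passing another kind of object, such as `operator.itemgetter(0)`, raises `TypeError` instead of silently producing wrong data.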

encukou commented 8 months ago

I support this feature. Historically, Python's objects are very introspectable, which is great for debugging. Yes, future features make past introspection data incomplete, but I don't think that's a terribly big price to pay. Let's name it _attrs, like in the Python version, to emphasize it is meant for debugging.

Question is: does anyone want to hack on the C code? :)