python-attrs / cattrs

Composable custom class converters for attrs, dataclasses and friends.
https://catt.rs
MIT License
819 stars 113 forks source link

GenConverter's dict_factory argument has no effect #228

Open brakhane opened 2 years ago

brakhane commented 2 years ago

Description

I was experimenting with ways to modify the unstructured data, and noticed that GenConverter accepts the dict_factory argument, and also passes it to Converter, but effectively ignores it since the generated code hardcodes a dict.

For example

import attrs, cattr

@attrs.define
class Example:
    a: int
    b: int

class ExampleDict(dict):
    def __repr__(self):
        return f"ExampleDict({super().__repr__()})"

    def __setitem__(self, key, value):
        if abs(value) > 1<<52:
            value = str(value)
        super().__setitem__(key, value)

foo = Example(42, 1189998819991197253)

print(cattr.Converter(dict_factory=ExampleDict).unstructure(foo))
# ExampleDict({'a': 42, 'b': '1189998819991197253'})

print(cattr.GenConverter(dict_factory=ExampleDict).unstructure(foo))
# {'a': 42, 'b': 1189998819991197253}

GenConverter returns a dict (and therefore the unchanged values), while Converter returns an instance of ExampleDict, as expected.

Tinche commented 2 years ago

Yeah, that argument is ancient; I think it was added in the first version to mirror attr.asdict. Back then I think it was used to be able to use an OrderedDict, but since nowadays dicts are ordered by default I can't think of a real use case.

brakhane commented 2 years ago

I can't think of a real use case.

The reason why I tried this was because I need some way to programmatically omit some values. The API I communicate with makes a distinction between a value being None/null, or not being there in the JSON representation.

I can somewhat get around it by having the sentinel "Omit" value being the default value and use "omit_if_default", but that would prevent me from assigning other default values I might want to define in some cases, and enable the user of my API to specify "instead of the default value, I want to omit this value in this case". But I can work with "omit_if_default" for the time being.

Another thing I need to do is convert int with an absolute values larger than 1<<52 to strings, because JS can't handle larger integers. Edit: Never mind, this can be done with unstructure hooks.

Of course, both these things can be done by postprocessing the unstructured data, but using a modified dict that does the conversion in place could have been a hacky, but working and slightly faster solution.

That being said, I'd prefer to have something like a "omit_if" filter as well as some way to change the behavior of unstructuring based on value, not type. But if that is too niche, making dict_factory work could enable an (admittedly ugly) workaround.

Tinche commented 2 years ago

Hm, all right. I'm still trying to figure out if the dict factory argument is worthwhile and should be restored or should be deprecated. So let's see if we can cover your use case in other ways.

To address your point directly: you can wrap the resulting dict in anything you like by overriding the attrs unstructure hook factory:

from collections import OrderedDict

from attrs import define, has

from cattrs import GenConverter
from cattrs.gen import make_dict_unstructure_fn

converter = GenConverter()

def custom_dict_factory(cls):
    unstructure_fn = make_dict_unstructure_fn(cls, converter)
    return lambda inst: OrderedDict(unstructure_fn(inst))

converter.register_unstructure_hook_factory(has, custom_dict_factory)

@define
class Test:
    a: int

print(converter.unstructure(Test(1)))
>>> OrderedDict([('a', 1)])

You can even just do post-processing in this hook, you don't actually have to change the type.

Now, programmatically omitting some values can also be done by overriding the unstructure hook factory. Can you give me an example of how you signal that some fields need to be deleted from the resulting dictionary?

For example, here's an example of an unstructure hook factory that omits fields if they start with an underscore:

converter = GenConverter()

def omit_fields_factory(cls):
    overrides = {
        a.name: override(omit=True)
        for a in fields(cls)
        if a.name.startswith("_")
    }
    unstructure_fn = make_dict_unstructure_fn(cls, converter, **overrides)
    return unstructure_fn

converter.register_unstructure_hook_factory(has, omit_fields_factory)

@define
class Test:
    a: int
    _b: int

print(converter.unstructure(Test(1, 2)))

Does this get you close?

brakhane commented 2 years ago

Hm, all right. I'm still trying to figure out if the dict factory argument is worthwhile and should be restored or should be deprecated.

I don't think it's worthwhile to keep it. I was just trying it because it was there, and it was pretty hacky.

To address your point directly: you can wrap the resulting dict in anything you like by overriding the attrs unstructure hook factory:

Yes, but this would mean creating the dictionary, and then wrapping/transforming it afterwards, which I tried to avoid. But it's no big deal.

Can you give me an example of how you signal that some fields need to be deleted from the resulting dictionary?

The fields need to be omitted based on their value. I could get it done by something like this:

from typing import TypeVar, Union

class VoidType(): pass # real implementation is close to attrs_NothingType
Void = VoidType()

T = TypeVar("T", covariant=True)
Voidable = Union[VoidType, T]

@attrs.define
class Test:
    a: Voidable[int] = Void
    b: Voidable[int] = 42

c = cattrs.GenConverter(omit_if_default=True)

print(c.unstructure(Test(b=99)))    # works, {"b": 99}

data = c.unstructure(Test(a=123, b=Void))  
print(data) # needs postprocessing, {'a': 123, 'b': <__main__.VoidType object at ...>}

def remove_voids(d):
    # omitting recusion here for brevity
    return {k: v for k, v in d.items() if v is not Void}

print(remove_voids(data))    # {"a": 123}

As I said, the "omit_if_default=True" will probably be enough in practise, I just have to remove the information about defaults (some fields in the API will actually be interpreted as a specific value when omitted (some bools are treated as True, others as False, some ints as 1), and I wanted to make it more obvious in the class definitions, but it's not a dealbreaker)