marcosschroh / dataclasses-avroschema

Generate avro schemas from python classes. Code generation from avro schemas. Serialize/Deserialize python instances with avro schemas
https://marcosschroh.github.io/dataclasses-avroschema/
MIT License
213 stars 64 forks source link

Regression: deep avro models no longer converted to dict #324

Closed joaoe closed 1 year ago

joaoe commented 1 year ago

Describe the bug Bug introduced here https://github.com/marcosschroh/dataclasses-avroschema/commit/9d107086944f41a16a0d3238fcb9a65760b0ed50

Deeply nested Avro Models no longer recursively convert to PODs when calling AvroModel.asdict()

To Reproduce

This shall pass.

class MyEnum(enum.IntEnum):
    x = 1

class ModelA(AvroBaseModel):
    a: int = 1
    d: MyEnum = MyEnum.x

class ModelB(AvroBaseModel):
    b: ModelA = ModelA()
    d: MyEnum = MyEnum.x

target = repr({'b': {'a': 1, 'd': 1}, 'd': 1})
res_asdict =  repr(ModelB().asdict())
res_dict =  repr(ModelB().dict())
assert res_asdict  == target, res_asdict 
assert res_dict  == target, res_dict # <- this failed before, by design ? 
  1. The asdict() call fails to traverse to nested models, but at least unpacks the enum at the top level.
  2. The dict() call works from before and follows nested models, but it fails to convert the enum already before this regression was introduced.
  3. Before the regression, asdict() internally called dict()+AvroModel.standardize_custom_type() which both converted nested objects and unpacked enums. I need that for my project :) thanks!

Expected behavior Here https://github.com/marcosschroh/dataclasses-avroschema/blob/5e26fb3315351fafd6b368b759739f6ef8e59119/dataclasses_avroschema/schema_generator.py#L162

That function misses a check for

        if isinstance(value, AvroModel):
            return AvroModel.standardize_custom_type(value.asdict())

PS: could perhaps dict() unpack enums as well, please ? But as far as I see, that call is 100% in pydantic, so either you'd need to wrap it or find a solution inside pydantic. See use_enum_values in https://docs.pydantic.dev/latest/usage/model_config/

marcosschroh commented 1 year ago

Hi @joaoe

Thansk for reporting this issue. I will fix it asap. Regarding unpacking enums I think it is not a good idea as I would like to have the same default behavior as pydantic

joaoe commented 1 year ago

@marcosschroh Hi. Thanks for the fix, but unfortunately it is incomplete. Your patch fixes so that nested objects are traversed if they are AvroModels. Well, that still skips for other container types, like list. e.g.

import enum

from dataclasses_avroschema.avrodantic import AvroBaseModel

class MyEnum(enum.IntEnum):
    x = 1

class ModelA(AvroBaseModel):
    a: int = 1
    d: MyEnum = MyEnum.x

class ModelB(AvroBaseModel):
    b: ModelA = ModelA()
    d: MyEnum = MyEnum.x
    e: list[ModelA] = [ModelA()]
    f: dict[str, ModelA] = {"g": ModelA()}

target = repr({'b': {'a': 1, 'd': 1}, 'd': 1, 'e': [{'a': 1, 'd': 1}], 'f': {'g': {'a': 1, 'd': 1}}})
res_asdict =  repr(ModelB().asdict())
assert res_asdict  == target, res_asdict

I tried the fix I suggested above in my local setup because it was still in schema_generator with AvroModel in scope and it worked for my project where I had lists of objects. But then I did a reduced testcase and I did not think of puting ModelA() inside nested lists.

michal-rogala commented 1 year ago

@joaoe - good news - we have a fix for your problem: https://github.com/marcosschroh/dataclasses-avroschema/pull/346