marcosschroh / dataclasses-avroschema

Generate avro schemas from python dataclasses, Pydantic models and Faust Records. Code generation from avro schemas. Serialize/Deserialize python instances with avro schemas.
https://marcosschroh.github.io/dataclasses-avroschema/
MIT License

Name Collision for certain relationship even when using namespace #196

Closed tebartsch closed 2 years ago

tebartsch commented 2 years ago

Describe the bug: Avro serialization with fastavro for the three classes A, B, and C (defined under "To Reproduce") fails with

fastavro._schema_common.SchemaParseException: redefined named type: namespace.A

even when using a namespace for A.

To Reproduce (EDITED: changed from using dataclasses_avroschema.avrodantic.AvroBaseModel to dataclasses_avroschema.AvroModel.)

from dataclasses import dataclass
from dataclasses_avroschema import AvroModel

@dataclass
class A(AvroModel):
    class Meta:
        namespace = "namespace"

@dataclass
class B(AvroModel):
    a: A

@dataclass
class C(AvroModel):
    # class Meta:
    #     alias_nested_items = {"a": "A2"}  # Workaround

    b: B
    a: A

if __name__ == "__main__":
    b = B(a=A())
    c = C(b=B(a=A()), a=A())

    b.serialize()
    c.serialize()

Expected behavior: The code above should run, but it fails with the following error (I used dataclasses-avroschema==0.30.3).

Traceback (most recent call last):
  File "/home/tbartsch/test/avro_issue.py", line 28, in <module>
    c.serialize()
  File "/home/tbartsch/test/venv/lib/python3.10/site-packages/dataclasses_avroschema/schema_generator.py", line 144, in serialize
    return serialize(self.asdict(), schema, serialization_type=serialization_type)
  File "/home/tbartsch/test/venv/lib/python3.10/site-packages/dataclasses_avroschema/serialization.py", line 20, in serialize
    fastavro.schemaless_writer(file_like_output, schema, payload)
  File "fastavro/_write.pyx", line 785, in fastavro._write.schemaless_writer
  File "fastavro/_schema.pyx", line 118, in fastavro._schema.parse_schema
  File "fastavro/_schema.pyx", line 257, in fastavro._schema._parse_schema
  File "fastavro/_schema.pyx", line 302, in fastavro._schema.parse_field
  File "fastavro/_schema.pyx", line 202, in fastavro._schema._parse_schema
  File "fastavro/_schema.pyx", line 249, in fastavro._schema._parse_schema
fastavro._schema_common.SchemaParseException: redefined named type: namespace.A

Process finished with exit code 1
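
The error message can be illustrated without the library. The sketch below is a simplified stand-in for the duplicate-name check an Avro schema parser performs; it is not fastavro's implementation. The schema dicts are hand-written assumptions about what the generated schema for C might look like, with the full definition of A inlined twice, and `find_redefined`, `C_BROKEN`, and `C_VALID` are hypothetical names. Arrays and maps are ignored for brevity.

```python
def find_redefined(schema, seen=None, enclosing_ns=None):
    """Return the full names of record types that are defined more than once."""
    if seen is None:
        seen = set()
    redefined = []
    if isinstance(schema, list):  # a union: check every branch
        for branch in schema:
            redefined += find_redefined(branch, seen, enclosing_ns)
    elif isinstance(schema, dict) and schema.get("type") == "record":
        # A record without its own namespace inherits the enclosing one.
        ns = schema.get("namespace", enclosing_ns)
        full = f"{ns}.{schema['name']}" if ns else schema["name"]
        if full in seen:
            redefined.append(full)
        seen.add(full)
        for field in schema.get("fields", []):
            redefined += find_redefined(field["type"], seen, ns)
    # Strings (primitives or references by name) define nothing new.
    return redefined


# Assumed shape of the schemas: A is written out in full wherever it is used.
A_INLINE = {"type": "record", "name": "A", "namespace": "namespace", "fields": []}
B_INLINE = {"type": "record", "name": "B", "fields": [{"name": "a", "type": A_INLINE}]}
C_BROKEN = {
    "type": "record",
    "name": "C",
    "fields": [{"name": "b", "type": B_INLINE}, {"name": "a", "type": A_INLINE}],
}

# Referring to the second occurrence of A by its full name avoids the redefinition.
C_VALID = {
    "type": "record",
    "name": "C",
    "fields": [{"name": "b", "type": B_INLINE}, {"name": "a", "type": "namespace.A"}],
}

print(find_redefined(C_BROKEN))
print(find_redefined(C_VALID))
```

Under these assumptions, the namespace on A does not help: both inlined definitions resolve to the same full name, which is exactly the name reported in the exception.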

tebartsch commented 2 years ago

I get a similar but slightly different problem when using the following classes:

import typing
from dataclasses import dataclass

from dataclasses_avroschema import AvroModel

@dataclass
class S1(AvroModel):
    pass

@dataclass
class S2(AvroModel):
    pass

@dataclass
class A(AvroModel):
    s: typing.Union[S1, S2]

@dataclass
class B(AvroModel):
    # class Meta:
    #     namespace = "namespace_B"  # Workaround

    a: A

@dataclass
class C(AvroModel):
    # class Meta:
    #     namespace = "namespace_C"  # Workaround

    b: B
    a: A

if __name__ == "__main__":
    b = B(a=A(s=S1()))
    c = C(b=B(a=A(s=S1())), a=A(s=S1()))

    b.serialize()
    c.serialize()

In this case, the alias_nested_items workaround from above does not work. I was only able to fix this by giving B and C different namespaces, as in the commented-out Meta classes in this code.
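
For reference, the Avro specification allows a named type to be referenced by name once it has been defined, so a schema for C would be valid if every repeated definition were replaced by such a reference. The sketch below shows that shape on hand-written schema dicts for the classes in this comment. It is not what dataclasses-avroschema does internally; `dedupe` and the schema constants are hypothetical, and it assumes repeated types are inlined in full. Arrays and maps are not handled.

```python
def dedupe(schema, seen=None, enclosing_ns=None):
    """Return a copy of schema where repeated record definitions become name references."""
    if seen is None:
        seen = set()
    if isinstance(schema, list):  # a union: process every branch
        return [dedupe(branch, seen, enclosing_ns) for branch in schema]
    if isinstance(schema, dict) and schema.get("type") == "record":
        ns = schema.get("namespace", enclosing_ns)
        full = f"{ns}.{schema['name']}" if ns else schema["name"]
        if full in seen:
            return full  # already defined earlier: refer to it by name
        seen.add(full)
        out = dict(schema)
        out["fields"] = [
            {**field, "type": dedupe(field["type"], seen, ns)}
            for field in schema.get("fields", [])
        ]
        return out
    return schema  # primitive type or an existing reference by name


# Assumed inlined schemas for S1, S2, A, B, and C from the example above.
S1 = {"type": "record", "name": "S1", "fields": []}
S2 = {"type": "record", "name": "S2", "fields": []}
A = {"type": "record", "name": "A", "fields": [{"name": "s", "type": [S1, S2]}]}
B = {"type": "record", "name": "B", "fields": [{"name": "a", "type": A}]}
C = {"type": "record", "name": "C", "fields": [{"name": "b", "type": B}, {"name": "a", "type": A}]}

c_deduped = dedupe(C)
print(c_deduped["fields"][1])
```

Here the first occurrence of A (inside B) keeps its full definition, including the union of S1 and S2, while the second occurrence in C collapses to the string "A".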

marcosschroh commented 2 years ago

Hi,

Thanks for reporting the bug. I will try to fix it as soon as possible.