marcosschroh / dataclasses-avroschema

Generate avro schemas from python classes. Code generation from avro schemas. Serialize/Deserialize python instances with avro schemas
https://marcosschroh.github.io/dataclasses-avroschema/
MIT License
213 stars 64 forks

Handling logical types #707

Closed pj-sp closed 2 weeks ago

pj-sp commented 1 month ago

Is your feature request related to a problem? Please describe. I'd like to use ModelGenerator to generate a model from an Avro schema, but I ran into some issues. Say I have the following schema:

{
  "type": "record",
  "name": "TestEvent",
  "namespace": "com.example",
  "doc": "Bla bla",
  "fields": [
    {
      "name": "occurredAt",
      "type": {
        "type": "long",
        "logicalType": "timestamp-millis"
      },
      "doc": "Event time"
    },
    {
      "name": "previous",
      "type": {
        "type": "record",
        "name": "urls",
        "doc": "urls",
        "fields": [
          {
            "name": "regular",
            "type": {
              "type": "string",
              "logicalType": "url"
            },
            "doc": "Urls"
          }
        ]
      }
    }
  ]
}

Model generation fails with an exception:

  File "/Users/sp/Projects/protocols/src/main/python/build_classes.py", line 19, in <module>
    result = model_generator.render(schema=schema, model_type='AvroModel')
  File "/Users/sp/.pyenv/versions/3.10.8/lib/python3.10/site-packages/dataclasses_avroschema/model_generator/generator.py", line 120, in render
    return self.render_module(
  File "/Users/sp/.pyenv/versions/3.10.8/lib/python3.10/site-packages/dataclasses_avroschema/model_generator/generator.py", line 150, in render_module
    return model_generator.render(schemas=schemas)
  File "/Users/sp/.pyenv/versions/3.10.8/lib/python3.10/site-packages/dataclasses_avroschema/model_generator/lang/python/base.py", line 76, in render
    result.append(self.render_class(schema=schema))
  File "/Users/sp/.pyenv/versions/3.10.8/lib/python3.10/site-packages/dataclasses_avroschema/model_generator/lang/python/base.py", line 164, in render_class
    fields_representation: typing.List[FieldRepresentation] = [
  File "/Users/sp/.pyenv/versions/3.10.8/lib/python3.10/site-packages/dataclasses_avroschema/model_generator/lang/python/base.py", line 165, in <listcomp>
    self.render_field(field=field, model_name=name) for field in record_fields
  File "/Users/sp/.pyenv/versions/3.10.8/lib/python3.10/site-packages/dataclasses_avroschema/model_generator/lang/python/base.py", line 222, in render_field
    field_representation = self.render_field(field=type, model_name=model_name)
  File "/Users/sp/.pyenv/versions/3.10.8/lib/python3.10/site-packages/dataclasses_avroschema/model_generator/lang/python/base.py", line 244, in render_field
    record = f"\n{self.render_class(schema=field)}"
  File "/Users/sp/.pyenv/versions/3.10.8/lib/python3.10/site-packages/dataclasses_avroschema/model_generator/lang/python/base.py", line 164, in render_class
    fields_representation: typing.List[FieldRepresentation] = [
  File "/Users/sp/.pyenv/versions/3.10.8/lib/python3.10/site-packages/dataclasses_avroschema/model_generator/lang/python/base.py", line 165, in <listcomp>
    self.render_field(field=field, model_name=name) for field in record_fields
  File "/Users/sp/.pyenv/versions/3.10.8/lib/python3.10/site-packages/dataclasses_avroschema/model_generator/lang/python/base.py", line 220, in render_field
    language_type = self.parse_logical_type(field=field)
  File "/Users/sp/.pyenv/versions/3.10.8/lib/python3.10/site-packages/dataclasses_avroschema/model_generator/lang/python/base.py", line 320, in parse_logical_type
    self.imports.add(self.logical_types_imports[logical_type])
KeyError: 'url'

I ran another test (same schema, without the url field), and the generated model is:

from dataclasses_avroschema import AvroModel
import dataclasses
import datetime

@dataclasses.dataclass
class TestEvent(AvroModel):
    """
    Bla bla
    """
    occurredAt: datetime.datetime = dataclasses.field(metadata={'doc': 'Event time'})

    class Meta:
        namespace = "com.example"

So occurredAt is typed as datetime.datetime, whereas I expected an integer, since the field is declared as a long with logicalType timestamp-millis.
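For context, a timestamp-millis value is a long counting milliseconds since the Unix epoch, so the datetime in the generated model and the underlying integer are two views of the same value. Below is a minimal sketch of the round trip using only the standard library. The helper names `millis_to_datetime` and `datetime_to_millis` are hypothetical and are not part of dataclasses-avroschema.

```python
import datetime

# Reference point for Avro timestamp-millis: 1970-01-01T00:00:00 UTC.
EPOCH = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)


def millis_to_datetime(millis: int) -> datetime.datetime:
    """Interpret a timestamp-millis long as a timezone-aware UTC datetime."""
    return EPOCH + datetime.timedelta(milliseconds=millis)


def datetime_to_millis(value: datetime.datetime) -> int:
    """Convert a timezone-aware datetime back to milliseconds since the epoch."""
    # Integer division of two timedeltas avoids floating-point rounding.
    return (value - EPOCH) // datetime.timedelta(milliseconds=1)


# Round trip: the long survives conversion to datetime and back.
occurred_at = millis_to_datetime(1_700_000_000_123)
assert datetime_to_millis(occurred_at) == 1_700_000_000_123
```

Note that `datetime_to_millis` assumes a timezone-aware value; passing a naive datetime raises a TypeError when it is subtracted from the aware epoch.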

Describe the solution you'd like Are there any plans to allow:

marcosschroh commented 1 month ago

Hi @pj-sp ,

I will try to add the fallback.

pj-sp commented 1 month ago

@marcosschroh great, thx for the info

kamilglod commented 1 month ago

From Avro docs:

Language implementations must ignore unknown logical types when reading, and should use the underlying Avro type. If a logical type is invalid, for example a decimal with scale greater than its precision, then implementations should ignore the logical type and use the underlying Avro type.
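Until such a fallback exists in the generator, one possible workaround is to preprocess the schema so that unrecognised logical types are dropped and only the underlying Avro type remains. The sketch below is an illustration of the rule quoted above, not the library's implementation. The function name `strip_unknown_logical_types` is hypothetical, and the `KNOWN_LOGICAL_TYPES` set is an assumption based on the logical types listed in the Avro specification; the set actually supported by a given version of dataclasses-avroschema may differ.

```python
from typing import Any

# ASSUMPTION: logical types taken from the Avro specification; adjust this set
# to whatever the schema consumer actually supports.
KNOWN_LOGICAL_TYPES = {
    "decimal",
    "uuid",
    "date",
    "time-millis",
    "time-micros",
    "timestamp-millis",
    "timestamp-micros",
    "local-timestamp-millis",
    "local-timestamp-micros",
    "duration",
}


def strip_unknown_logical_types(node: Any) -> Any:
    """Return a copy of the schema with unknown logicalType entries removed."""
    if isinstance(node, list):
        # Unions and field lists: clean every element.
        return [strip_unknown_logical_types(item) for item in node]
    if not isinstance(node, dict):
        # Primitive type names and other scalars are returned unchanged.
        return node
    cleaned = {key: strip_unknown_logical_types(value) for key, value in node.items()}
    logical_type = cleaned.get("logicalType")
    if logical_type is not None and logical_type not in KNOWN_LOGICAL_TYPES:
        # Fall back to the underlying Avro type by dropping the annotation.
        del cleaned["logicalType"]
    return cleaned


schema = {
    "type": "record",
    "name": "urls",
    "fields": [
        {"name": "regular", "type": {"type": "string", "logicalType": "url"}},
        {"name": "occurredAt", "type": {"type": "long", "logicalType": "timestamp-millis"}},
    ],
}

cleaned = strip_unknown_logical_types(schema)
# The unknown "url" annotation is gone; the known timestamp-millis one is kept.
assert cleaned["fields"][0]["type"] == {"type": "string"}
assert cleaned["fields"][1]["type"] == {"type": "long", "logicalType": "timestamp-millis"}
```

The function builds new containers instead of mutating its input, so the original schema stays intact. A field default that happens to contain a `logicalType` key would also be rewritten, which is an edge case this sketch does not handle.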