jpmorganchase / py-avro-schema

Generate Apache Avro schemas for Python types including standard library data-classes and Pydantic data models.
https://py-avro-schema.readthedocs.io/
Apache License 2.0

union schema does not recursively create defaults #70

Closed dada-engineer closed 8 months ago

dada-engineer commented 8 months ago

When a model field has a union type and a default value, the generated default might not be JSON-serialisable, for example when the default is a Pydantic model instance.

example.py

import py_avro_schema as pas
from pydantic import BaseModel, Field

from typing import Union, List
from uuid import UUID

class X(BaseModel):
    ids: List[int] = Field(default_factory=list)

class Y(BaseModel):
    ids: List[float] = Field(default_factory=list)

class Bar(BaseModel):
    baz: Union[int, List[int]] = Field(default_factory=list)
    baz2: List[Union[str, UUID]] = Field(default_factory=list)
    baz3: Union[X, Y] = Field(default_factory=X)  # default is an X instance, the type named in the error

class Foo(BaseModel):
    bar: Bar = Field(default_factory=Bar)

print(pas.generate(Foo))

Error:

Traceback (most recent call last):
  File "/Users/user/workspace/private/py-avro-schema/example.py", line 26, in <module>
    print(pas.generate(Foo))
  File "/Users/user/workspace/private/py-avro-schema/.venv/lib/python3.9/site-packages/memoization/caching/plain_cache.py", line 42, in wrapper
    result = user_function(*args, **kwargs)
  File "/Users/user/workspace/private/py-avro-schema/src/py_avro_schema/__init__.py", line 69, in generate
    schema_json = orjson.dumps(schema_dict, option=json_options)
TypeError: Type is not JSON serializable: X

This can be fixed by defining a make_default on the union schema that calls the make_default of self.schema_items[0]. That works because, when a default is present, the item schemas are sorted so that the schema matching the default sits in the first list position.
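The delegation idea can be sketched roughly as follows. This is not the library's actual code: the classes Schema, RecordSchema and UnionSchema are hypothetical stand-ins, a standard library dataclass replaces the Pydantic model, and json replaces orjson, so only the names make_default and schema_items come from the description above.

```python
import dataclasses
import json
from typing import Any, List


class Schema:
    """Hypothetical base class: turns a Python default into JSON-safe data."""

    def make_default(self, py_default: Any) -> Any:
        # Primitive values are assumed to be JSON-safe already.
        return py_default


class RecordSchema(Schema):
    """Hypothetical schema for a dataclass-based record."""

    def make_default(self, py_default: Any) -> Any:
        # Convert the record instance into a plain dict of field values.
        return dataclasses.asdict(py_default)


class UnionSchema(Schema):
    """Hypothetical union schema holding one item schema per union member."""

    def __init__(self, schema_items: List[Schema]) -> None:
        # Assumption: the schema matching the default is already at index 0.
        self.schema_items = schema_items

    def make_default(self, py_default: Any) -> Any:
        # Delegate to the first item schema instead of returning the raw object.
        return self.schema_items[0].make_default(py_default)


@dataclasses.dataclass
class X:
    ids: List[int] = dataclasses.field(default_factory=list)


# The record schema for X comes first because X() is the field default.
union = UnionSchema([RecordSchema(), Schema()])
default = union.make_default(X())
print(json.dumps({"name": "baz3", "default": default}))
```

Without the delegation, the raw X instance would reach the JSON encoder and raise a TypeError similar to the one in the traceback. With it, the default is converted by the record schema into a plain dict before serialisation.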

faph commented 8 months ago

Thanks @dada-engineer!

Released as 3.6.0