lidatong / dataclasses-json

Easily serialize Data Classes to and from JSON
MIT License
1.36k stars 153 forks source link

[BUG] schema() fails when there is tuple field. #432

Closed healthmatrice closed 1 year ago

healthmatrice commented 1 year ago

Description

If the dataclass contain list/tuple as field. The schema generation fails.

The output is:

{"x": [1]}
{"y": [1, 2]}

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[5], line 23
     21     y: Tuple[int, int]
     22 print(Person2((1,2)).to_json())
---> 23 Person2.schema()

File /venv/lib/python3.10/site-packages/dataclasses_json/api.py:87, in DataClassJsonMixin.schema(cls, infer_missing, only, exclude, many, context, load_only, dump_only, partial, unknown)
     75 @classmethod
     76 def schema(cls: Type[A],
     77            *,
   (...)
     85            partial: bool = False,
     86            unknown=None) -> SchemaType:
---> 87     Schema = build_schema(cls, DataClassJsonMixin, infer_missing, partial)
     89     if unknown is None:
     90         undefined_parameter_action = _undefined_parameter_action_safe(cls)

File /venv/lib/python3.10/site-packages/dataclasses_json/mm.py:361, in build_schema(cls, mixin, infer_missing, partial)
    357         dumped.update(_handle_undefined_parameters_safe(cls=obj, kvs={},
    358                                                         usage="dump"))
    359     return dumped
--> 361 schema_ = schema(cls, mixin, infer_missing)
    362 DataClassSchema: typing.Type[SchemaType] = type(
    363     f'{cls.__name__.capitalize()}Schema',
    364     (Schema,),
   (...)
    368      'dump': dump,
    369      **schema_})
    371 return DataClassSchema

File /venv/lib/python3.10/site-packages/dataclasses_json/mm.py:311, in schema(cls, mixin, infer_missing)
    308 if metadata.letter_case is not None:
    309     options['data_key'] = metadata.letter_case(field.name)
--> 311 t = build_type(type_, options, mixin, field, cls)
    312 # if type(t) is not fields.Field:  # If we use `isinstance` we would return nothing.
    313 if field.type != typing.Optional[CatchAllVar]:

File /venv/lib/python3.10/site-packages/dataclasses_json/mm.py:276, in build_type(type_, options, mixin, field, cls)
    271     warnings.warn(
    272         f"Unknown type {type_} at {cls.__name__}.{field.name}: {field.type} "
    273         f"It's advised to pass the correct marshmallow type to `mm_field`.")
    274     return fields.Field(**options)
--> 276 return inner(type_, options)

File /venv/lib/python3.10/site-packages/dataclasses_json/mm.py:260, in build_type..inner(type_, options)
    257     options["allow_none"] = True
    259 if origin in TYPES:
--> 260     return TYPES[origin](*args, **options)
    262 if _issubclass_safe(origin, Enum):
    263     return EnumField(enum=origin, by_value=True, *args, **options)

File /venv/lib/python3.10/site-packages/marshmallow/fields.py:822, in Tuple.__init__(self, tuple_fields, *args, **kwargs)
    821 def __init__(self, tuple_fields, *args, **kwargs):
--> 822     super().__init__(*args, **kwargs)
    823     if not utils.is_collection(tuple_fields):
    824         raise ValueError(
    825             "tuple_fields must be an iterable of Field classes or " "instances."
    826         )

TypeError: Field.__init__() takes 1 positional argument but 2 were given

It is the same as the https://github.com/lidatong/dataclasses-json/issues/398 But I think mine has far more detail.

Code snippet that reproduces the issue

from dataclasses import dataclass
from dataclasses_json import dataclass_json
from typing import Tuple
import traceback

@dataclass_json
@dataclass
class Person:
    x: List[int]
print(Person([1,]).to_json())
Person.schema()

@dataclass_json
@dataclass
class Person2:
    y: Tuple[int, int]
print(Person2((1,2)).to_json())
Person2.schema()

Describe the results you expected

it should create schema without error

Python version you are using

Python 3.10.6

Environment description

dataclasses-json==0.5.9 marshmallow==3.19.0 marshmallow-enum==1.5.1 mypy-extensions==1.0.0 packaging==23.1 typing-inspect==0.9.0 typing_extensions==4.7.1

healthmatrice commented 1 year ago

if we use Tuple[int] it give a different error:

@dataclass_json
@dataclass
class Person2:
    y: Tuple[int]
print(Person2((1,)).to_json())

Person2.schema()
    822 super().__init__(*args, **kwargs)
    823 if not utils.is_collection(tuple_fields):
--> 824     raise ValueError(
    825         "tuple_fields must be an iterable of Field classes or " "instances."
    826     )
    828 try:
    829     self.tuple_fields = [
    830         resolve_field_instance(cls_or_instance)
    831         for cls_or_instance in tuple_fields
    832     ]

ValueError: tuple_fields must be an iterable of Field classes or instances.

If we use Tuple[int, int, int , int], the error is:

    821 def __init__(self, tuple_fields, *args, **kwargs):
--> 822     super().__init__(*args, **kwargs)
    823     if not utils.is_collection(tuple_fields):
    824         raise ValueError(
    825             "tuple_fields must be an iterable of Field classes or " "instances."
    826         )

TypeError: Field.__init__() takes 1 positional argument but 4 were given

It looks like the problem is in following lines:

        args = [inner(a, {}) for a in getattr(type_, '__args__', []) if
                a is not type(None)]

        if _is_optional(type_):
            options["allow_none"] = True

        if origin in TYPES:
            return TYPES[origin](*args, **options)

When the type_ is Tuple[int, int, int]. We should use fields.Tuple((fields.Int, fields.Int ,fields.Int))

However, the args are [fields.Int, fields.Int ,fields.Int], thus the *args give out fields.Tuple(fields.Int, fields.Int ,fields.Int) which is wrong.

There actually some deeper incompatibility here of this code. Because we can have Tuple[int, ...] for varied length. However, the inner() will give us args = [fields.Int, Ellipsis] I doubt this can be handled by fields.Tuple at all.

Maybe mashmallow does not support varied length tuple at all.

deansg commented 1 year ago

The marshmallow-dataclass library had a similar issue and it was solved there. See: https://github.com/lovasoa/marshmallow_dataclass/issues/63

healthmatrice commented 1 year ago

@deansg thanks for the heads up. So I was correct, mashmallow does have concept of varied length tuple. They simply treat it as marshmallow.fields.List and then do some tricks during deser. This is why they have to subclass it. Seems too much code.

https://github.com/dairiki/marshmallow_dataclass/blob/56c4ebc8fa20d395ef36a76b4b8b280c6c8df288/marshmallow_dataclass/__init__.py#L515

https://github.com/dairiki/marshmallow_dataclass/blob/56c4ebc8fa20d395ef36a76b4b8b280c6c8df288/marshmallow_dataclass/__init__.py#L522

https://github.com/dairiki/marshmallow_dataclass/blob/56c4ebc8fa20d395ef36a76b4b8b280c6c8df288/marshmallow_dataclass/collection_field.py#L6

healthmatrice commented 1 year ago

actually included the Tuple[T, ...] fix