lovasoa / marshmallow_dataclass

Automatic generation of marshmallow schemas from dataclasses.
https://lovasoa.github.io/marshmallow_dataclass/html/marshmallow_dataclass.html
MIT License
456 stars 78 forks source link

Enum parameters are ignored when loading a schema #228

Closed therice closed 1 year ago

therice commented 1 year ago

https://github.com/lovasoa/marshmallow_dataclass/blob/cddce9a4d75003297d8697ca74cb66ab9a6c7be1/marshmallow_dataclass/__init__.py#L720

If an Enum field specifies named parameters in the declaration, these are silently dropped when loading the associated schema.

Test case which demonstrates the problem

import inspect
import unittest
from enum import Enum
from marshmallow import fields
from dataclasses import dataclass

from marshmallow_dataclass import (
    field_for_schema,
    class_schema,
)

class Version(Enum):
    v1: str = "foo/v1"
    v2: str = "foo/v2"

@dataclass
class VersionContainer:
    version : Version = field_for_schema(Version, metadata={'by_value' : True})

class TestEnumFieldForSchema(unittest.TestCase):
    def assertFieldsEqual(self, a: fields.Field, b: fields.Field):
        self.assertEqual(a.__class__, b.__class__, "field class")

        def attrs(x):
            return {
                k: f"{v!r} ({v.__mro__!r})" if inspect.isclass(v) else repr(v)
                for k, v in x.__dict__.items()
                if not k.startswith("_")
            }

        self.assertEqual(attrs(a), attrs(b))

    def test_enum(self):
        self.maxDiff = None
        mdc = field_for_schema(Version, metadata={'by_value' : True})
        mc = fields.Enum(Version, required=True, by_value=True)
        self.assertFieldsEqual(mc, mdc)

    def test_load_enum_schema(self):
        schema = class_schema(VersionContainer)()
        schema.load({
            "version": "foo/v1"
        })

    def test_dump_enum_schema(self):
        schema = class_schema(VersionContainer)()
        vc = VersionContainer()
        vc.version = Version.v1
        self.assertEqual({'version': 'foo/v1'}, schema.dump(vc))

if __name__ == "__main__":
    unittest.main()

Results of execution


python --version
Python 3.10.7

E
======================================================================
ERROR: test_load_enum_schema (__main__.TestEnumFieldForSchema)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/src_dir/test.py", line 73, in test_load_enum_schema
    schema.load({
  File "/home/user/.virtualenvs/venv/lib/python3.10/site-packages/marshmallow_dataclass/__init__.py", line 759, in load
    all_loaded = super().load(data, many=many, **kwargs)
  File "/home/user/.virtualenvs/venv/lib/python3.10/site-packages/marshmallow/schema.py", line 722, in load
    return self._do_load(
  File "/home/user/.virtualenvs/venv/lib/python3.10/site-packages/marshmallow/schema.py", line 909, in _do_load
    raise exc
marshmallow.exceptions.ValidationError: {'version': ['Must be one of: v1, v2.']}

======================================================================
FAIL: test_dump_enum_schema (__main__.TestEnumFieldForSchema)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/src_dir/test.py", line 81, in test_dump_enum_schema
    self.assertEqual("{'version': 'foo/v1'}", schema.dump(vc))
AssertionError: "{'version': 'foo/v1'}" != {'version': 'v1'}

----------------------------------------------------------------------
Ran 3 tests in 0.010s

FAILED (failures=1, errors=1)   

Similar to #199, but after the change to replace the usage of marshmallow-enum in v8.5.11

therice commented 1 year ago

Workaround appears to declare the field using dataclasses directly instead of marshmallow_dataclass

...
from dataclasses import dataclass, field
...
@dataclass
class VersionContainer:
    version : Version = field(default=None, metadata={'by_value' : True})
otonnesen commented 1 year ago

I think the issue is in this line:

    version : Version = field_for_schema(Version, metadata={'by_value' : True})

Fields are intended to be passed as annotations for the dataclass's fields, not as defaults for them. marshmallow-dataclass sees version : Version and makes a field called "version" and binds to it a marshmallow field constructed from the Version Enum (i.e. no custom metadata), then sets the default for that marshmallow field to field_for_schema(Version, metadata={'by_value' : True}) . We can see this behaviour by not passing any value for version when loading:

>>> @dataclass
... class VersionContainer:
...     version : Version = field_for_schema(Version, metadata={'by_value' : True})
...
>>> class_schema(VersionContainer)().load({})
VersionContainer(version=<fields.Enum(dump_default=<marshmallow.missing>, attribute=None, validate=None, required=True, load_only=False, dump_only=False, load_default=<marshmallow.missing>, allow_none=False, error_messages={'required': 'Missing data for required field.', 'null': 'Field may not be null.', 'validator_failed': 'Invalid value.', 'unknown': 'Must be one of: {choices}.'})>)

The workaround you posted is just the standard/intended way to pass in extra field metadata when using this library (see https://github.com/lovasoa/marshmallow_dataclass#customizing-generated-fields). Another option if you wish to avoid this field syntax could be to use marshmallow_dataclass.NewType instead, eg:

>>> VersionField = marshmallow_dataclass.NewType("Version", Version, field=marshmallow.fields.Enum, enum=Version, by_value=True)
>>> @dataclass
... class VersionContainer:
...     version: VersionField
...
>>> class_schema(VersionContainer)().load({"version": "foo/v1"})
VersionContainer(version=<Version.v1: 'foo/v1'>)

Hope that helps! Let me know if I misunderstood the problem.

dairiki commented 1 year ago

Test case which demonstrates the problem

...
@dataclass
class VersionContainer:
    version : Version = field_for_schema(Version, metadata={'by_value' : True})

This is incorrect usage of field_for_schema. (In fact direct use of field_for_schema is not documented.)

Workaround appears to declare the field using dataclasses directly instead of marshmallow_dataclass

from dataclasses import dataclass, field
...
@dataclass
class VersionContainer:
    version : Version = field(default=None, metadata={'by_value' : True})

This is not so much a workaround, as closer to correct usage.

Even more correct, assuming that you do not want to specify a default value for VersionContainer.version would be:

@dataclass
class VersionContainer:
    version : Version = field(metadata={'by_value' : True})

I don't think this is a bug. (The immediately preceding example appears to work fine for me.) Closing. Feel free to reopen if the situation warrants.