Fatal1ty / mashumaro

Fast and well tested serialization library
Apache License 2.0

InitVar with no default value #179

Closed: subbyte closed this issue 9 months ago

subbyte commented 10 months ago

Description

I edited the test for InitVar and removed the InitVar's default value. The test then failed.

What I Did

Here's my entire Python script:

#!/usr/bin/env python
from mashumaro import DataClassDictMixin
from dataclasses import dataclass, InitVar

@dataclass
class DataClass(DataClassDictMixin):
    x: InitVar[int]
    y: int = None

    def __post_init__(self, x: int):
        if self.y is None and x is not None:
            self.y = x

assert DataClass(1).to_dict() == {"y": 1}
assert DataClass.from_dict({"x": 1}) == DataClass(1)

Error:

Traceback (most recent call last):
  File ".../dataclass/b.py", line 14, in <module>
    assert DataClass.from_dict({"x": 1}) == DataClass(1)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<string>", line 17, in __mashumaro_from_dict__
TypeError: DataClass.__init__() missing 1 required positional argument: 'x'

I expected no error.

subbyte commented 10 months ago

Another related thing that confuses me:

from mashumaro import DataClassDictMixin
from dataclasses import dataclass, InitVar, field

@dataclass
class DataClass(DataClassDictMixin):
    x: InitVar[int] = 0
    y: int = field(init=False)

    def __post_init__(self, x: int):
        self.y = x + 5

assert DataClass(1).to_dict() == {"y": 6}
print(DataClass.from_dict({"x": 10}))
print(DataClass.from_dict({"y": 10}))

It outputs:

DataClass(y=5)
DataClass(y=5)

Does this mean that from_dict() does not actually use the argument?

Fatal1ty commented 9 months ago

Hi @subbyte

InitVar fields have been skipped intentionally since version 1.7. As far as I remember, this was done simply to keep the symmetry: by default, what you pass to from_dict() should be equal to what you get from to_dict(). However, I agree that it's confusing in your case. Could you give more information about the problem you're solving with InitVar?
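
Both results reported above follow from that rule. The sketch below reproduces them with the standard library only. It assumes a hypothetical helper, naive_from_dict, that ignores every key that is not a real dataclass field. This is not mashumaro's generated code, only an illustration of what skipping InitVar implies.

```python
from dataclasses import InitVar, dataclass, field, fields


def naive_from_dict(cls, data):
    """Hypothetical helper: keep only keys that are real, init-able fields.

    dataclasses.fields() does not list InitVar pseudo-fields, so a key
    such as "x" is silently dropped before the constructor is called.
    """
    init_names = {f.name for f in fields(cls) if f.init}
    return cls(**{k: v for k, v in data.items() if k in init_names})


@dataclass
class Required:
    x: InitVar[int]  # no default, as in the first report
    y: int = None

    def __post_init__(self, x):
        if self.y is None and x is not None:
            self.y = x


@dataclass
class Defaulted:
    x: InitVar[int] = 0  # default, as in the second report
    y: int = field(init=False)

    def __post_init__(self, x):
        self.y = x + 5


# Case 1: "x" is dropped, so the constructor is called without it.
try:
    naive_from_dict(Required, {"x": 1})
    error = None
except TypeError as exc:
    error = str(exc)
print(error)

# Case 2: "x" is dropped and "y" is not an init argument,
# so the default x=0 is used and y is 5 in both cases.
print(naive_from_dict(Defaulted, {"x": 10}).y)
print(naive_from_dict(Defaulted, {"y": 10}).y)
```

With a required InitVar the constructor call fails. With a defaulted one, the input value is ignored without any warning.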

subbyte commented 9 months ago

Thanks for your quick reply @Fatal1ty !

I use InitVar basically as a way to customize the object's constructor. My original plan was to create a Source class that has two fields: scheme and path. But to initialize a Source object, people usually give the entire URI, like scheme://path, so the class needs to split the URI into two fields to create the Source object. Of course, I'd also like to use mashumaro for serialization and deserialization.

Here's what I have in mind:

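from dataclasses import InitVar, dataclass, field

from mashumaro.mixins.json import DataClassJSONMixin

@dataclass
class Source(DataClassJSONMixin):
    uri: InitVar[str]
    interface: str = field(init=False)
    datasource: str = field(init=False)

    def __post_init__(self, uri):
        # Split "scheme://path" into its two parts.
        xs = uri.split("://")
        if len(xs) != 2:
            raise Exception(uri)
        else:
            self.interface = xs[0]
            self.datasource = xs[1]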

Can such an object be serialized and deserialized with mashumaro?

Fatal1ty commented 9 months ago

You can override deserialization in this way:

from dataclasses import InitVar, dataclass, field
from typing import Any

from mashumaro.mixins.json import DataClassJSONMixin
from mashumaro.types import SerializableType

@dataclass
class Source(DataClassJSONMixin, SerializableType):  # <- add SerializableType
    uri: InitVar[str]
    interface: str = field(init=False)
    datasource: str = field(init=False)

    def __post_init__(self, uri):
        xs = uri.split("://")
        if len(xs) != 2:
            raise Exception(uri)
        else:
            self.interface = xs[0]
            self.datasource = xs[1]

    def _serialize(self) -> Any:
        return self.to_dict()

    @classmethod
    def _deserialize(cls, value: dict[str, str]) -> Any:
        return cls(f"{value['interface']}://{value['datasource']}")  # <- join the parts here

After these modifications, Source will be (de)serializable when it's used as a dependency for other types. The drawback is that Source.from_dict itself will still result in an error. If you need to use Source as the root model and you need to be able to deserialize it, I would recommend waiting for the upcoming release, which will add the codecs feature:

from mashumaro.codecs.basic import BasicDecoder
from mashumaro.codecs.json import JSONDecoder

basic_decoder = BasicDecoder(Source)
json_decoder = JSONDecoder(Source)

source = Source("http://localhost")
assert basic_decoder.decode({'interface': 'http', 'datasource': 'localhost'}) == source
assert json_decoder.decode('{"interface": "http", "datasource": "localhost"}') == source
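
To illustrate what "used as a dependency for other types" means, here is a minimal standard-library sketch. The decode function and the Job class are hypothetical names invented for this example, and the code is not mashumaro's implementation. It only shows the general idea: a decoder that walks the fields of an outer class can hand a nested value to that type's own _deserialize hook.

```python
from dataclasses import InitVar, dataclass, field, fields, is_dataclass


def decode(cls, value):
    """Hypothetical mini-decoder: prefer a class's own _deserialize hook."""
    hook = getattr(cls, "_deserialize", None)
    if hook is not None:
        return hook(value)
    if is_dataclass(cls):
        # Decode each init-able field recursively from the input mapping.
        kwargs = {f.name: decode(f.type, value[f.name]) for f in fields(cls) if f.init}
        return cls(**kwargs)
    return value  # plain values (str, int, ...) pass through unchanged


@dataclass
class Source:
    uri: InitVar[str]
    interface: str = field(init=False)
    datasource: str = field(init=False)

    def __post_init__(self, uri):
        parts = uri.split("://")
        if len(parts) != 2:
            raise ValueError(uri)
        self.interface, self.datasource = parts

    @classmethod
    def _deserialize(cls, value):
        # Rebuild the URI that the constructor expects.
        return cls(f"{value['interface']}://{value['datasource']}")


@dataclass
class Job:  # hypothetical outer type that depends on Source
    name: str
    source: Source


job = decode(
    Job,
    {"name": "nightly", "source": {"interface": "http", "datasource": "localhost"}},
)
print(job.source.interface, job.source.datasource)
```

Here the outer Job is built field by field, and the nested Source is rebuilt through its hook rather than through its constructor signature.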

Fatal1ty commented 9 months ago

"people usually give the entire URI like scheme://path"

Why not let users do both? It will be very easy:

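from dataclasses import InitVar, dataclass

from mashumaro.mixins.json import DataClassJSONMixin

@dataclass
class Source(DataClassJSONMixin):
    uri: InitVar[str | None] = None
    interface: str = ""
    datasource: str = ""

    def __post_init__(self, uri: str | None):
        if uri is None:
            # No URI given: the parts must have been passed directly.
            if not self.interface and not self.datasource:
                raise ValueError("pass either uri or interface and datasource")
            return
        xs = uri.split("://")
        if len(xs) != 2:
            raise Exception(uri)
        else:
            self.interface = xs[0]
            self.datasource = xs[1]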

obj = Source("http://localhost")
assert Source.from_json(obj.to_json())

subbyte commented 9 months ago

Good idea! Thanks for the suggestion, it works!
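
The round trip works in the final suggestion because the serialized form contains only interface and datasource, and the constructor accepts exactly those two as keyword arguments. The sketch below shows this with the standard library only. It uses dataclasses.asdict and a plain keyword-argument call as stand-ins for to_dict() and from_dict(), which is an assumption made to keep the example free of dependencies.

```python
from dataclasses import InitVar, asdict, dataclass
from typing import Optional


@dataclass
class Source:
    uri: InitVar[Optional[str]] = None
    interface: str = ""
    datasource: str = ""

    def __post_init__(self, uri: Optional[str]) -> None:
        if uri is None:
            # No URI given: the parts must have been passed directly.
            if not self.interface and not self.datasource:
                raise ValueError("pass either uri or interface and datasource")
            return
        parts = uri.split("://")
        if len(parts) != 2:
            raise ValueError(uri)
        self.interface, self.datasource = parts


obj = Source("http://localhost")

# asdict() lists only real fields, so the InitVar "uri" is absent.
payload = asdict(obj)
print(payload)

# The field-only payload is accepted by the same constructor.
restored = Source(**payload)
print(restored == obj)
```

Equality holds because the generated comparison covers only the real fields, so uri plays no part in it.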