wyfo / apischema

JSON (de)serialization, GraphQL and JSON schema generation using Python typing.
https://wyfo.github.io/apischema/
MIT License

Serialization/deserialization pre/post processing #183

Open CaptainDriftwood opened 3 years ago

CaptainDriftwood commented 3 years ago

I really love this library for its simplicity and for sticking as close to the stdlib dataclasses as possible. Is there a plan to add functionality similar to marshmallow's pre- and post-processing for serialization/deserialization of JSON objects, referenced here?

wyfo commented 3 years ago

I really love this library for its simplicity and sticking as close to the stdlib dataclasses as possible.

Thank you a lot for this feedback! Sticking to the stdlib was indeed my first motivation to write this library, so I'm glad you like it.

Is there a plan to add functionality similar to marshmallow's pre- and post-processing for serialization/deserialization of JSON objects, referenced here?

Actually, I don't really see a use case for a similar pre/post-processing feature in apischema. When I look at the marshmallow documentation, the examples are "enveloping" and validation. Both of them are already well covered by apischema's current features.
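For readers unfamiliar with the pattern, "enveloping" means the payload sits under a wrapper key on the wire: the key is stripped before the object is built and added back after it is serialized. Here is a minimal stdlib-only sketch of that round trip, written by hand. The helpers `load_user` and `dump_user` and the constant `ENVELOPE_KEY` are hypothetical names chosen for illustration; they are not part of marshmallow or apischema.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict


@dataclass
class User:
    name: str


ENVELOPE_KEY = "user"  # hypothetical wrapper key, chosen for illustration


def load_user(data: Dict[str, Any]) -> User:
    # Pre-processing step: strip the envelope before building the object.
    inner = data[ENVELOPE_KEY]
    return User(**inner)


def dump_user(user: User) -> Dict[str, Any]:
    # Post-processing step: wrap the serialized object in the envelope.
    return {ENVELOPE_KEY: asdict(user)}


assert load_user({"user": {"name": "steve"}}) == User("steve")
assert dump_user(User("steve")) == {"user": {"name": "steve"}}
```

Note that nothing in these hand-written helpers describes the shape of the wrapped payload in a form a tool could inspect; they only manipulate dictionaries at runtime.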

While validation is well detailed in the apischema documentation, the enveloping pattern can be "simply" implemented using the conversions feature:

from dataclasses import dataclass
from apischema import deserialize, serialize

@dataclass
class User:
    name: str

@dataclass
class UserEnvelop:
    user: User

def unwrap_user(user: UserEnvelop) -> User:
    return user.user

def wrap_user(user: User) -> UserEnvelop:
    return UserEnvelop(user)

assert deserialize(User, {"user": {"name": "steve"}}, conversion=unwrap_user) == User("steve")
assert serialize(User, User("steve"), conversion=wrap_user) == {"user": {"name": "steve"}}

However, it's true that enveloping as a default conversion is trickier, because it's kind of recursive (User is deserialized from an "enveloped" User). But it can be implemented too, by bypassing registered conversions:

from dataclasses import dataclass, field
from apischema import deserialize, deserializer, serialize, serializer
from apischema.conversions import identity
from apischema.metadata import conversion

@dataclass
class User:
    name: str

@dataclass
class UserEnvelop:
    # Bypass recursive conversion in envelop field conversion with identity
    user: User = field(
        metadata=conversion(deserialization=identity, serialization=identity)
    )

@deserializer
def unwrap_user(user: UserEnvelop) -> User:
    return user.user

@serializer
def wrap_user(user: User) -> UserEnvelop:
    return UserEnvelop(user)

assert deserialize(User, {"user": {"name": "steve"}}) == User("steve")
assert serialize(User, User("steve")) == {"user": {"name": "steve"}}

And you can also build a more generic envelop wrapper like this:

from dataclasses import dataclass, field, make_dataclass
from typing import Callable, Optional, TypeVar, Union, overload
from apischema import deserialize, deserializer, serialize, serializer
from apischema.conversions import Conversion, identity
from apischema.metadata import conversion

Cls = TypeVar("Cls", bound=type)

@overload
def envelop(cls: Cls, /) -> Cls:
    ...
@overload
def envelop(name: str) -> Callable[[Cls], Cls]:
    ...
def envelop(arg: Optional[Union[type, str]] = None, /, name: Optional[str] = None):
    if isinstance(arg, type):
        field_name = name or arg.__name__.lower()
        bypass_conv = conversion(deserialization=identity, serialization=identity)
        envelop_field = (field_name, arg, field(metadata=bypass_conv))
        envelop_cls = make_dataclass(f"{arg.__name__}Envelop", [envelop_field])
        deserializer(
            Conversion(
                lambda env: getattr(env, field_name), source=envelop_cls, target=arg
            )
        )
        serializer(Conversion(envelop_cls, source=arg))
        return arg
    else:
        assert arg is None or isinstance(arg, str)
        return lambda cls: envelop(cls, name if name is not None else arg)  # type: ignore

@envelop
@dataclass
class User:
    name: str

assert deserialize(User, {"user": {"name": "steve"}}) == User("steve")
assert serialize(User, User("steve")) == {"user": {"name": "steve"}}

That being said, as I wrote at the beginning, I have no use case in mind where something like the marshmallow feature would be needed, i.e. where apischema features, especially conversions, are not sufficient to cover the case. Could you give me more details about your need?

But there is an important thing to consider: apischema not only (de)serializes types, but also generates their JSON/GraphQL schema. When you define a conversion, it applies to the (de)serialization operation but also to the generated schema. For my first example, it gives:

from apischema.json_schema import deserialization_schema

assert deserialization_schema(User, conversion=unwrap_user) == {
    "$schema": "http://json-schema.org/draft/2019-09/schema#",
    "type": "object",
    "properties": {
        "user": {
            "type": "object",
            "properties": {"name": {"type": "string"}},
            "required": ["name"],
            "additionalProperties": False,
        }
    },
    "required": ["user"],
    "additionalProperties": False,
}

On the other hand, marshmallow's pre_load and similar hooks are dynamic, and not really compatible with the apischema philosophy. For the envelope pattern, they would also require adding a dynamic modification of the schema (although this is already possible in apischema with extra schema).
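To illustrate the static versus dynamic distinction, here is a rough stdlib-only sketch. It is not apischema's implementation: `source_type` and `toy_schema` are hypothetical helpers, and `pre_load` is a plain function standing in for a dictionary-based hook. The idea is that a typed conversion function exposes its source type through its annotations, so a schema can be derived without running it, whereas a dict-to-dict hook gives a tool nothing to inspect.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, Callable, Dict, get_type_hints


@dataclass
class User:
    name: str


@dataclass
class UserEnvelop:
    user: User


def unwrap_user(user: UserEnvelop) -> User:
    # Typed conversion: the source type is visible in the signature.
    return user.user


def pre_load(data: dict) -> dict:
    # Dynamic hook: the annotations say nothing about the expected shape.
    return data["user"]


def source_type(converter: Callable[..., Any]) -> Any:
    """Return the annotated type of the converter's single parameter."""
    hints = get_type_hints(converter)
    hints.pop("return", None)
    (param_type,) = hints.values()
    return param_type


def toy_schema(tp: Any) -> Dict[str, Any]:
    """Build a rough JSON-schema-like dict; handles str and dataclasses only."""
    if tp is str:
        return {"type": "string"}
    if is_dataclass(tp):
        hints = get_type_hints(tp)
        return {
            "type": "object",
            "properties": {f.name: toy_schema(hints[f.name]) for f in fields(tp)},
            "required": [f.name for f in fields(tp)],
        }
    return {}  # unknown type: nothing can be said statically


# The typed conversion yields the enveloped shape without being called.
assert toy_schema(source_type(unwrap_user))["required"] == ["user"]
# The dict-based hook yields no structural information at all.
assert toy_schema(source_type(pre_load)) == {}
```

The toy schema omits details such as "$schema" and "additionalProperties"; it only shows where the structural information comes from.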