marshmallow-code / marshmallow

A lightweight library for converting complex objects to and from simple Python datatypes.
https://marshmallow.readthedocs.io/
MIT License

Object class and schema definition combined, ORM-style #2000

Open tgross35 opened 2 years ago

tgross35 commented 2 years ago

Hello all,

I would like to ask if there has been consideration about adding a model that allows for data storage and serialization/deserialization in one. This is a pretty common use case, avoiding redundancy between defining schemas and the data they produce. This could be done via an ORM-style model like SQLAlchemy has.

Something like this was discussed before (https://github.com/marshmallow-code/marshmallow/issues/1043), but since the recipe is simple and useful, I can see value in adding it to marshmallow.

There are some libraries that accomplish similar goals, but they have some drawbacks:

A possible API could look like the following:

from datetime import datetime
from pprint import pprint
from typing import Any

from marshmallow import MarshModel, INCLUDE, fields, validates, ValidationError

class UserModel(MarshModel):
    __meta_args__ = {'unknown': INCLUDE}

    name: str = fields.Str()
    email: str = fields.Email()
    created_at: datetime = fields.DateTime()

    @validates('email')
    def validate_email(self, value: Any) -> None:
        if '@' not in value:
            raise ValidationError('Not an email address!')

class BlogModel(MarshModel):
    title: str = fields.String()
    author: dict = fields.Nested(UserModel)

user_data = {"name": "Ronnie", "email": "ronnie@stones.com"}
user = UserModel.load(user_data)

blog = BlogModel(title="Something Completely Different", author=user)
result = blog.dump()
pprint(result)
# {'title': 'Something Completely Different',
#  'author': {'name': 'Ronnie',
#             'email': 'ronnie@stones.com',
#             'created_at': '2021-08-17T14:58:57.600623+00:00'}}

MarshModel would require:

If there is interest, I may be able to submit a PR.

tgross35 commented 2 years ago

I went ahead and figured out some of the needed logic. The following works:

class UserModel(MarshModel):
    __meta_args__ = {"unknown": INCLUDE}

    name: str = fields.Str()
    email: str = fields.Email()
    created_at: datetime = fields.DateTime()

class BlogModel(MarshModel):
    title: str = fields.String()
    author: UserModel = MMNested(UserModel)

user_data = {"name": "Ronnie", "email": "ronnie@stones.com"}
user = UserModel().load(user_data)

blog = BlogModel(title="Something Completely Different", author=user)
result = blog.dump()
pprint(result)

# {'author': {'created_at': None, 'email': 'ronnie@stones.com', 'name': 'Ronnie'},
#  'title': 'Something Completely Different'}

This comes from the following main logic:

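import typing

from marshmallow import Schema, fields

# _get_meta_class, _create_init_fn and _register_model are helpers from the full file linked below.
class MarshModel:
    _ma_schema: Schema
    _field_names: list[str]
    __meta_args__: dict[str, typing.Any] = {}

    def __init_subclass__(cls, **kw) -> None:
        super().__init_subclass__(**kw)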

        cls._field_names = [
            f for f in dir(cls) if isinstance(getattr(cls, f), fields.Field)
        ]
        cls._ma_schema = _get_meta_class(cls.__meta_args__).from_dict(
            {name: getattr(cls, name) for name in cls._field_names}
        )()

        cls.__init__ = _create_init_fn(cls._ma_schema.fields)

        _register_model(cls)

    def load(self, *args, **kw):
        loaded = self._ma_schema.load(*args, **kw)
        for k, v in loaded.items():
            setattr(self, k, v)
        return self

    def dump(self):
        return self._ma_schema.dump(self)

The full file is here: https://github.com/tgross35/marshmallow-mapper-test/blob/f346dca5ab47dbcf0971eb97d8f185b3c294cf72/mapper.py

Typing is wrong, signatures aren't working right, and hooks don't work, but the basic implementation doesn't seem too bad.

tgross35 commented 2 years ago

Let me generalize this a bit. The main goal is for the return type of .load() to be something that can be statically type checked. That helps IDEs (autocomplete is awesome) and also allows better validation of how the code is used (e.g. catching errors with mypy). The class implementation here is one way to go about this, but I think similar results could be achieved by dynamically creating a TypedDict for the return type, which has the same benefits.

kyleposluns commented 2 years ago

This is the most desired marshmallow feature among my colleagues.

likeyiyy commented 2 years ago

I want this feature too

terra-alex commented 1 year ago

Hello! What do you think of using Desert as a workaround at the moment? It seems like it accomplishes most of this

(with some issues dealing with things like marshmallow-oneofschema, but as far as I could tell that was my only issue with it)