Fatal1ty / mashumaro

Fast and well tested serialization library
Apache License 2.0

Query string support #172

Closed tkukushkin closed 10 months ago

tkukushkin commented 11 months ago

Is your feature request related to a problem? Please describe.

I would like to use mashumaro to parse query string parameters. Arrays in query strings are usually described by repeating the same name, like foo=1&foo=2. If I'm not mistaken, mashumaro does not support such structures atm.

I've tried something like this

from dataclasses import dataclass

import multidict
from mashumaro import DataClassDictMixin

@dataclass
class Foo(DataClassDictMixin):
    x: float
    y: list[int]

print(Foo.from_dict(multidict.MultiDict([('x', '1.2'), ('y', '2'), ('y', '3')])))

and I was surprised that this value passes validation and deserialises to Foo(x=1.2, y=[2]). I didn't expect it to work properly, but at the same time it's a bit odd because {'x': '1.2', 'y': '2'} doesn't pass the validation.


Fatal1ty commented 11 months ago

Hi @tkukushkin

I’m currently on vacation and will answer you next week. In the meantime, you can read an example of using MultiDict with SerializationStrategy here: https://github.com/Fatal1ty/mashumaro#third-party-generic-types

tkukushkin commented 11 months ago

Hi! I've seen this example, but if I'm not mistaken, it is not about reading a dataclass from a multidict, but about reading some data into a multidict. Have a good vacation)

Fatal1ty commented 10 months ago

Ok, I'm here. First, different web frameworks use different multidict implementations for working with query strings; the multidict package is not the only one. Since there is no standard implementation that dictates how the dunder methods should behave and which additional methods should exist, I'm not sure that mashumaro should have native support for multidict structures. However, you can make it work with multidict structures using a pre-deserialize hook. In the following example, I'll show you how to do this with the multidict package, whose MultiDict has a getall method:

from dataclasses import dataclass
from typing import get_type_hints

import multidict

from mashumaro import DataClassDictMixin
from mashumaro.core.meta.helpers import get_type_origin

class DataClassMultiDictMixin(DataClassDictMixin):
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        list_fields = set()
        for fname, ftype in get_type_hints(cls).items():
            # you can check not only for lists but for other collection types
            if issubclass(get_type_origin(ftype), list):
                list_fields.add(fname)
        cls.__list_fields__ = list_fields

    @classmethod
    def __pre_deserialize__(cls, d):
        # d is modified in place; use d = d.copy() first if the caller's data must stay intact
        for name in cls.__list_fields__:
            d[name] = d.getall(name)
        return d

@dataclass
class Foo(DataClassMultiDictMixin):
    x: float
    y: list[int]

o = Foo.from_dict(multidict.MultiDict([("x", "1.2"), ("y", "2"), ("y", "3")]))
assert o == Foo(x=1.2, y=[2, 3])

The idea is simple. Each key in the query string has either a single value or multiple values. Here we use the dataclass type hints to collect the names of all fields that are supposed to have multiple values, and the hook fetches all the values for those keys with getall. We could do this reflection at runtime inside the hook, but it's more performant to do it once, when the class is created.
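
For frameworks whose query container has no getall-style method, the same idea can be sketched with only the standard library. This is a rough illustration, not part of mashumaro: query_to_dict is a hypothetical helper name, and it assumes the raw query string is available and that only plain list[...] annotations mark multi-value fields.

```python
from dataclasses import dataclass
from typing import get_origin, get_type_hints
from urllib.parse import parse_qs


def query_to_dict(cls, query: str) -> dict:
    """Hypothetical helper: turn a raw query string into a plain dict."""
    # parse_qs maps every key to a list of values,
    # e.g. {"x": ["1.2"], "y": ["2", "3"]}
    parsed = parse_qs(query)
    result = {}
    for name, ftype in get_type_hints(cls).items():
        if name not in parsed:
            # leave missing keys to the deserializer's own error handling
            continue
        values = parsed[name]
        # keep the whole list for list-typed fields, otherwise take the last value
        result[name] = values if get_origin(ftype) is list else values[-1]
    return result


@dataclass
class Foo:
    x: float
    y: list[int]


print(query_to_dict(Foo, "x=1.2&y=2&y=3"))
# {'x': '1.2', 'y': ['2', '3']}
```

If Foo also inherits from DataClassDictMixin, the resulting plain dict should be usable as the argument to Foo.from_dict without any hook. Taking the last value for repeated scalar keys is an arbitrary choice here; MultiDict's item access returns the first one instead.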

Fatal1ty commented 10 months ago

@tkukushkin can we close this issue now?

Fatal1ty commented 10 months ago

I’m closing this one. Feel free to reopen if you have further questions.