jcrist / msgspec

A fast serialization and validation library, with builtin support for JSON, MessagePack, YAML, and TOML
https://jcristharif.com/msgspec/
BSD 3-Clause "New" or "Revised" License

Support optional lists of included or excluded fields when calling `msgspec.structs.asdict()` #743

Open bdoms opened 2 weeks ago

bdoms commented 2 weeks ago

Description

Imagine that I have a user being created from a web form, and it's defined like:

from msgspec import Struct

class User(Struct):
    first_name: str
    middle_name: str
    last_name: str
    username: str
    title: str
    favorite_color: str
    email: str
    password: str

Now, I don't need to do much data manipulation before saving this to the database via my ORM. But I do need to hash the password first, and in fact I'm storing both that and the email in a different schema and table just to be safe. Currently I have to do something like:

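import msgspec

user = User(...) # assume this is a valid User instance

# Convert to a dict, then manually strip the sensitive fields
data = msgspec.structs.asdict(user)
del data['email']
del data['password']

# DatabaseUser and AuthUser are my ORM models
user_id = DatabaseUser.create(**data)

# hash() stands in for a real password-hashing function, not the Python builtin
AuthUser.create(user_id=user_id, email=user.email, password=hash(user.password))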

That's not the end of the world for a small number of fields, but it becomes cumbersome and error-prone when many fields are split across several different places. And forgetting a single del can be catastrophic: in the example above, it would send the plaintext password to the wrong table.

Other frameworks like Pydantic solve this by accepting both include and exclude lists as options: https://docs.pydantic.dev/2.9/api/base_model/#pydantic.BaseModel.model_dump

I think msgspec getting a similar feature would be highly useful:

data = msgspec.structs.asdict(user, exclude=('email', 'password'))

Note that this may be related to https://github.com/jcrist/msgspec/issues/199, depending on whether the private=True flag being discussed there would affect the behavior of asdict(). Personally, I'd prefer keeping them separate, since when I'm calling asdict it's usually with a completely different intention than encode/decode.

tijmenr commented 1 week ago

If additional options are being considered for msgspec.structs.asdict: Given that the fields in a struct have a "real" name (name), but can also have a different name used for encoding/decoding (encode_name), another useful option would be a flag to choose between using the "real" field names as dict keys (which is what asdict does currently, so should be the default), or the "encode" names.