Closed LonelyVikingMichael closed 5 months ago
Thanks for opening this.
First, standardizing on terminology. Given your field `thing_id` above, `"thing_id"` is the attribute name, `"thingId"` is the renamed name.
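To make the two terms concrete, here is a minimal sketch of how a snake_case attribute name maps to its camelCase renamed name. `to_camel` is a hypothetical helper written for illustration; it is not msgspec's own renaming code and only handles simple snake_case input.

```python
def to_camel(attr_name: str) -> str:
    """Hypothetical helper: derive a camelCase name from a snake_case one."""
    head, *rest = attr_name.split("_")
    # Keep the first word as-is and capitalize every following word.
    return head + "".join(part.capitalize() for part in rest)


attribute_name = "thing_id"                # the name used in Python code
renamed_name = to_camel(attribute_name)    # the name used in the serialized form
```

With this helper, `renamed_name` is `"thingId"`, while a name without underscores is left unchanged.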
Originally `from_attributes` was implemented to always use the attribute names rather than the renamed names (so you'd get the behavior you wanted here). I ended up switching it to use the renamed names for "uniformity" in https://github.com/jcrist/msgspec/pull/431/commits/0f32524a6fbe264781800a4b7b33832200c98d97. Looking back I'm not sure if this was the best decision. Some random thoughts:
There are 3 kinds of things that `convert` may turn into a struct/dataclass:
- `dict`. This is assumed to have come from some serialization protocol (the main point of msgspec). The renamed fields are always used, assuming that the renaming is done to match field naming conventions in that protocol.
- `from_attributes=True`. The assumption is this is also some ORM-like DB-related record (from this request). Since this is by attribute, the attributes must be valid attribute names. Perhaps these should always use the attribute names? On the other hand, diverging from 2 above, which has the same use case, may be confusing.

A few options:
- `populate_by_name`, but would probably be a kwarg to `convert` rather than an option on the struct type. Would this apply to dicts and mappings as well? I'm not sure. I'm reluctant to add a new option here, but could if needed.
- `by_name=True` or something. This wouldn't work well for nested types if there are realistic use cases where you might want to use the original names for some objects and the renamed names for others. Like the above, I'm reluctant to add a new config option here.

Thoughts?
Hi Jim, thanks for taking the time to provide context.
In my own use cases (mostly REST APIs), I try to have everything internal to my application use snake_case, whereas the public JSON schemas use camelCase. However, this linked issue comment outlines that this isn't always the case.
As for solutions, having `convert` fall back on attribute names or vice versa would be fine by me, though this might be a problem in applications where performance is critical? There's probably no "right" choice between attribute names first or renamed names first; we can make valid arguments for both.
For what it is worth, Pydantic's `by_alias`/`populate_by_name` options have been helpful to me in the past, especially when integrating with third parties where naming is pre-defined, but I respect your stance on yet another config option.
This is fixed by #636. I ended up going for the inference option: in practice the inference is pretty efficient and only results in at most 1 unnecessary `getattr` call. I opted to prioritize the attribute names over the renamed names here, since in most cases converting an arbitrary object by attribute will want to use the attribute names.
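The lookup order described above can be sketched in plain Python. This is only an illustration of "attribute name first, renamed name as fallback", not msgspec's actual implementation; `read_field`, `SnakeRecord`, and `CamelRecord` are hypothetical names introduced for the example.

```python
from dataclasses import dataclass

_MISSING = object()  # sentinel, so that None remains a valid field value


def read_field(obj, attr_name, renamed_name):
    """Hypothetical helper: try the attribute name, then the renamed name."""
    value = getattr(obj, attr_name, _MISSING)
    if value is _MISSING and renamed_name != attr_name:
        # The one extra lookup, paid only when the attribute name is absent.
        value = getattr(obj, renamed_name, _MISSING)
    if value is _MISSING:
        raise AttributeError(
            f"object has neither {attr_name!r} nor {renamed_name!r}"
        )
    return value


@dataclass
class SnakeRecord:
    thing_id: int  # exposes the attribute name


class CamelRecord:
    def __init__(self, thingId):
        self.thingId = thingId  # exposes only the renamed name


snake_value = read_field(SnakeRecord(thing_id=1), "thing_id", "thingId")
camel_value = read_field(CamelRecord(thingId=2), "thing_id", "thingId")
```

In this sketch an object that exposes `thing_id` is resolved on the first lookup, and an object that only exposes `thingId` costs one failed `getattr` before the fallback succeeds.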
Description
Some example code using a `Struct` with `rename="camel"`:
Raises:
Is there a way around this? For this specific example, we can leverage unpacking the dataclass' `asdict()` result, but this isn't always straightforward with an ORM instance. I actually discovered this behaviour using a SQLAlchemy model.
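A minimal sketch of the `asdict()` unpacking workaround mentioned above. To stay dependency-free, a plain dataclass (`Thing`) stands in for the `Struct` here; that stand-in and the class names are assumptions for illustration, and the original example code is not reproduced.

```python
from dataclasses import asdict, dataclass


@dataclass
class ThingRecord:
    """Source object, e.g. something loaded from a database."""
    thing_id: int


@dataclass
class Thing:
    """Stand-in for the target Struct (assumption: the real one uses rename="camel")."""
    thing_id: int


record = ThingRecord(thing_id=1)

# asdict() yields {"thing_id": 1}; unpacking it passes keyword arguments,
# which are matched by attribute name rather than by any renamed name.
thing = Thing(**asdict(record))
```

The limitation raised in the issue still applies: this only works when the source object can be turned into a dict of its attributes, which an ORM instance does not necessarily offer.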