Open tijmenr opened 1 month ago
Another unexpected behavior for convert:
Given:
>>> import msgspec
>>> class A(msgspec.Struct, rename='pascal'): a_x: int
...
>>> class B(msgspec.Struct, rename='kebab'): a_x: int
...
>>> class C(msgspec.Struct): AX: int
...
The following conversions work, making it seem as though convert is flexible on whether it uses the "real" field name or the "encoded" one:
>>> msgspec.convert(A(1), B, from_attributes=True) # Apparently uses a_x from A
B(a_x=1)
>>> msgspec.convert(B(1), A, from_attributes=True) # Apparently uses a_x from B
A(a_x=1)
>>> msgspec.convert(C(1), A, from_attributes=True) # Apparently uses AX from C
A(a_x=1)
But this flexibility between "real" and "encoded" names apparently applies only to the destination type, because the following fails:
>>> msgspec.convert(A(1), C, from_attributes=True) # Cannot use AX from A
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
msgspec.ValidationError: Object missing required field `AX`
And when converting from a dict object rather than from a Struct, it only works if the dict contains the "encoded" field names:
>>> msgspec.convert({'AX':1}, A) # Using encoded field name works
A(a_x=1)
>>> msgspec.convert({'a_x':1}, A) # Using real field name fails
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
msgspec.ValidationError: Object missing required field `AX`
>>> # So you cannot do a roundtrip...
>>> msgspec.convert(msgspec.structs.asdict(A(1)), A)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
msgspec.ValidationError: Object missing required field `AX`
Again, this behavior makes it difficult to use msgspec to translate similar JSON structures between different field name conventions.
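If struct-level validation is not needed for the translation step, one fallback is to rename keys on the decoded builtins directly. The sketch below uses only the standard library; kebab_to_pascal and rename_keys are hypothetical helpers, and the casing rule is an assumption that may not match msgspec's rename='pascal' for every input.

```python
import json


def kebab_to_pascal(key: str) -> str:
    """Turn 'nested-item' into 'NestedItem' (assumed casing rule)."""
    return "".join(part[:1].upper() + part[1:] for part in key.split("-"))


def rename_keys(value, rename):
    """Recursively apply `rename` to every dict key in a JSON-like value."""
    if isinstance(value, dict):
        return {rename(k): rename_keys(v, rename) for k, v in value.items()}
    if isinstance(value, list):
        return [rename_keys(item, rename) for item in value]
    return value


source = json.loads('{"a-x": 1, "nested-item": {"b-y": [{"c-z": null}]}}')
translated = rename_keys(source, kebab_to_pascal)
print(json.dumps(translated))
```

The trade-off is that absent keys stay absent and null stays null, but no type checking happens along the way.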
Description
The convert function does not handle UNSET values as I expected (or I am missing some detail in how optionality and unset work together). If a field has a union type that allows msgspec.UnsetType, it seems to ignore that during the conversion:

The value a fits the struct B (as it allows an unset y field), so I would expect conversion of a to an instance of B to succeed, with the value of the y field set to either UNSET (as it is in a) or the defined default in B (which in this case happens to be UNSET as well; this would need documentation). However, it fails on this unset value of the y field:

(Trying to "convert" a into a new instance of struct A does not yield an error, probably because there is some shortcut logic at play).

The same error also occurs when trying to convert a dict with an explicit unset value:

I currently have a use case where I basically need to translate the field names of a multi-level JSON structure from kebab-case to PascalCase, and my idea was to create two roughly the same structs K and P (using the handy rename option so that each gets the desired encoded field names), decode the source document using struct K, convert it into struct P, and use that last one for output. However, this issue with unset fields hinders that approach. (I'm going to try and change the default value from UNSET to None as a workaround, but a side effect is not being able to distinguish between a field just not being present, or being present with a null value in the source document.)

On a side note (not sure if that would be a separate bug or a feature), having a dict as an intermediate step between K and P does not help, because msgspec.structs.asdict does not honor the omit_defaults option of the struct, so it does include the unset fields in the dict, and this means convert cannot work on the dict either (unless I first rebuild the dict myself to remove all unset fields).