Open HansBrende opened 5 months ago
However, this has the downside that when I am programmatically constructing an instance of this struct, it will no longer fail fast if I've forgotten one of the arguments. I want to be sure that as new fields are added to this struct, I'm not forgetting to add them in my code in other places.
That's an interesting use case. What about using a custom classmethod as a constructor in the places where you always want to explicitly set each field? Something like:
from __future__ import annotations

import msgspec


class Demo(msgspec.Struct):
    field_one: int | None = None
    field_two: int | None = None
    field_three: int | None = None

    @classmethod
    def new(cls, *, field_one: int, field_two: int, field_three: int) -> Demo:
        return cls(field_one, field_two, field_three)


# elsewhere in your code...
demo = Demo.new(field_one=1, field_two=2, field_three=3)
print(demo)
#> Demo(field_one=1, field_two=2, field_three=3)

# any locations where you forgot to add `field_three` would then error
Demo.new(field_one=1, field_two=2)
When adding a new field to the struct you'd need to remember to also add it to the classmethod, but the close proximity of the two should help you remember. Heck, you could even enforce that the two stay aligned with a check at import time via an __init_subclass__ hook if you wanted to (I'm happy to provide an example if this interests you). IMO this is a nice low-tech solution to a code hygiene problem.
Add an omit_none option
This also might make sense, but would obviously take more work on my end.
As a meta conversation, I'm now wondering if options like this or omit_defaults should be set per-call to encode (or on the Encoder once) rather than on the type. The logic being that sometimes you might want to encode the full model and sometimes you might want a more compact representation - but these attributes are more specific to the call site than to the type being represented?
@jcrist actually, I already have the class method you speak of, and by "other places in my code" I was referring to this one class method 😆
The __init_subclass__ hook you mention would be interesting indeed, would love to see that example! Provided it adds minimal overhead (I'm instantiating millions of these quite often) I think that could work. All I need is a basic fail-fast sanity check to make sure I'm populating all fields (previously accomplished very smoothly simply by not having defaults set).
I agree with your meta comment... it seems like that would provide more flexibility. Although, for my own use-cases (currently) I personally do not need multiple flavors of serialization. I could see how it would be annoying though if I at some point in the future needed to serialize two different ways.
One possible downside I could see is that to generate the "schema" correctly you'd need to also supply the encoder you use to serialize... as the schema might also depend on encoder arguments. But on the other hand, maybe that is not a downside at all. I think something similar happened in pydantic recently (as far as needing additional arguments to generate schema properly), because FastAPI now generates potentially two different schemas... one "deserialization" schema, and one "serialization" schema... as what is required vs. not changes depending on whether reading or writing.
@jcrist
As a meta conversation, I'm now wondering if options like this or omit_defaults should be set per-call to encode (or on the Encoder once) rather than on the type
As in #549 (my reply here), I'd say the ideal is for the class-based parameter to define a default, with a way to override it in the encoder. But if this is too much work, the encoder seems like the best option, as it has no downsides and gives finer control.
I just came here to second having an omit_defaults argument to encode, because I totally have a use case for that.
I came here for the same thing. We need this feature for some CRUD operations. A field's default might be bool=True, but in a PATCH scenario we don't want to update the database with that default if the field was not sent.
Making omit_defaults and repr_omit_defaults inheritable might be an alternative to having it as an encode option. Btw. repr_omit_defaults is currently implemented, but not yet documented.
Making omit_defaults and repr_omit_defaults inheritable might be an alternative to having it as encode option.
Both of those options are already inheritable.
import msgspec


class Base(msgspec.Struct, omit_defaults=True, repr_omit_defaults=True):
    pass


class Point(Base):
    x: int = 0
    y: int = 0


print(Point())
#> Point()
print(msgspec.json.encode(Point()))
#> b'{}'
Btw. repr_omit_defaults is currently implemented, but not yet documented.
It is, in the API docs
@jcrist Oh, didn't see that. I was looking for it in the Struct doc here and there is no repr_omit_defaults: https://jcristharif.com/msgspec/structs.html
Description
Problem: I want to be able to omit None fields from the json serialization, to make the serialization more compact. Currently, I can do this by setting omit_defaults to True, and adding a default of None to each field. However, this has the downside that when I am programmatically constructing an instance of this struct, it will no longer fail fast if I've forgotten one of the arguments. I want to be sure that as new fields are added to this struct, I'm not forgetting to add them in my code in other places.
There are two ways I see that this could be accomplished, off the top of my head:
1. Add an init_omit_defaults option (similar to repr_omit_defaults) where, if set to True, the default for each field will not be added to the __init__ method (only added during deserialization).
2. Add an omit_none option (similar to fastapi's response_model_exclude_none option) where the serialization automatically excludes None regardless of whether it is the default or not.
Both of these would accomplish exactly what I need.
Other approaches I've considered:
I could do random_field: Union[FieldType, UnsetType], with no default, which would accomplish almost exactly what I'm after. This has two very annoying downsides, however:
1. Converting None to UNSET for each field.
2. Converting UNSET back to None for each field. This would probably lead to more potential logic errors than I was originally trying to avoid, as it is used all over the place and all the existing code assumes that None is the "not present" value.