jcrist / msgspec

A fast serialization and validation library, with builtin support for JSON, MessagePack, YAML, and TOML
https://jcristharif.com/msgspec/
BSD 3-Clause "New" or "Revised" License

Support decoding empty strings as unset for all field types #685

Open aolesky opened 1 month ago

aolesky commented 1 month ago

Description

I am decoding messages from a schema that sends an empty string to represent an unset value. As far as I can tell, there is no way to handle this case without creating a custom type and processing it in a decode hook (dec_hook). It could probably also be handled with internal fields that are populated from duplicate external fields in the __post_init__ method. Either way, this complicates the struct definition and gives up some of the performance benefits of the library.

An optional flag to the decoder could make sense for this feature:

from decimal import Decimal

import msgspec

class Message(msgspec.Struct, kw_only=True, frozen=True):
    error_msg: str | msgspec.UnsetType = msgspec.UNSET
    value: Decimal | msgspec.UnsetType = msgspec.UNSET

success = b'{"error_msg": "", "value": "2.5"}'
failure = b'{"error_msg": "Failed", "value": ""}'

print(msgspec.json.decode(success, type=Message, empty_string_as_unset=True))
#> Message(value=Decimal('2.5'))

print(msgspec.json.decode(failure, type=Message, empty_string_as_unset=True))
#> Message(error_msg="Failed")