Open intentionally-left-nil opened 5 months ago
In case it's helpful, here's my current workaround.
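from types import MappingProxyType
from typing import Any, Dict, Type, get_args, get_origin

import msgspec

# Both hooks are classmethods on the same class.
@classmethod
def enc_hook(cls, obj: Any) -> Any:
    if isinstance(obj, MappingProxyType):
        # Encode the proxy as a plain dict (shallow copy of the proxied mapping).
        return obj.copy()
    else:
        raise NotImplementedError(f"Unknown type: {type(obj)}")

@classmethod
def dec_hook(cls, type: Type, obj: Any) -> Any:
    if type is MappingProxyType or get_origin(type) is MappingProxyType:
        args = get_args(type)
        if len(args) == 2:
            key_type: Any = args[0]
            value_type: Any = args[1]
            # dec_hook is an argument to msgspec.convert, not to MappingProxyType.
            return MappingProxyType(msgspec.convert(obj, Dict[key_type, value_type], dec_hook=cls.dec_hook))
        else:
            return MappingProxyType(msgspec.convert(obj, dict, dec_hook=cls.dec_hook))
    # `type` is the parameter here (it shadows the builtin), so report it directly.
    raise NotImplementedError(f"Unknown type: {type}")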
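The dispatch in dec_hook depends on how typing.get_origin and typing.get_args treat a bare versus a parametrized MappingProxyType. Here is a minimal standalone sketch of just that step, assuming Python 3.9+ (where MappingProxyType can be subscripted). The helper name split_proxy_type is hypothetical and is not part of msgspec.

```python
from types import MappingProxyType
from typing import Any, Optional, Tuple, get_args, get_origin


def split_proxy_type(tp: Any) -> Tuple[bool, Optional[Any], Optional[Any]]:
    """Hypothetical helper: return (is_proxy, key_type, value_type).

    key_type and value_type are None when the annotation is a bare
    MappingProxyType with no type arguments.
    """
    if tp is MappingProxyType:
        return True, None, None
    if get_origin(tp) is MappingProxyType:
        args = get_args(tp)
        if len(args) == 2:
            return True, args[0], args[1]
        return True, None, None
    return False, None, None


bare = split_proxy_type(MappingProxyType)
typed = split_proxy_type(MappingProxyType[str, int])
other = split_proxy_type(dict)
```

The bare case is why the workaround has two branches: with no key and value types available, the only sensible target for msgspec.convert is a plain dict.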
Thanks for opening this. Your workaround as posted is how I'd handle this today.
For encoding I'd expect only the slightest of speedups if we supported this natively. MappingProxyType doesn't expose a native API, so the only difference is that the call to .copy() would be a bit quicker. Supporting this as a builtin for encoding is very easy to do, though.
For decoding, native support could be made quicker since we could avoid the copies and 2nd traversal done by calling msgspec.convert. Supporting this for decoding is less easy (there's more plumbing needed here) but still doable.
That said, types.MappingProxyType is a fairly uncommon type to use. Adding additional builtin types increases the maintenance burden on msgspec, generally we only add types that are common or can be handled significantly faster when supported as builtins.
Can you say more about why you're trying to use a MappingProxyType? Due to how they're implemented, MappingProxyType objects will always be slower to create, access, encode, and decode. The only thing they give you is pseudo-immutability (and then only if the proxied dict isn't accessible elsewhere).
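To make the "pseudo-immutability" caveat concrete, here is a small stdlib-only sketch. The variable names are illustrative only.

```python
from types import MappingProxyType

backing = {"a": 1}
proxy = MappingProxyType(backing)

# Writes through the proxy are rejected at runtime with a TypeError.
try:
    proxy["b"] = 2  # type: ignore[index]
    write_rejected = False
except TypeError:
    write_rejected = True

# The proxy is only a read-only view: anyone still holding `backing`
# can change what the proxy shows.
backing["b"] = 2
seen_through_proxy = dict(proxy)

# Wrapping a dict that nothing else references closes that hole.
sealed = MappingProxyType({"a": 1})
```

After this runs, seen_through_proxy contains both keys even though the proxy itself never accepted a write.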
Hi @jcrist, thanks for taking a look.
Adding additional builtin types increases the maintenance burden on msgspec, generally we only add types that are common or can be handled significantly faster when supported as builtins.
Given this, I think it's completely reasonable to not implement this suggestion. I also can't make any argument that MappingProxyType is widely used; I only saw it in some relatively obscure posts about immutability.
Can you say more about why you're trying to use a MappingProxyType?
Once deserialized, I'm using immutability to detect changes to a nested data structure. E.g. I don't need to use == and have that recurse down the whole tree; instead I can use object identity at the root level (or at any sub-tree I'm interested in). It's also trivial for me to implement onchange detection, because I only need to override the setter property at the root level, and that will catch any change anywhere in the entire tree.
That requires callers to remember to create new copies rather than updating existing ones when making changes, hence the requirement for immutability.
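A rough sketch of that pattern, using only the standard library. The Store class and the with_item helper are hypothetical names made up for illustration; this is one way to arrange it under the stated copy-on-change discipline, not a description of the poster's actual code.

```python
from types import MappingProxyType
from typing import Any, Callable, List, Mapping, Tuple


def with_item(mapping: Mapping[str, Any], key: str, value: Any) -> Mapping[str, Any]:
    """Hypothetical helper: return a new read-only mapping with one key replaced."""
    return MappingProxyType({**mapping, key: value})


class Store:
    """Hypothetical holder: the root setter is the only place a change can land."""

    def __init__(self, root: Mapping[str, Any],
                 on_change: Callable[[Mapping[str, Any], Mapping[str, Any]], None]) -> None:
        self._root = root
        self._on_change = on_change

    @property
    def root(self) -> Mapping[str, Any]:
        return self._root

    @root.setter
    def root(self, new_root: Mapping[str, Any]) -> None:
        # Identity check only; no recursive == over the tree.
        if new_root is not self._root:
            old, self._root = self._root, new_root
            self._on_change(old, new_root)


settings = MappingProxyType({"theme": "dark"})
profile = MappingProxyType({"name": "ada"})
root = MappingProxyType({"settings": settings, "profile": profile})

events: List[Tuple[Mapping[str, Any], Mapping[str, Any]]] = []
store = Store(root, lambda old, new: events.append((old, new)))

# Changing a nested value means rebuilding the path from that node up to the root.
new_root = with_item(root, "settings", with_item(settings, "theme", "light"))
store.root = new_root
store.root = new_root  # same object again, so no change is reported

profile_unchanged = store.root["profile"] is profile
settings_changed = store.root["settings"] is not settings
```

Untouched subtrees keep their identity across the update, which is what makes the `is` comparison usable at any level.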
There probably are other options, such as subclassing UserDict and overriding __setitem__, but then the static type checker is blissfully unaware. MappingProxyType seemed to be the simplest solution. The only challenge I ran into was that serializing the structure to disk with msgspec (and then re-parsing it) doesn't work.
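For comparison, here is roughly what the UserDict alternative might look like. FrozenDict is a hypothetical name, and the sketch only blocks item assignment and deletion; other routes such as `|=` or direct access to `.data` are not covered.

```python
from collections import UserDict
from typing import Any, Mapping, Optional


class FrozenDict(UserDict):
    """Hypothetical runtime-only guard against item assignment and deletion."""

    def __init__(self, initial: Optional[Mapping[Any, Any]] = None) -> None:
        super().__init__()
        # Fill the underlying dict directly; going through update() would
        # call the __setitem__ override below and fail.
        self.data = dict(initial or {})

    def __setitem__(self, key: Any, value: Any) -> None:
        raise TypeError("FrozenDict does not support item assignment")

    def __delitem__(self, key: Any) -> None:
        raise TypeError("FrozenDict does not support item deletion")


frozen = FrozenDict({"a": 1})

# UserDict is still declared as a MutableMapping, so a static type checker
# accepts this line; the mistake only shows up when the code runs.
try:
    frozen["b"] = 2
    assignment_rejected = False
except TypeError:
    assignment_rejected = True
```

By contrast, MappingProxyType is typed as a read-only Mapping with no item-assignment method, so the same mistake can be reported before the code ever runs.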
Description
For creating immutable types, msgspec already supports serializing/parsing: frozenset, tuple, dataclass(frozen=True).
The only primary data type missing here is a dict. The immutable variant of a dict is MappingProxyType.
Thanks to the extensibility, it's possible today to create a custom encoder/decoder for this type. However, it would be great to include this as a supported, builtin type.