explosion / srsly

🦉 Modern high-performance serialization utilities for Python (JSON, MessagePack, Pickle)
MIT License
435 stars 31 forks source link

Fix Typing for `JSONOutput` and `JSONOutputBin` #80

Closed rmitsch closed 1 year ago

rmitsch commented 1 year ago

Description

Add Tuple[Any, ...] for JSONOutput and JSONOutputBin. Otherwise type hinting will indicate mismatching types if tuples are an expected type, e.g. for values: Tuple[Any, ...] = srsly.load(value).

Also fixes CI image for Linux with Python 3.6.

Related to https://github.com/explosion/srsly/pull/79.

Types of change

Bug fix.

Checklist

rmitsch commented 1 year ago

Is there actually a reason for the JSONInput/JSONInputBin and JSONOutput/JSONOutputBin types not matching? It's (including this PR)

JSONOutput = Union[str, int, float, bool, None, Dict[str, Any], List[Any], Tuple[Any, ...]]
JSONOutputBin = Union[bytes, str, int, float, bool, None, Dict[str, Any], List[Any], Tuple[Any, ...]]

vs.

JSONInput = Union[str, int, float, bool, None, Dict[str, Any], List[Any], Tuple[Any, ...], OrderedDict]
JSONInputBin = Union[bytes, str, int, float, bool, None, Dict[str, Any], List[Any], Tuple[Any, ...], OrderedDict]

- i.e. is it intentional that we have OrderedDict only for the input side of things?

polm commented 1 year ago

I had to look at these types, since if I've ever interacted with them it's been a while. Given that, oen thing that I would not have remembered off the top of my head is that they're also used for the messagepack deserialization. Keeping that in mind...

How can you get tuples from deserializing JSON? I thought maybe messagepack made it possible, but it seems that's not the case. Did you have code where you were expecting tuples from JSON?

Similarly, as far as I'm aware the JSON decoder won't give OrderedDicts by default, even with messagepack. You can make it work that way, but we haven't done that. So it seems reasonable that output wouldn't include OrderedDicts.

rmitsch commented 1 year ago

Hm, seems you are correct. A minimal version of my usecase is:

deserialized = srsly.json_loads(srsly.json_dumps(data=(1, 2, 3)))
print(deserialized)
print(type(deserialized))

I initially though deserialized was a tuple - hence the PR - but it's returned as a list. Treating it as a tuple in my code just happened to work because it's a list. Thanks for checking OrderedDict!

I'd close this PR, if you don't object?

rmitsch commented 1 year ago

(Or we change this PR to only include the CI update, which should be done anyway)

polm commented 1 year ago

It's fine to close this. I would say do the CI update as a separate tiny PR, just so it's not mixed in here.