Fatal1ty / mashumaro

Fast and well tested serialization library
Apache License 2.0

Tuples are not hashable. #243

Closed victora-openai closed 1 week ago

victora-openai commented 1 week ago

Description

I tried to use tuples as dict keys and got an exception: TypeError: unhashable type: 'list'.

What I Did

from dataclasses import dataclass
from mashumaro.mixins.msgpack import DataClassMessagePackMixin

@dataclass
class Foo(DataClassMessagePackMixin):
    bar: dict[tuple[int, int], int]

f = Foo(bar={(0, 1): 1})
s = f.to_msgpack()  # TypeError: unhashable type: 'list'

I believe the tuple is being incorrectly converted to a list while the dataclass is being converted to a dict.
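
To illustrate the suspected failure mode without depending on the library, here is a minimal sketch in plain Python. The helper `naive_to_builtin` is hypothetical and is not mashumaro's generated code; it only assumes that a default conversion step turns every tuple into a list, including tuples used as dict keys.

```python
def naive_to_builtin(value):
    """Hypothetical stand-in for a default conversion step (not mashumaro code).

    Tuples become lists, and dicts are rebuilt with converted keys and values.
    """
    if isinstance(value, tuple):
        return [naive_to_builtin(item) for item in value]
    if isinstance(value, dict):
        # A tuple key is converted to a list here, and a list cannot be hashed.
        return {naive_to_builtin(k): naive_to_builtin(v) for k, v in value.items()}
    return value


error_message = None
try:
    naive_to_builtin({(0, 1): 1})
except TypeError as exc:
    error_message = str(exc)

# The message mentions the unhashable list, matching the report above.
print(error_message)
```

Under that assumption, the error comes from rebuilding the dict with list keys, not from the tuple itself.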

Fatal1ty commented 1 week ago

> I believe the tuple is being incorrectly converted to a list

Why incorrectly? What do you think it should be converted into by default?

victora-openai commented 1 week ago

If the tuples were just left as tuples then they would remain hashable.

victora-openai commented 1 week ago

Ah, I saw your other comment. I think there is some confusion. Tuples are hashable:

d = {}
d[(1,2)] = 3

is legal Python.

Fatal1ty commented 1 week ago

Ok, it’s legal. And now how can this information help you encode the dataclass from your original example with MessagePack?

victora-openai commented 1 week ago

I may be naive about the problem, but I was thinking that a dictionary could be serialized as a list (or tuple) of key-value tuples. You could recursively export the serializable representations of the keys and values.

For example,

d = {(1, 2): 3, (4, 5): 6}  # Serializable as [((1, 2), 3), ((4, 5), 6)]

I imagine something similar is already happening today in the preprocessing step to serialize dictionaries. I tried to take a look at the code, but I couldn't work through the autogenerated code to find where it was happening.

My guess is that the TypeError: unhashable type: 'list' error happens because the conversion to dict tries to build {[1, 2]: 3, [4, 5]: 6}. Hopefully keeping the keys as tuples instead of converting them to lists would fix that.
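
A minimal sketch of the key-value-pairs idea, assuming the stdlib `json` module as a stand-in for MessagePack (neither format has a tuple type). The helpers `dict_to_pairs` and `pairs_to_dict` are hypothetical names for illustration and are not part of mashumaro.

```python
import json


def dict_to_pairs(mapping):
    """Encode a dict with tuple keys as a list of [key, value] pairs."""
    # Keys are written as lists because the wire format has no tuple type.
    return [[list(key), value] for key, value in mapping.items()]


def pairs_to_dict(pairs):
    """Rebuild the dict, turning each list key back into a hashable tuple."""
    return {tuple(key): value for key, value in pairs}


original = {(1, 2): 3, (4, 5): 6}
encoded = json.dumps(dict_to_pairs(original))
restored = pairs_to_dict(json.loads(encoded))

print(encoded)
print(restored == original)
```

The trade-off is that the encoded value is an array of pairs and no longer a map, so any other consumer of the data has to know about this convention.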

Fatal1ty commented 1 week ago

There is no such thing as a tuple in MessagePack. If you’re going to have tuple keys, which is not a common case, you can use the pass_through strategy.

Fatal1ty commented 1 week ago

> I may be naive about the problem, but I was thinking that a dictionary could be serialized as a list (or tuple) of key-value tuples. You could recursively export the serializable representations of the keys and values.

You can do whatever transformation you like on your side, but the default is to convert tuples to lists, since lists are supported by the majority of serialization formats. As always, the customization options can be found in the README.