python / mypy

Optional static typing for Python
https://www.mypy-lang.org/
Other
18.48k stars 2.83k forks source link

`dataclass` is not `Hashable` #11463

Open KotlinIsland opened 3 years ago

KotlinIsland commented 3 years ago
from dataclasses import dataclass
from typing import Hashable

@dataclass
class SusDataclass:
    b = "AMONGUS"

h: Hashable = SusDataclass()  # SUS ALERT!

d = {
    SusDataclass(): 1  # SUS ALERT! fails at runtime
}
sobolevn commented 3 years ago

This happens because unsafe_hash is not taken into an account in plugins/dataclasses.py: https://github.com/python/mypy/blob/2db0511451ecf511f00dbfc7515cb6fc0d25119c/mypy/plugins/dataclasses.py#L123-L128

This should still be Hashable:

from dataclasses import dataclass

@dataclass(unsafe_hash=True)
class Some:
    x: int

print(hash(Some(1)))  # ok

I will send a fix today. Thanks for the report! 👍

sobolevn commented 3 years ago

By the way, here's a note from the docs:

If hash() is not explicitly defined, or if it is set to None, then dataclass() may add an implicit hash() method. Although not recommended, you can force dataclass() to create a hash() method with unsafe_hash=True. This might be the case if your class is logically immutable but can nonetheless be mutated. This is a specialized use case and should be considered carefully.

https://docs.python.org/3/library/dataclasses.html#dataclasses.dataclass

So, it still might be hashable even without unsafe_hash. I will research it further.

sobolevn commented 3 years ago

unsafe_hash: If False (the default), a hash() method is generated according to how eq and frozen are set.

# __hash__

#    +------------------- unsafe_hash= parameter
#    |       +----------- eq= parameter
#    |       |       +--- frozen= parameter
#    |       |       |
#    v       v       v    |        |        |
#                         |   no   |  yes   |  <--- class has explicitly defined __hash__
# +=======+=======+=======+========+========+
# | False | False | False |        |        | No __eq__, use the base class __hash__
# +-------+-------+-------+--------+--------+
# | False | False | True  |        |        | No __eq__, use the base class __hash__
# +-------+-------+-------+--------+--------+
# | False | True  | False | None   |        | <-- the default, not hashable
# +-------+-------+-------+--------+--------+
# | False | True  | True  | add    |        | Frozen, so hashable, allows override
# +-------+-------+-------+--------+--------+
# | True  | False | False | add    | raise  | Has no __eq__, but hashable
# +-------+-------+-------+--------+--------+
# | True  | False | True  | add    | raise  | Has no __eq__, but hashable
# +-------+-------+-------+--------+--------+
# | True  | True  | False | add    | raise  | Not frozen, but hashable
# +-------+-------+-------+--------+--------+
# | True  | True  | True  | add    | raise  | Frozen, so hashable
# +=======+=======+=======+========+========+
# For boxes that are blank, __hash__ is untouched and therefore
# inherited from the base class.  If the base is object, then
# id-based hashing is used.
#
# Note that a class may already have __hash__=None if it specified an
# __eq__ method in the class body (not one that was created by
# @dataclass).
#
# See _hash_action (below) for a coded version of this table.

https://github.com/python/cpython/blob/e9594f6747eaaaa848c26e2bf67d467aabfd62b3/Lib/dataclasses.py#L121

sobolevn commented 3 years ago

I will send a PR after https://github.com/python/mypy/pull/11483 is merged. I don't want to solve merge conflicts.

tmke8 commented 3 years ago

Aren't the docs quite clear about this?

By default, dataclass() will not implicitly add a __hash__() method unless it is safe to do so. [...] Here are the rules governing implicit creation of a __hash__() method. [...] If eq and frozen are both true, by default dataclass() will generate a __hash__() method for you. If eq is true and frozen is false, __hash__() will be set to None, marking it unhashable (which it is, since it is mutable). If eq is false, __hash__() will be left untouched meaning the __hash__() method of the superclass will be used (if the superclass is object, this means it will fall back to id-based hashing).