Closed Icxolu closed 4 months ago
Actually, one thought - should we make this only available with #[pyclass(frozen)]
? Otherwise I can foresee accidents where users mutate data and corrupt their hash maps.
(If they really want a __hash__
on a mutable pyclass, they then just add it manually the old way.)
Actually, one thought - should we make this only available with
#[pyclass(frozen)]
? Otherwise I can foresee accidents where users mutate data and corrupt their hash maps.
Given that it can be implemented trivially if needed, I think that makes sense to take the safe route here đź‘Ť.
On that note, should we also require #[pyclass(eq)]
(once we have that) or is that not needed?
On that note, should we also require
#[pyclass(eq)]
(once we have that) or is that not needed?
Great question. So, quoting the Python docs:
If a class does not define an eq() method it should not define a hash() operation either
I guess that wording is strong enough that we should indeed be encouraging users to define eq
if they want hash
? So perhaps yes, we should require eq
, especially given that this is designed to be a convenience for correct implementations and the #[pymethods]
escape hatch for weird cases will remain.
I'm happy to either block this PR on landing eq
first or proceed with this PR and add a note in #4207 that we should require this later.
I think I might just implement the eq
option, so we can merge this in the proper order.
Rebased. hash
now requires eq
in addition to frozen
. I've also changed the generated slot the same way as in eq
, so that manual __hash__
with the hash
option errors.
This option is only available for
frozen
classes to prevent accidental hash changes from mutating the object. If you need an__hash__
implementation for a mutable class, use the manual method from above.
I don't get it, how accidental ??? If I mutate an object I expect the hash change. This seriously limit the use of this automatic derive.
Mutating an object hash in Python is usually a mistake - e.g. because it breaks dictionaries. From the Python docs:
If a class does not define an eq() method it should not define a hash() operation either; if it defines eq() but not hash(), its instances will not be usable as items in hashable collections. If a class defines mutable objects and implements an eq() method, it should not implement hash(), since the implementation of hashable collections requires that a key’s hash value is immutable (if the object’s hash value changes, it will be in the wrong hash bucket).
To add to that, in Rust you can't easily mutate an object while it's in a HashMap (only via interior mutability or unsafe) since it can control that you never get a &mut reference to it.
Python can't offer such a guarantee, and therefore only lets you use immutable objects as keys or set items by default. Of course you can implement your own hash and wreak havoc.
Yes of course like in Rust key inside a hashmap should not be modified in a way that change the hash but that a solf error from the user, it's NOT enforced at compile time even in Rust. Well I don't know exactly how python works, but this mean I can't use this derive if I have a rust type that CAN be mutable. My use case is pretty simple I want my Rust struct to be hashable, and I have serialization and deserialisation feature that somehow need to construct dummy value and then set state so they need to be mutable... I blame python pickle for asking user non sense. Anyway, the only requirement is the user should not mutate the key of a hashmap (like in rust) and that for me obvious, but require this at compile time is very annoying making this derive not very useful.
If pickle is the reason why your class needs to be mutable, you should be able to use the __getnewargs__
protocol which will not set state after the object is constructed.
it's impossible my object can't be construct without specific parameter that I don't have access anymore after and this doesn't change my main argument, any non trivial struct that could be mutable will not be able to use this derive. Please ignore my specific usecase and just accept this struct also have mutable feature that user will not use when the struct is in a dict.
For more
dataclass
like usage and similar to #4202, this adds ahash
pyclass option to implement the__hash__
slot using the type'sHash
implementation.