PainterQubits / Unitful.jl

Physical quantities with arbitrary units
Other
613 stars 112 forks source link

Hashing of `Quantity`s #379

Open sostock opened 4 years ago

sostock commented 4 years ago

A discussion about the implementation of hash was started in #378. While that issue has a narrower scope (hashing of Quantity{BigFloat} where the BigFloats have the same values (==) but are not identical (===)), hashing should generally be consistent with isequal (see the hash docstring), i.e., hash(1u"m") == hash(100u"cm").

Implementing such a hash method would potentially be very breaking (for example, right now one can have a Dict with both 1u"m" and 100u"cm" as keys), so it is something to consider for a 2.0 release.

Some observations/thoughts related to such a hash implementation:

cstjean commented 4 years ago

However, floating-point quantities might break this, since isequal does not necessarily convert its arguments to upreferred (it uses promote, which only falls back to upreferred for FreeUnits). I can think of two possible solutions to this:

That's a good point, but can we construct a realistic scenario where this consideration is a problem?

sostock commented 4 years ago

However, floating-point quantities might break this, since isequal does not necessarily convert its arguments to upreferred (it uses promote, which only falls back to upreferred for FreeUnits). I can think of two possible solutions to this:

That's a good point, but can we construct a realistic scenario where this consideration is a problem?

Maybe the following: I can define a dictionary d = Dict{typeof(1.0u"cm"), Int}(). Then I push a value which has a different unit: d[1u"inch"] = 1. The key would get converted to u"cm". Now I might not be able to retrieve the value using d[1u"inch"] because the key (after conversion) and 1u"inch" have different hashes.

Edit: In this specific example (1.0u"cm" and 1u"inch"), the key actually cannot be inserted into the Dict because convert(typeof(1.0u"cm"), 1u"inch") is not equal to 1u"inch" (setindex! checks for that). But there could be another example where the quantities are equal but their hashes differ.

sostock commented 4 years ago

I can think of two possible solutions to this:

  • Change isequal to always convert to upreferred.
  • Use TwicePrecision (is it sufficient?).

Changing isequal to always convert to upreferred seems not too bad to me. I actually think there is no good reason not to do this. The point of ContextUnits and FixedUnits is to customize promotion behavior, but isequal only uses the promoted value internally, so it should not matter.

The only problem that I see with it is that it is wasteful if both quantities already have the same unit. Maybe we don’t have to do the conversion in this case.

But I think I would also be in favor of using TwicePrecision for all floating-point conversion factors, independently of the hashing issue, if the performance isn’t too bad.

cstjean commented 4 years ago

Implementing such a hash method would potentially be very breaking (for example, right now one can have a Dict with both 1u"m" and 100u"cm" as keys), so it is something to consider for a 2.0 release.

I'd say that it's a bugfix if it's done...

The only problem that I see with it is that it is wasteful if both quantities already have the same unit. Maybe we don’t have to do the conversion in this case.

(edited because math) I agree that we wouldn't need the conversion for same-unit quantities, which makes it acceptable for performance IMO. I wonder if there are other issues.

cstjean commented 4 years ago

~Wait; the requirement is that x == y implies hash(x) == hash(y), but the converse need not be true. Thus, we don't need upreferred in isequal. We just need upreferred in hash, which seems entirely acceptable to me.~

EDIT: nevermind, with this proposal if x and y are of different units it's possible to have isequal(x, y) && hash(x) != hash(y).

giordano commented 4 years ago

the requirement is that x == y implies hash(x) == hash(y)

I thought that was isequal, not ==. They're different