Open sostock opened 4 years ago
However, floating-point quantities might break this, since isequal does not necessarily convert its arguments to upreferred (it uses promote, which only falls back to upreferred for FreeUnits). I can think of two possible solutions to this:
That's a good point, but can we construct a realistic scenario where this consideration is a problem?
However, floating-point quantities might break this, since
isequal
does not necessarily convert its arguments toupreferred
(it usespromote
, which only falls back toupreferred
forFreeUnits
). I can think of two possible solutions to this:That's a good point, but can we construct a realistic scenario where this consideration is a problem?
Maybe the following: I can define a dictionary d = Dict{typeof(1.0u"cm"), Int}()
. Then I push a value which has a different unit: d[1u"inch"] = 1
. The key would get converted to u"cm"
. Now I might not be able to retrieve the value using d[1u"inch"]
because the key (after conversion) and 1u"inch"
have different hashes.
Edit: In this specific example (1.0u"cm"
and 1u"inch"
), the key actually cannot be inserted into the Dict
because convert(typeof(1.0u"cm"), 1u"inch")
is not equal to 1u"inch"
(setindex!
checks for that). But there could be another example where the quantities are equal but their hashes differ.
I can think of two possible solutions to this:
- Change
isequal
to always convert toupreferred
.- Use
TwicePrecision
(is it sufficient?).
Changing isequal
to always convert to upreferred
seems not too bad to me. I actually think there is no good reason not to do this. The point of ContextUnits
and FixedUnits
is to customize promotion behavior, but isequal
only uses the promoted value internally, so it should not matter.
The only problem that I see with it is that it is wasteful if both quantities already have the same unit. Maybe we don’t have to do the conversion in this case.
But I think I would also be in favor of using TwicePrecision
for all floating-point conversion factors, independently of the hashing issue, if the performance isn’t too bad.
Implementing such a hash method would potentially be very breaking (for example, right now one can have a Dict with both 1u"m" and 100u"cm" as keys), so it is something to consider for a 2.0 release.
I'd say that it's a bugfix if it's done...
The only problem that I see with it is that it is wasteful if both quantities already have the same unit. Maybe we don’t have to do the conversion in this case.
(edited because math) I agree that we wouldn't need the conversion for same-unit quantities, which makes it acceptable for performance IMO. I wonder if there are other issues.
~Wait; the requirement is that x == y
implies hash(x) == hash(y)
, but the converse need not be true. Thus, we don't need upreferred
in isequal
. We just need upreferred
in hash
, which seems entirely acceptable to me.~
EDIT: nevermind, with this proposal if x and y are of different units it's possible to have isequal(x, y) && hash(x) != hash(y)
.
the requirement is that
x == y
implieshash(x) == hash(y)
I thought that was isequal
, not ==
. They're different
A discussion about the implementation of
hash
was started in #378. While that issue has a narrower scope (hashing ofQuantity{BigFloat}
where theBigFloat
s have the same values (==
) but are not identical (===
)), hashing should generally be consistent withisequal
(see thehash
docstring), i.e.,hash(1u"m") == hash(100u"cm")
.Implementing such a
hash
method would potentially be very breaking (for example, right now one can have aDict
with both1u"m"
and100u"cm"
as keys), so it is something to consider for a 2.0 release.Some observations/thoughts related to such a
hash
implementation:hash
function, a quantity would have to be converted to a “standard unit” (probablyupreferred
). However, floating-point quantities might break this, sinceisequal
does not necessarily convert its arguments toupreferred
(it usespromote
, which only falls back toupreferred
forFreeUnits
). I can think of two possible solutions to this:isequal
to always convert toupreferred
.TwicePrecision
(is it sufficient?).Dates
stdlib already implements hashing like this:hash(Second(60)) == hash(Minute(1))
. However, they don’t have to deal with floating-point numbers, which makes it easier. If comparison betweenQuantity
s andPeriod
s is added to Unitful (see #331), hashing ofQuantity
s should also be consistent with hashing ofPeriod
s.