Closed stevengj closed 9 years ago
Note that the Package Status already automatically filed issues with packages affected and some of us have already committed changes (though I haven't officially bumped the version yet thankfully). It'd be great to have a quick decision on this so we can all make the needed changes or revert if needed.
reference: https://github.com/karbarcca/Datetime.jl/issues/47
I wrote that three years ago. What changed:
==
is no longer vectorized. We decided ==
needs to be a legitimate equality relation.==
used to be very different from isequal
due to the difficulty of hashing numbers, which Stefan has now fixed.Arguably isequal
is still more fundamental in that it is consistent with ===
(===
implies isequal
). However, in practice this only affects NaN
. It doesn't seem worth bothering everybody with isequal
just because of NaN
. We are now in a position where in nearly all cases one can just pretend there are two equality functions, ==
and ===
.
@karbarcca, Package Status only filed issues if this caused a test failure. Overriding isequal
instead of ==
could easily cause more subtle bugs that don't show up in test suites.
Also I admit there is some confusion due to the apparent parallel between isequal
and isless
, but these functions have little to do with each other:
NaN
.==
and isequal
. You can't work around an incorrect ordering by defining isless
, since in those cases it should remain undefined. You could define an isless
that throws an error, but this is not as satisfying.@JeffBezanson, I suppose I'm fine with the change if this is really what we want. I was just surprised that a breaking change didn't seem to have been discussed anywhere. (The hashing discussions never seemed to have discussed whether ==
should call isequal
or vice versa.)
But for the next year or so (as long as people care about 0.2 compatibility), it seems like dozens of packages will need to override both ==
and isequal
. This should be prominently explained in the NEWS at the very least.
I think a post in julia-dev would be great to help explain what exactly packages need to implement for 0.2 & 0.3 to work smoothly (plus the isless, isequal distinction), for those of us who obviously didn't understand what was required here....:)
Note that the manual currently provides no (or misleading) guidance on this issue.
(It still says that isequal(x,y)
must imply hash(x) == hash(y)
, which was a sensible enough rule.)
What is the downside of continuing with the old behavior (==
calls isequal
, hence types should override isequal
)?
Just that you want to hide isequal
?
Isn't it easier to remember that you should provide isequal
and isless
methods if you want to use your own comparison predicates for collections? (As opposed to the rule: override isless
and ==
, but also provide isequal
if you care about Julia 0.2.)
The help for ==
and isequal
has been carefully updated and is entirely correct. Hashing still uses isequal, but isequal defaults to ==
.
Lots of other functions use ==
. e.g. findnext
uses ==
. So if you only override isequal
(Julia 0.2 behavior) then findnext
will break in 0.3. Also !=
and so on.
The point is that if you have a new type, in the common case you only want to define a single equality predicate and have it work for comparisons everywhere. Why not isequal
?
Also I suspect there is a lot of code that (previously, incorrectly) defines ==
just because it is so intuitive. Generally using only == will be easier for everybody in the future. Unfortunately package authors who did the right thing have some pain in the short term.
Here is a list of packages that override ==
but not isequal
:
CRC
Devectorize
FixedPoint
Graphs
Lazy
Polynomial
RandomMatrices
SIUnits
SymPy
Unfortunately, all authors are going to have some pain in the short term if they want to work well with both 0.2 and 0.3.
Is there some way we can issue a warning in 0.3 if code overrides isequal
but not ==
?
The thing is that it is acceptable to override isequal if you are defining a new floating-point type or a collection type, so it wouldn't be correct. I will, before 0.3 write up a new manual chapter that explains everything about how equality and hashing should work for user-defined types (and built-in ones too).
If you go here http://iainnz.github.io/packages.julialang.org/ and click "package statistics for nightly julia" you can see when everything broke. Mostly fixed up now - provided there were tests that caught it.
(@karbarcca I've actually been filing issues on package status change manually, auto-issues still a TODO)
@IainNZ, it looks like tests caught the problem for less than half of the likely affected packages listed above. What scares me about this change is how silent much of the breakage is likely to be.
I really think there's no alternative to just going through the 2 dozen or so affected packages manually. I'm also worried that people will "fix" the problem by breaking compatibility with 0.2 (often without breaking any tests, it is just that certain operations will be subtly wrong). The bug reports filed against the packages need to emphasize that people should implement both isequal
and ==
.
Sorry about how this ended up getting into master – I wanted to get this merged but I was on vacation, so Jeff went ahead and did it while I was gone. As a result, there hasn't been very good communication about it.
Okay... feel free to close this when the NEWS and manual are updated.
Lest people get the wrong idea, I didn't merge this behind your back. Before you left we specifically agreed that I would merge it.
Yes, absolutely – I wanted it merged, and I'm glad you did it.
FWIW, I think this change is pretty much addressed by packages now, check out the graph on http://iainnz.github.io/packages.julialang.org/ under "Package statistics for nightly Julia..." (all time high of passing packages, I guess filing 100+ issues was worth it :D)
@IainNZ, I think you're too optimistic about the test coverage. There are a dozen open issues above from packages that I went through manually and seemed to be overloading the wrong thing.
Is there anything to be done in Base julia for this, aside from documentation?
No.
Assigning @StefanKarpinski here to do some docs updates and clear out this 0.3 issue.
@StefanKarpinski bump. this seems really trivial for still being in the 0.3-blocking list
This is already superficially documented. What's missing is a chapter on equality and hashing, which is not quite trivial. I think it should just not block 0.3 but should happen soon after.
How hard is it to change the redirect on docs.julialang.org to always point to latest? Otherwise the stable docs will be up for a while and what most people see by default.
This has dragged on long enough; I felt like it was sufficient to clarify this relationship in each of the manual sections that discusses isequal
and hash
.
backported in 1f68fa810db2f1a4eda17c2ccde8489b1a46863c
It used to be that
==
calledisequal
, so that packages wanting to override equality for their own types would overrideisequal
. However, commit 737ad6eba2f94a8542e9013fa84b6f4e3f5d827a silently changed this to makeisequal
call==
.This potentially results in subtle breakage (e.g. stevengj/PyPlot.jl#52) of a large number of packages that override
isequal
:Unfortunately, the fix is not very clean either: in order to remain compatible with 0.2, packages will now have to override both
isequal
and==
.I think it's clear that we need to discuss this issue a bit more thoroughly before we release this change in 0.3, hence I'm marking it as a milestone issue.
(Apologies if this was already discussed, but I'm not seeing it anywhere right now. e.g. @JeffBezanson wrote in #90: "I've come to the conclusion that
isequal
is the more fundamental predicate [...]==
then defaults toisequal
". What has changed?)cc: @StefanKarpinski, @jakebolewski