my2iu / Jinq

LINQ-style queries for Java 8
Other
659 stars 71 forks source link

Fixing unchecked null condition in equals method #69

Closed alainlompo closed 6 years ago

alainlompo commented 6 years ago

Adding a hashCode method

hashCode() implementation based on This post

my2iu commented 6 years ago

Thanks for the pull request. It's not really necessary because instance of this object will never be compared to null, and I purposely didn't define a hashCode() because it sometimes makes things harder to maintain, so I'll have to think about it.

Maaartinus commented 6 years ago

@my2iu I'm afraid, I must disagree. Even as the class maintaner and possibly the only direct user, you can't know what comes in a few years. The missing hashCode means that the class is broken and finding the bug may take a lot of time. Even a hashCode always returning zero is much better (it may cost some performance, but it's correct). Forgetting to use a field in the calculation is no problem, so you don't need to worry about maintenance.

my2iu commented 6 years ago

That's sort of why I didn't define a hashCode() on any of the classes of the entire Expression hierarchy. Expressions are currently only safe to use in HashMap and HashSet when using identity for equality. I can quickly check that by looking at any of the classes in the Expression subhierarchy to see if they have a hashCode() defined or not. Defining a hashCode() on a single class of the entire expression hierarchy just causes confusion. It might make sense to define hashCode() on all the classes of the Expression hierarchy, but doing it on a single class makes no sense.

In my opinion, having incorrect or incomplete hashCode() implementations are worse than not having one. So returning 0 in a hashCode() is unexpected and can cause insidious performance bugs. If there is a hashCode(), I expect it to be implemented correctly. I don't feel like taking on the maintenance burden of maintaining the correct hash every time I work on the Expression hierarchy, and I don't need it, so I'd rather not have it there where it could potentially confuse others.

I suppose there could be some disagreement on whether having equals() defined but not hashCode() is more broken than having unmaintained hashCode()s. But for me, I would consider the latter to be much worse.

So, I'll be closing this pull request.

alainlompo commented 6 years ago

Hi all, Thank you for the feedback. As an experienced former .Net developper, I took an interest in this project and the idea of a porting of ling to java is really great. I agree with @Maaartinus that hashCode() should always be defined along with equals(), but you have clearly explained @my2iu that maintenance could be an issue here. The issue I was trying to fix is described as squid:2259 and classified as a major bug. So in my opinion it would make sense to have at this point a version where the issue is completely fixed, maybe on a branch from that would derive from the current master, fix the issue there (on all Expression classes), tag it and redirect users who are interested in such an implementation to it (via a comment in your readme.md) for example.

lukaseder commented 6 years ago

Just lurking, don't have much of a stake here, but the equals() / hashCode() contract must be implemented correctly. The wording on Object.hashCode() leaves no doubt:

If two objects are equal according to the {@code equals(Object)} method, then calling the {@code hashCode} method on each of the two objects must produce the same integer result.

Do note, if you're interested in putting objects in maps / sets based on identity comparisons in some internal code, there's the IdentityHashMap class for this purpose.

But I agree that this should be fixed for all types in the hierarchy, not just for one.

alainlompo commented 6 years ago

@lukaseder I agree, and I would have gladly fixed it for all types in the hierarchy. Interested users could then build from there if maintenance is two costly on the main (master) branch

Maaartinus commented 6 years ago

So returning 0 in a hashCode() is unexpected and can cause insidious performance bugs.

I agree that these performance bugs may be about as bad as bugs caused by broken hashCode. There's one more solution: Define hashCode throwing an exception in the top level Expression class. That's also a contract violation, but it's pretty guaranteed to lead to no hard to find bugs.