locationtech / jts

The JTS Topology Suite is a Java library for creating and manipulating vector geometry.
Other
1.94k stars 441 forks source link

Hash code collisions for a family of points #1065

Open dbotija-RatedPower opened 1 month ago

dbotija-RatedPower commented 1 month ago

I found the following hash collisions between different points:

@Test
void hashCollision() {
  double x = 100000.4;
  double y = 4126380.6868;
  GeometryFactory factory = new GeometryFactory(new PrecisionModel());

  Point point1 = factory.createPoint(new Coordinate(x, y));
  Point point2 = factory.createPoint(new Coordinate(x, y + 1));

  int hashPoint1 = point1.hashCode();
  int hashPoint2 = point2.hashCode();

  Assertions.assertThat(hashPoint1).isNotEqualTo(hashPoint2);
}

This test fails because both hashes are -675137989. The same happens using points with y + 2 and y + 3, or y + 4 and y + 5 and so on. Also it doesn't seem to matter which x value is used, so it still fails when changing it to any other value.

It is not technically wrong as different objects can have the same hash code, but it may cause bad performance in very specific cases. I tried to understand a bit why it may be happening and I saw that the difference between the hashes of the y coordinate and y + 1 is exactly 2^31 so that must be causing that at the end the final hash is the same when a modulo is applied.

mprins commented 1 month ago

please elaborate as GeomPoint is not a class that exists in JTS

dbotija-RatedPower commented 1 month ago

Sorry I inadvertently used my wrapper class, I'll update the test. It's the same issue as I use JTS's hash code.