locationtech / jts

The JTS Topology Suite is a Java library for creating and manipulating vector geometry.
Other
1.99k stars 443 forks source link

Hash code collisions for a family of points #1065

Open dbotija-RatedPower opened 4 months ago

dbotija-RatedPower commented 4 months ago

I found the following hash collisions between different points:

@Test
void hashCollision() {
  double x = 100000.4;
  double y = 4126380.6868;
  GeometryFactory factory = new GeometryFactory(new PrecisionModel());

  Point point1 = factory.createPoint(new Coordinate(x, y));
  Point point2 = factory.createPoint(new Coordinate(x, y + 1));

  int hashPoint1 = point1.hashCode();
  int hashPoint2 = point2.hashCode();

  Assertions.assertThat(hashPoint1).isNotEqualTo(hashPoint2);
}

This test fails because both hashes are -675137989. The same happens using points with y + 2 and y + 3, or y + 4 and y + 5 and so on. Also it doesn't seem to matter which x value is used, so it still fails when changing it to any other value.

It is not technically wrong as different objects can have the same hash code, but it may cause bad performance in very specific cases. I tried to understand a bit why it may be happening and I saw that the difference between the hashes of the y coordinate and y + 1 is exactly 2^31 so that must be causing that at the end the final hash is the same when a modulo is applied.

mprins commented 4 months ago

please elaborate as GeomPoint is not a class that exists in JTS

dbotija-RatedPower commented 4 months ago

Sorry I inadvertently used my wrapper class, I'll update the test. It's the same issue as I use JTS's hash code.