uber / h3

Hexagonal hierarchical geospatial indexing system
https://h3geo.org
Apache License 2.0
4.85k stars 459 forks source link

Change float to double in vec2d #652

Closed nrabinowitz closed 2 years ago

nrabinowitz commented 2 years ago

Fixes #651

This changes float to double as suggested, and adds tests for cellsToLinkedMultiPolygon (which depends on geo coordinate precision) that confirm improved precision in the output.

nrabinowitz commented 2 years ago

This may address https://github.com/uber/h3-py/issues/277 - it's probably worth checking whether that case is now covered.

isaacbrodsky commented 2 years ago

Please add to the CHANGELOG describing this change. As I understand it, the impact is that cellToBoundary results for distortion vertexes were calculated at a lower precision (and cellToLinkedMultiPolygon and edge boundary calculations that use the same code path).

Please also add a test for one or more of the specific cells in the issue. While cellsToLinkedMultiPolygon exercises this today, it will not cover this in the future because the plan is to change that function to not do coordinate comparisons.

isaacbrodsky commented 2 years ago

It looks like the tests are failing since distortion vertexes will be calculated slightly differently due to the precision change.

nrabinowitz commented 2 years ago

I see a few things in the tests:

Does any of this raise concerns? I don't think we've promised that the geo boundary would remain stable within FPE, but it seems possible this might come across as a breaking change for some users.

nrabinowitz commented 2 years ago
  • I'm not sure how to fix the cell boundary tests without regenerating the test files with new random cells. That doesn't feel very kosher, but I suspect this will be tedious-to-impossible to do with edits.

Actually, testing the output here what I see are failures in b1->numVerts == b2.numVerts, expected cell boundary count, which I'm pretty sure is #45 again (this makes sense, we use the vector intersection to test whether we need another vertex). I'm guessing that the improved precision is breaking our fix for that issue.

coveralls commented 2 years ago

Coverage Status

Coverage increased (+0.0003%) to 99.038% when pulling 2539aa2893956314ae4fe44d1be038122ac15d98 on nrabinowitz:vec2d-double into db0a3e4bb47569ad81488498c2748fa1defec9b2 on uber:master.

nrabinowitz commented 2 years ago

Ok, I've fixed the tests:

Please also add a test for one or more of the specific cells in the issue. While cellsToLinkedMultiPolygon exercises this today, it will not cover this in the future because the plan is to change that function to not do coordinate comparisons.

The res 1 and res 3 cells in the issue should be covered by the regen'd test files - I don't think we need extra tests for this (since the issue was not that the vertices were wrong, but that we might want to use double in that spot).

Now the question is: Do we want to do this? On the one hand, I'd rather rely on a more precise calculation, and I feel weird about the idea that the canonical boundaries are defined by a bad implementation detail. On the other hand, I don't know the ramifications of changing the boundaries, which are sort of meant to be canonical and unchanging.

One argument in favor is that presumably the more correct boundaries would enclose some points that are indexed to the cell but would otherwise fall outside the boundary with a point-in-spherical-poly check.

nrabinowitz commented 2 years ago

So I tried for the better part of a day to construct a case where a point that indexed to the cell was not physically inside the cell boundary when the value was a float, but was inside the cell boundary when the value was double. My test case looks like this:

// For every vertex, take some points around the vertex. If they index
// to the cell, they should also be inside the physical boundary.
for (int i = 0; i < boundary.numVerts; i++) {
    for (int latOffset = -1; latOffset < 1; latOffset++) {
        for (int lngOffset = -1; lngOffset < 1; lngOffset++) {
            if (latOffset && lngOffset) {
                point.lat = boundary.verts[i].lat +
                            latOffset * FLT_EPSILON;
                point.lng = boundary.verts[i].lng +
                            lngOffset * FLT_EPSILON;

                H3Index cell2;
                t_assertSuccess(
                    H3_EXPORT(latLngToCell)(&point, 1, &cell2));
                if (cell2 == cell) {
                    // Check whether the point is physically inside
                    // the geo boundary
                    t_assert(
                        pointInsideGeoLoop(&geoloop, &bbox, &point),
                        "Boundary contains input point");
                }
            }
        }
    }
}

This has identified a number of cases with res 1 cells where the assertion fails, but in all cases it fails for both float and double 😭.

I've verified that the only vertexes this affects are distortion vertexes - all others are unchanged.

isaacbrodsky commented 2 years ago

@nrabinowitz Doesn't pointInsideGeoLoop assume cartesian geometry? Wouldn't the test need to be using spherical point in polygon?

nrabinowitz commented 2 years ago

@nrabinowitz Doesn't pointInsideGeoLoop assume cartesian geometry? Wouldn't the test need to be using spherical point in polygon?

nrabinowitz commented 2 years ago

OK, I found a single point (one of the float-based distortion vertexes) that fails my test with float but passes with double. Presumably there are more, but a longer search would be required. This PR should now be ready for re-review.