google / s2geometry

Computational geometry and spatial indexing on the sphere
http://s2geometry.io/
Apache License 2.0
2.29k stars 302 forks source link

S2RegionCoverer very high RAM consumption on a specific polygon #350

Closed TonyRoussel closed 7 months ago

TonyRoussel commented 7 months ago

Context

I'm trying to generate a coverage of a geojson polygon exclusively with level 13 cells, from s2geometry python binding. Here is a screenshot of said geojson: Screenshot_2024-02-05_15-04-12

Issue

When running the coverer GetCovering method, the code hang while filling the memory until it get kill by the system

How to reproduce

After compiling s2geometry (7773d51), the following code actually produce the issue (see attached file below for the geojson)

import json
import s2geometry as s2

def create_polygon_from_polygon_geometry(polygon_geometry):
    def verts2loop(vertices_lon_lat: list) -> s2.S2Loop:
        loop = s2.S2Loop()
        loop.Init(
            [
                s2.S2LatLng.FromDegrees(lon_lat[1], lon_lat[0]).ToPoint()
                for lon_lat in vertices_lon_lat[:-1]
            ]
        )
        #  we skip the last vertice which is assumed to be the same as the first
        loop.Normalize()
        return loop

    loops = []
    for line_coordinate in polygon_geometry["coordinates"]:
        loop = verts2loop(line_coordinate)
        loops.append(loop)

    if len(loops) == 1:
        polygon = s2.S2Polygon(loops[0])
    elif len(loops) > 1:
        polygon = s2.S2Polygon()
        polygon.InitNested(loops)
    else:
        raise RuntimeError("empty polygon")
    return polygon

def get_polygon_covering(s2polygon, s2_cell_level):
    coverer = s2.S2RegionCoverer()
    coverer.set_min_level(s2_cell_level)
    coverer.set_max_level(s2_cell_level)
    return coverer.GetCovering(s2polygon)

with open("investigate-6faces-coverer.json", "r") as f:
    geojon = json.load(f)
problematic_s2polygon = create_polygon_from_polygon_geometry(geojon["features"][0]["geometry"])
coverage = get_polygon_covering(problematic_s2polygon, 13)  # <-- this will eat your RAM

investigate-6faces-coverer.json

Additional information

With some advanced debugging (aka, printing info directly in the code), one thing I noticed and that look suspicious: In S2RegionCoverer::GetCoveringInternal, just after running S2RegionCoverer::GetInitialCandidates, pq_ is of size 6, which I think is the maximum possible (6 level 1 cell exist). While some other similar geojson that does not produce the issue have a initial pq_ size 2 at that point)

jmr commented 7 months ago

What's the underlying problem you're trying to solve? Usually getting a constant-level covering is only good for quick hacks, and a range of levels should be used.

How much memory do you expect it to take? What's the area of the polygon divided by the average level-13 cell area?

You might also try covering the rect bound of the polygon and seeing if that also runs out of memory.

TonyRoussel commented 7 months ago

What's the underlying problem you're trying to solve? Usually getting a constant-level covering is only good for quick hacks

this is a very small part of a more complex use case that I tried to isolate for bug report

How much memory do you expect it to take? What's the area of the polygon divided by the average level-13 cell area?

I expect close to nothing :) I actually do this for many other polygons, with comparable or bigger area, and it cost nothing.

The fact that the initial candidates are 6 cells (so I guess the entire planet) is really weird and I think this is the root problem

smcallis commented 7 months ago

My first guess would be that your polygon is inverted so it's effectively trying to cover most of the world with L13 cells.

On Mon, Feb 5, 2024 at 8:16 AM troussel @.***> wrote:

What's the underlying problem you're trying to solve? Usually getting a constant-level covering is only good for quick hacks

this is a very small part of a more complex use case that I tried to isolate for bug report

How much memory do you expect it to take? What's the area of the polygon divided by the average level-13 cell area?

I expect close to nothing :) I actually do this for many other polygons, with comparable or bigger area, and it cost nothing.

The fact that the initial candidates are 6 cells (so I guess the entire planet) is really weird and I think this is the root problem

— Reply to this email directly, view it on GitHub https://github.com/google/s2geometry/issues/350#issuecomment-1927231253, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAGEMKVV5L4XDXFXQEQGB43YSDZTNAVCNFSM6AAAAABC2IOXZOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSMRXGIZTCMRVGM . You are receiving this because you are subscribed to this thread.Message ID: @.***>

TonyRoussel commented 7 months ago

My first guess would be that your polygon is inverted so it's effectively trying to cover most of the world with L13 cells. On Mon, Feb 5, 2024 at 8:16 AM troussel @.> wrote: What's the underlying problem you're trying to solve? Usually getting a constant-level covering is only good for quick hacks this is a very small part of a more complex use case that I tried to isolate for bug report How much memory do you expect it to take? What's the area of the polygon divided by the average level-13 cell area? I expect close to nothing :) I actually do this for many other polygons, with comparable or bigger area, and it cost nothing. The fact that the initial candidates are 6 cells (so I guess the entire planet) is really weird and I think this is the root problem — Reply to this email directly, view it on GitHub <#350 (comment)>, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAGEMKVV5L4XDXFXQEQGB43YSDZTNAVCNFSM6AAAAABC2IOXZOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSMRXGIZTCMRVGM . You are receiving this because you are subscribed to this thread.Message ID: @.>

Thank you it look like this is it. I reversed the coordinates before initializing the loop, and indeed the issue disappear, and the general behavior of my system look sane

I was aware of this orientation issue, but since it works with every other examples I think I don't get the fundamental. By reversing the coordinates list before constructing the loop, I would expect that other cases where my code was working before, would suddenly be an issue because they would now cover the planet. But that's not the case so I feel clueless right now.

Anyway thanks !

jmr commented 7 months ago

Either looking at the bounding box or the area would have shown that the loop was inverted / something had gone wrong with polygon construction.

I'm a bit confused how the loop could be inverted, since loop.Normalize() is called.

TonyRoussel commented 7 months ago

Either looking at the bounding box or the area would have shown that the loop was inverted / something had gone wrong with polygon construction.

I'm a bit confused how the loop could be inverted, since loop.Normalize() is called.

yes indeed thanks. In the end I changed the intermediate step to construct polygon, so I can use the "smallest" loop based on their bounding box