Open laurikoobas opened 6 years ago
@laurikoobas How big a cluster are you using, and what is the node configuration? If you can share the polygon dataset it would be easier to debug this. Otherwise, one thing you can do is collect a heap dump during the execution and send it over.
Running it as an AWS Glue Job on 40 DPUs. It makes sense that the polygon dataset is the cause of this, but I can't share it. What in the polygons would make using the index a problem, though?
I'm not familiar with Glue, but I think the amount of memory you need for these polygons might be tipping you over the 5GB limit you have set for the YARN job... what index precision are you using?
Used just the 30 that's in the example. Do you have guidelines or documentation on what it means and which values make sense for which use cases?
You want to pick a precision that can eliminate a large fraction of polygons. E.g., if your polygons are US states and you pick, say, a precision of 10 or 15, each polygon falls into roughly O(1) grids at that precision.
If you pick precision 30 that still holds true, but we now spend more time computing the grids that overlap with the polygon, and more space storing those grids, since there will be a lot more of them. Each time you subdivide you get 4x more grids, so if you pick too fine a precision you will pay for it in storage and time.
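To get a feel for how fast the grid count grows, here is a rough sketch in Python. It is not the library's indexing code. It assumes the standard geohash layout (bits alternate between longitude and latitude, starting with longitude) and only counts the cells touched by a polygon's bounding box, so treat the numbers as an order-of-magnitude estimate. The function names and the example bounding box (roughly Colorado) are made up for illustration.

```python
import math

def cell_size_degrees(precision_bits):
    """Width and height (in degrees) of one cell at a given bit precision.

    Assumes standard geohash interleaving: longitude gets the extra bit
    when the precision is odd.
    """
    lon_bits = (precision_bits + 1) // 2
    lat_bits = precision_bits // 2
    return 360.0 / (1 << lon_bits), 180.0 / (1 << lat_bits)

def estimate_cells(min_lon, min_lat, max_lon, max_lat, precision_bits):
    """Count the grid cells that a bounding box touches at this precision."""
    width, height = cell_size_degrees(precision_bits)
    cols = (math.floor((max_lon + 180.0) / width)
            - math.floor((min_lon + 180.0) / width) + 1)
    rows = (math.floor((max_lat + 90.0) / height)
            - math.floor((min_lat + 90.0) / height) + 1)
    return cols * rows

# Hypothetical bounding box, roughly the size of a US state.
bbox = (-109.05, 37.0, -102.04, 41.0)
for bits in (10, 15, 20, 25, 30):
    print(bits, estimate_cells(*bbox, bits))
```

Adding two bits halves the cell size in both directions, which is where the 4x per subdivision comes from. For a state-sized box the count stays tiny at precision 10 but runs into the hundreds of thousands by precision 30.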
precision is nothing but the geohash precision https://gis.stackexchange.com/questions/115280/what-is-the-precision-of-a-geohash
Instead of characters, we are using the bit size, so to convert to geohash character length simply divide by 5. E.g., a precision of 35 = a 7-character geohash.
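The conversion is just integer division, since each base-32 geohash character encodes 5 bits. A tiny sketch (the helper name is made up for illustration):

```python
def bits_to_geohash_chars(precision_bits):
    """Convert a bit precision to (whole geohash characters, leftover bits).

    Each base-32 geohash character carries 5 bits.
    """
    return divmod(precision_bits, 5)

print(bits_to_geohash_chars(35))  # (7, 0)
print(bits_to_geohash_chars(30))  # (6, 0)
```

So the precision of 30 from the example corresponds to a 6-character geohash, which you can look up in the table in the linked answer to see the approximate cell size.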
My code was successfully running with 350 million points and 300 polygons. Now the number of polygons went up to 450 and it started crashing. I did some tests and it still crashes with 10 points (not 10 million, just 10) and those 450 polygons. It's still fine if I limit the number of polygons to 300 though.
Right now I just disabled the index use, but I'd like to get to the root of the issue. Could the problem be in a weird polygon? The largest polygon we have has 174 points.
During my tests, these were some of the error messages: