uber / h3

Hexagonal hierarchical geospatial indexing system
https://h3geo.org
Apache License 2.0

Scala: H3 Indexes - multiPolygonToH3 function #356

Closed jonahnio closed 4 years ago

jonahnio commented 4 years ago

I’ve been doing this tutorial: https://databricks.com/notebooks/geomesa-h3-notebook.html

The polygon function (polygonToH3) works perfectly, but the multipolygon function is giving strange results. I was expecting millions of H3 indexes, but it generated only a few hundred for an area of at least 10,000 km². Has anyone faced a similar issue?

    val multiPolygonToH3 = udf { (geometry: Geometry, resolution: Int) =>
      var points: List[GeoCoord] = List()
      var holes: List[java.util.List[GeoCoord]] = List()
      if (geometry.getGeometryType == "MultiPolygon") {
        val numGeometries = geometry.getNumGeometries()
        if (numGeometries > 0) {
          // The first member geometry is used as the outer polygon.
          points = List(
            geometry
              .getGeometryN(0)
              .getCoordinates()
              .toList
              .map(coord => new GeoCoord(coord.y, coord.x)): _*
          )
        }
        if (numGeometries > 1) {
          // Every remaining member geometry is passed to polyfill as a hole.
          holes = (1 to (numGeometries - 1)).toList.map(n => {
            List(
              geometry
                .getGeometryN(n)
                .getCoordinates()
                .toList
                .map(coord => new GeoCoord(coord.y, coord.x)): _*).asJava
          })
        }
      }
      H3.instance.polyfill(points, holes.asJava, resolution).toList
    }

dfellis commented 4 years ago

I am unfamiliar with that notebook, but my immediate question is: what are you setting resolution to?

jonahnio commented 4 years ago

Hi David,

Thanks for the reply. The resolution is 12, so I'm expecting millions of indexes.

Regards

Jonah Nio
Senior Data Insights Analyst, Transport Certification Australia

dfellis commented 4 years ago

Res 12 is very fine grained. I wonder if what you're actually seeing is some sort of silent failure inside of Spark? As you said, there should be millions of records, and the current polyfill algorithm is very conservative in its initial estimation for memory allocation to make sure there isn't an overflow -- but if it literally couldn't allocate enough memory, I wonder what all of those layers on top of H3 will do with that?

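Manually editing this tutorial (https://observablehq.com/@nrabinowitz/h3-radius-lookup?collection=@nrabinowitz/h3-tutorial) shows you get over 130k hexagons at res 12 in just a ~4 km radius:

[Screenshot from 2020-06-06 19-01-44] https://user-images.githubusercontent.com/765551/83958590-5ff0db80-a828-11ea-86e9-ca41e0c528f2.png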

There are unfortunately so many layers in the tutorial above H3 that I can't really say with any certainty where the issue is arising. I suspect an OOM happened and something "ate" that error, but I don't know for sure. Have you tried taking your GeoJSON and feeding it more directly to H3 without the layers in between? That might be more illuminating.
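As a rough sanity check on the counts being discussed, the expected number of cells can be estimated from area alone. The sketch below is not part of H3 or of the notebook: the class and method names are made up for illustration, it assumes every cell has the same area, and it takes Earth's surface as about 510,065,622 km². It relies on the total cell count at a resolution being 2 + 120 * 7^res.

```java
// Back-of-the-envelope estimate of how many H3 cells cover a given area.
// Assumptions of this sketch: every cell has the same area, and Earth's
// surface is about 510,065,622 km^2. Real cell areas vary by location.
class CellEstimate {
    static final double EARTH_AREA_KM2 = 510_065_622.0;

    // Total number of H3 cells at a resolution: 2 + 120 * 7^res.
    static long numCells(int res) {
        long pow = 1;
        for (int i = 0; i < res; i++) {
            pow *= 7;
        }
        return 2 + 120 * pow;
    }

    // Average cell area under the equal-area assumption.
    static double averageCellAreaKm2(int res) {
        return EARTH_AREA_KM2 / numCells(res);
    }

    // Estimated number of cells needed to cover areaKm2 at the given resolution.
    static long estimateCells(double areaKm2, int res) {
        return Math.round(areaKm2 / averageCellAreaKm2(res));
    }

    public static void main(String[] args) {
        // A disc of ~4 km radius, as in the tutorial experiment.
        System.out.println(estimateCells(Math.PI * 4 * 4, 12));
        // The ~10,000 km^2 area from the original report.
        System.out.println(estimateCells(10_000, 12));
    }
}
```

With these assumptions a 4 km disc at res 12 comes out in the same range as the 130k+ figure, and 10,000 km² comes out in the tens of millions, so a result of a few hundred indexes cannot be a complete fill.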

jonahnio commented 4 years ago

Hi David,

Thanks so much for your input and for the useful link. You're right, it could be due to insufficient memory allocation for the job; it's such a heavy computation. However, the polygon works, as shown below:

[image: image001.png, not attached to the issue]

It converted a polygon of about 1,800 km² in area to almost 6M H3 indexes, as shown above.

Anyway, I'll try to reduce the number of points in the multipolygons and see if it works.

Regards

dfellis commented 4 years ago

Unfortunately your image did not make it to the issue. I think you'll need to visit the GitHub issue page to attach it.

Is the overall "shape" of the two polygons the same?

By that, I mean: are the most extreme lat and lng coordinates the same in both?

A new, faster, less memory intensive algorithm is coming soon, but right now, the polyfill algorithm is roughly this:

  1. Scan through all of the polygon points and define a bounding box around it.
  2. From the center of the bounding box, define a circle that barely encompasses the bounding box.
  3. Compute a k-ring that surrounds that circle completely.
  4. Perform a point-in-poly check on each hexagon, zeroing out any IDs that fail the check.

So even if the area of the two polygons is similar, they could be very different in terms of their bounding box area, causing a significantly higher memory consumption in one polygon versus the other.
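The sizing part of the steps above (steps 1 to 3) can be sketched without the H3 library. This is an illustration, not the library's code: the class and method names are invented, the hexagon radius is supplied by the caller, the factor of 1.5 hexagon radii per ring is an assumption of the sketch, and it ignores polygons that cross the antimeridian. The only firm fact it uses is that a k-ring contains 3k(k+1)+1 cells.

```java
// Sketch of the sizing steps: bounding box -> enclosing circle -> k-ring size.
// The hexagon radius is passed in by the caller; the 1.5 factor is an
// assumption of this sketch, not a value confirmed by the thread.
class PolyfillSizing {
    static final double EARTH_RADIUS_KM = 6371.0088;

    // points are {lat, lng} pairs in degrees; returns {minLat, minLng, maxLat, maxLng}.
    static double[] boundingBox(double[][] points) {
        double minLat = Double.POSITIVE_INFINITY, minLng = Double.POSITIVE_INFINITY;
        double maxLat = Double.NEGATIVE_INFINITY, maxLng = Double.NEGATIVE_INFINITY;
        for (double[] p : points) {
            minLat = Math.min(minLat, p[0]);
            maxLat = Math.max(maxLat, p[0]);
            minLng = Math.min(minLng, p[1]);
            maxLng = Math.max(maxLng, p[1]);
        }
        return new double[] {minLat, minLng, maxLat, maxLng};
    }

    // Great-circle distance between two points, using the haversine formula.
    static double haversineKm(double lat1, double lng1, double lat2, double lng2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLng = Math.toRadians(lng2 - lng1);
        double a = Math.pow(Math.sin(dLat / 2), 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.pow(Math.sin(dLng / 2), 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    // Radius of a circle centred on the box centre that reaches the farthest corner.
    static double enclosingRadiusKm(double[] bbox) {
        double cLat = (bbox[0] + bbox[2]) / 2;
        double cLng = (bbox[1] + bbox[3]) / 2;
        double r = 0;
        for (double lat : new double[] {bbox[0], bbox[2]}) {
            for (double lng : new double[] {bbox[1], bbox[3]}) {
                r = Math.max(r, haversineKm(cLat, cLng, lat, lng));
            }
        }
        return r;
    }

    // Number of cells in a k-ring (centre cell plus k surrounding rings).
    static long kRingSize(long k) {
        return 3 * k * (k + 1) + 1;
    }

    // Candidate cells that would be allocated before the point-in-poly pass.
    static long candidateCells(double[][] points, double hexRadiusKm) {
        double radiusKm = enclosingRadiusKm(boundingBox(points));
        long k = (long) Math.ceil(radiusKm / (1.5 * hexRadiusKm));
        return kRingSize(k);
    }

    public static void main(String[] args) {
        double hexRadiusKm = 0.01; // hypothetical fine-resolution cell radius
        double[][] square = {{0, 0}, {0, 0.1}, {0.1, 0.1}, {0.1, 0}};
        double[][] strip = {{0, 0}, {0, 1}, {0.01, 1}, {0.01, 0}};
        System.out.println("square: " + candidateCells(square, hexRadiusKm));
        System.out.println("strip:  " + candidateCells(strip, hexRadiusKm));
    }
}
```

The square and the strip in `main` cover about the same area, yet the strip's bounding box is far wider, so it needs many more candidate cells. That is the effect described above.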

A huge number of polygon points should have an impact on the speed of the point-in-poly algorithm, but it would need to be an enormous number of points to cause an OOM there and I have no idea how it would return anything in that scenario.

jonahnio commented 4 years ago

Hi David

Thanks so much.

This is the result that I ran for the polygon:

[image: muiltiploy]

Thanks

isaacbrodsky commented 4 years ago

The geometry works as a Polygon but not when treated as a MultiPolygon with holes? This suggests to me the library is failing to parse one or more of the holes correctly and excluding too much based on that. Is it possible to try adding some of the holes (rather than simplifying the hole geometry) and seeing if the results are as expected?

I would expect if the issue were out-of-memory that the code would either crash or return no results, but it is possible it could result in partial results from the multipolygon.

jonahnio commented 4 years ago

Hi Isaac

Below is an example of the multipolygon I'm trying to process:

val sMPOLY = """MULTIPOLYGON (((150.8638660205 -34.626873658, 150.8653900415 -34.6262575525, 150.8649682475 -34.6246043185, 150.8638660205 -34.626873658)), ((150.803508708 -34.6432379995, 150.8137233425 -34.641522864500004, 150.8152121775 -34.636833299500005, 150.8196366345 -34.640336589, 150.8267324535 -34.6394182675, 150.8275201235 -34.637432126, 150.8320248075 -34.6391322945, 150.8351954895 -34.633113967, 150.839224834 -34.6333911895, 150.8440909045 -34.6307950475, 150.8461778285 -34.631585460000004, 150.8434947865 -34.6359436715, 150.847689549 -34.634923452, 150.8490133675 -34.6274086225, 150.845666938 -34.6234627205, 150.843492341 -34.62391462, 150.845561672 -34.621505087500005, 150.837695411 -34.6146076955, 150.83766205 -34.611693483, 150.842142498 -34.6151668025, 150.848533283 -34.614319669000004, 150.8487432675 -34.6173327455, 150.8523885955 -34.618387412000004, 150.8607829385 -34.6278808535, 150.861700658 -34.626385332, 150.8568145855 -34.621334647000005, 150.855835181 -34.615483467000004, 150.861472752 -34.610584741000004, 150.865640103 -34.61026273, 150.8660100305 -34.604644317, 150.8711823725 -34.603776963, 150.875469626 -34.6079526905, 150.886022798 -34.6062351505, 150.8816832765 -34.604223997, 150.887514918 -34.602602805000004, 150.8877207415 -34.598358646, 150.89477057050001 -34.601099051, 150.903091147 -34.598892852, 150.899813155 -34.597206577, 150.904362004 -34.597251606, 150.901488067 -34.5944388475, 150.9035175765 -34.5934922765, 150.901166502 -34.5926089015, 150.8987764455 -34.5923420575, 150.8955466335 -34.594716329, 150.888306238 -34.592621204000004, 150.885227244 -34.5940020995, 150.8860625465 -34.5912324645, 150.883641027 -34.5939792705, 150.875123022 -34.589012372, 150.8737789825 -34.5826409905, 150.877206369 -34.580253343500004, 150.8726233195 -34.579178512, 150.871602378 -34.5770902875, 150.874421054 -34.5758541545, 150.8690536925 -34.5728919715, 150.868152179 -34.568657710000004, 150.869578745 -34.565530211, 150.8754424335 
-34.561326937000004, 150.8699127565 -34.559582017000004, 150.8691080775 -34.5534751485, 150.871120067 -34.548659413500005, 150.87490394900001 -34.545632462, 150.880488522 -34.547208588000004, 150.8824971535 -34.545863582500004, 150.87600847550002 -34.5428008335, 150.870841207 -34.5423686365, 150.8673700205 -34.536059859, 150.85879211900001 -34.5299792235, 150.8039768935 -34.5457643115, 150.806502657 -34.5474856625, 150.806678368 -34.550413676, 150.80184679 -34.5528916215, 150.79582359650001 -34.549953396, 150.7912304 -34.544953179000004, 150.780953168 -34.547165779000004, 150.7768819945 -34.551727583, 150.7707540095 -34.5535122595, 150.7665573855 -34.552358359, 150.7643327105 -34.5536820155, 150.762260496 -34.551655063, 150.756766589 -34.551597676, 150.7546899945 -34.550134548, 150.752535071 -34.550615659, 150.75270191250002 -34.5522618815, 150.7466828435 -34.5537252685, 150.7459170735 -34.552067761, 150.742199001 -34.552782009, 150.742313976 -34.551214485500005, 150.7385760475 -34.549035944, 150.728732034 -34.549646222, 150.720020214 -34.5459686255, 150.71283478750001 -34.548981813000005, 150.709809996 -34.546782792, 150.6971200045 -34.54739788, 150.6945958105 -34.546063882, 150.693245858 -34.5477970175, 150.6894336885 -34.545707868, 150.687019907 -34.5488725335, 150.6835382085 -34.5469645915, 150.6851429675 -34.5447737845, 150.6816956885 -34.5429728835, 150.678496573 -34.546224536000004, 150.681269478 -34.5513665555, 150.6806431015 -34.5533216355, 150.671068823 -34.552668123000004, 150.666388136 -34.548717226, 150.655267608 -34.5466067645, 150.647474566 -34.555580763, 150.6418773275 -34.557627066500004, 150.639803361 -34.560452368, 150.6379171505 -34.571058307, 150.6383684365 -34.574548332, 150.6407721075 -34.5774781955, 150.643813981 -34.579713865, 150.655623629 -34.582312449, 150.6618944385 -34.580272454, 150.664162001 -34.581162674, 150.6632075625 -34.583933234, 150.6677514475 -34.591268206500004, 150.6819524295 -34.601518982500004, 150.6885641855 
-34.601615719, 150.693790949 -34.603927072000005, 150.6985856255 -34.600760741500004, 150.700451542 -34.604969510000004, 150.7581312145 -34.613022671, 150.7599905245 -34.605954302, 150.7666356415 -34.606761771500004, 150.7683675665 -34.598566475, 150.7743053135 -34.600604139, 150.790171462 -34.602265994, 150.788127389 -34.6135016175, 150.7958736745 -34.6146092125, 150.793002986 -34.6287343325, 150.799275511 -34.629579098, 150.796952359 -34.6413287995, 150.803508708 -34.6432379995)))"""

[image: multipolygon98]

The result:

[image: result98]

Thanks

dfellis commented 4 years ago

Seeing the polygon immediately gives me an idea of what's going on.

That multipolygon isn't 1 polygon with n holes, but 2 polygons with zero holes.

Do the 77 hexagons correspond to only the small area over here?

[image: just_here]

No matter what, this particular multipolygon needs to be decomposed into separate polygons. The polyfill implementation in H3 works only with 1 filled polygon with 0 or more holes to subtract from it, and the notebook you've referenced has done you a disservice to make it seem like it handles a "Multipolygon" in the same terminology as GeoJSON.
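A minimal sketch of that decomposition, assuming nothing about the real APIs: `Poly` is a stand-in type for one outer ring plus its holes, and `fillOne` is a placeholder for whatever single-polygon fill is available (for example a wrapper around the library's polyfill). Each member polygon is filled on its own and the cell IDs are unioned. With JTS, the member polygons would come from `getGeometryN(n)`, and each polygon's own holes from `getInteriorRingN(i)`, rather than treating later members as holes of the first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.BiFunction;

// Illustrative only: Poly and fillOne are hypothetical stand-ins.
class MultiPolygonFill {
    // One polygon: an outer ring plus zero or more holes, as {lat, lng} pairs.
    static final class Poly {
        final List<double[]> shell;
        final List<List<double[]>> holes;

        Poly(List<double[]> shell, List<List<double[]>> holes) {
            this.shell = shell;
            this.holes = holes;
        }
    }

    // Fills each member polygon separately and unions the resulting cell IDs.
    static Set<Long> fillAll(
            List<Poly> multiPolygon,
            BiFunction<List<double[]>, List<List<double[]>>, List<Long>> fillOne) {
        Set<Long> cells = new LinkedHashSet<>();
        for (Poly p : multiPolygon) {
            cells.addAll(fillOne.apply(p.shell, p.holes));
        }
        return cells;
    }

    public static void main(String[] args) {
        List<Poly> mp = new ArrayList<>();
        mp.add(new Poly(
                Arrays.asList(new double[] {0, 0}, new double[] {0, 1}, new double[] {1, 0}),
                new ArrayList<List<double[]>>()));
        mp.add(new Poly(
                Arrays.asList(new double[] {5, 5}, new double[] {5, 6},
                        new double[] {6, 6}, new double[] {6, 5}),
                new ArrayList<List<double[]>>()));
        // Stub fill: returns one fake "cell" per polygon (its vertex count).
        Set<Long> cells = fillAll(mp,
                (shell, holes) -> Collections.singletonList(Long.valueOf(shell.size())));
        System.out.println(cells);
    }
}
```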

jonahnio commented 4 years ago

Hi David,

Thanks a lot for the reply. Yes, for the time being I decided to split the multipolygons into individual polygons, and it seems to work fine, as shown below:

[image: result98_split]

Is there a viewer where you can input an H3 index and see the hexagon on a map?

Many thanks

dfellis commented 4 years ago

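So if you edit this tutorial's hexagon JSON object (https://observablehq.com/@nrabinowitz/h3-tutorial-heatmap-rendering?collection=@nrabinowitz/h3-tutorial; key is the H3 index, value is the darkness of the hexagon) you'll get exactly that.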

jonahnio commented 4 years ago

Thanks a lot David

jonahnio commented 4 years ago

Hi all,

I got this error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 8.0 failed 4 times, most recent failure: Lost task 0.3 in stage 8.0 (TID 120, 10.139.64.5, executor 0): org.apache.spark.SparkException: Failed to execute user defined function($anonfun$2: (struct, int) => array)

when trying to convert this polygon to H3 Index

POLYGON ((115.8718476065 -23.8878664705, 115.97983668900001 -23.887845732000002, 115.979710545 -23.577868982000002, 116.0761394205 -23.577868353, 116.07613971250001 -23.602588830000002, 116.136906847 -23.602588256500002, 116.136907139 -23.631462057500002, 116.1906008655 -23.6314620205, 116.1906010845 -23.655243752, 116.297165791 -23.655243974, 116.297164185 -23.5983613765, 116.4868503415 -23.598298735500002, 116.486898266 -23.509780768000002, 116.5983568195 -23.5097810085, 116.59836036 -23.570821518000002, 116.7338597115 -23.5708856205, 116.86451982 -23.570835578, 116.8645266455 -23.8046977595, 117.044741271 -23.8046676045, 117.04183287800001 -23.8002202785, 117.0414171795 -23.7987422395, 117.0465970405 -23.7950094945, 117.0488544925 -23.7927063925, 117.195648258 -23.7929105955, 117.337549162 -23.792975253, 117.479450066 -23.7929094855, 117.62624379500001 -23.792704117, 117.626243284 -23.740030251, 117.71438709750001 -23.740014045000002, 117.7143721325 -23.678736569, 117.743844021 -23.618027302, 117.74384234200001 -23.5328983495, 117.658187062 -23.5328984975, 117.65818709850001 -23.4854946895, 117.6366501275 -23.485494819, 117.63664870400001 -23.437781173, 117.7159803955 -23.437780692, 117.715979264 -23.378367202, 117.8403045785 -23.3783662955, 117.8403048705 -23.393591666, 117.991601349 -23.39358989, 117.99160113 -23.374975338000002, 118.07899165650001 -23.374974635, 118.0789904155 -23.2932868675, 118.167479957 -23.293286997, 118.1674804315 -23.3150998845, 118.2991977625 -23.3151561245, 118.4260367225 -23.315105675, 118.426041358 -23.4403937245, 118.5298695865 -23.252940754, 118.6312922095 -23.0692527, 118.73257846850001 -22.8852432085, 118.835784883 -22.697162681000002, 119.124268911 -22.697161034500002, 119.12426631950001 -22.545784433, 119.0126363985 -22.5457844515, 119.0126362525 -22.465441671, 118.862124451 -22.4654418005, 118.862124378 -22.431966957, 118.75847299200001 -22.431966846, 118.758473065 -22.3634795685, 118.8620903235 -22.363479439, 118.862090287 
-22.3905178365, 119.124262633 -22.390513378, 119.1242620855 -22.3634732045, 119.19270991500001 -22.363471188000002, 119.19270991500001 -22.365294751500002, 119.426817119 -22.3652945665, 119.42681438150001 -22.293565645, 119.3975705815 -22.293565941, 119.39756963250001 -22.2386126715, 119.3976573055 -22.219668912, 119.3792142935 -22.2196688935, 119.379214038 -22.206967552000002, 119.179967327 -22.206968606500002, 119.1799671445 -22.175212024500002, 119.2312940285 -22.1752114695, 119.2312932985 -22.066122575, 118.987668829 -22.0661264785, 118.987668756 -22.0557696605, 118.8095052115 -22.0557719545, 118.809505175 -22.050296398500002, 118.7265785985 -22.050296306, 118.726579949 -22.035708223500002, 118.6973509315 -22.035711442500002, 118.697351041 -21.960299485500002, 118.58843912900001 -21.9603726345, 118.5884426695 -21.8098454605, 118.53182916200001 -21.809845701, 118.531828797 -21.768789687, 118.551070794 -21.7687892245, 118.5510687865 -21.6961713225, 118.502230728 -21.6961715075, 118.502232115 -21.657653786, 118.36970203850001 -21.6576569495, 118.3697015275 -21.68454694, 118.2534862575 -21.6845494005, 118.25348567350001 -21.608060984, 118.199091001 -21.6080611505, 118.19908994250001 -21.530120558, 118.23093422150001 -21.530119152, 118.23093352800001 -21.489966937000002, 118.205115472 -21.489968898, 118.205114304 -21.386950834, 118.1560349075 -21.386951056, 118.1560354915 -21.479249073000002, 118.0853783535 -21.479248777000002, 118.08537806150001 -21.4165663185, 118.1109898925 -21.4165662815, 118.110990367 -21.3451628305, 118.1464804485 -21.3451627565, 118.146481945 -21.2964680925, 118.069525856 -21.29647281, 118.069525564 -21.649400307, 118.1095521575 -21.649399900000002, 118.10955073400001 -21.727517001, 117.9829730775 -21.727518148, 117.98297472 -21.482221949, 117.924058099 -21.482221838, 117.924058172 -21.3715068305, 117.70162078450001 -21.37150696, 117.7016136305 -21.151281961000002, 117.5465971815 -21.151282812, 117.54659845900001 -21.237091269500002, 
117.5496966155 -21.2370911955, 117.549697236 -21.251626627, 117.5465987145 -21.251626775000002, 117.54659732750001 -21.3079656205, 117.407209777 -21.3080240805, 117.220520526 -21.307967119, 117.2205220225 -21.252455575, 117.2932243285 -21.2524556675, 117.29321972950001 -21.184295692, 117.1480006105 -21.1842948965, 117.148001742 -21.25249535, 117.08828927500001 -21.252495017, 117.08828591700001 -21.191630054, 117.1134700775 -21.1916298875, 117.11346967600001 -21.162601297000002, 117.018190003 -21.162630897, 117.018189711 -21.211738, 116.7979864155 -21.211742588, 116.79798324000001 -21.336807287, 116.6252288495 -21.336898584500002, 116.45323125 -21.336813096, 116.45323249100001 -21.3172707875, 116.313981779 -21.317330542500002, 116.1747310305 -21.3172750055, 116.1747333665 -21.3920524495, 116.112533023 -21.3920539665, 116.11254452050001 -21.5312987305, 116.008255954 -21.531300969, 116.00825679350001 -21.5801646675, 115.9607460575 -21.5801646305, 115.96074697 -21.47520279, 115.94614594800001 -21.466064197, 115.88786581400001 -21.440605533, 115.887865741 -21.409352280500002, 115.5372225415 -21.409350967, 115.5364128985 -21.409843770000002, 115.5348817235 -21.409971383000002, 115.53475411950001 -21.4094609865, 115.535774915 -21.408312599000002, 115.537816506 -21.4074194005, 115.53896490550001 -21.406781391, 115.5384544895 -21.4057605795, 115.535774915 -21.406653796500002, 115.5320745085 -21.407546995, 115.5285017425 -21.4086953825, 115.526587719 -21.4104817795, 115.5260773395 -21.4140545735, 115.527608551 -21.4167341505, 115.52965014200001 -21.4180101695, 115.5302438875 -21.418197667, 115.5267036065 -21.421712223, 115.4682997375 -21.3588405465, 115.45750610350001 -21.341803064, 115.451989457 -21.338203334, 115.418001022 -21.3320195425, 115.3333333225 -21.0833333295, 115.439859412 -21.022456452500002, 115.550391121 -20.95916712, 115.6608297915 -20.89580573, 115.771175752 -20.8323726155, 115.6813406015 -20.695377433, 115.59166667150001 -20.558333336500002, 115.438333347 
-20.558333336500002, 115.438333347 -20.4866666675, 115.4200000175 -20.4866666675, 115.4200000175 -20.4340662095, 115.419060617 -20.4380203995, 115.418520782 -20.441492905, 115.418347772 -20.443240378000002, 115.41819673500001 -20.446747053, 115.418328792 -20.450631757, 115.41583701 -20.453158709, 115.4135354295 -20.4558411905, 115.4124593365 -20.457236738, 115.409874005 -20.4610491255, 115.4083549115 -20.463619423, 115.406654559 -20.4662065185, 115.4049322335 -20.469201206, 115.403263855 -20.472626407, 115.401006111 -20.477793087000002, 115.40034378200001 -20.479536693500002, 115.399226079 -20.483086991500002, 115.398772238 -20.484888873, 115.39807760650001 -20.488533817, 115.397850686 -20.490149903, 115.39756456250001 -20.4933985955, 115.397502841 -20.496657962500002, 115.397044401 -20.502887264, 115.3970085215 -20.5065411805, 115.39725500600001 -20.510187845, 115.3978879525 -20.5150320885, 115.398178091 -20.516718493, 115.39894024750001 -20.5200585385, 115.39994188050001 -20.523341974, 115.40179962100001 -20.52861662, 115.4025342565 -20.530348849, 115.40420595650001 -20.5337301865, 115.40613969 -20.536985206, 115.40832487200001 -20.540095907, 115.410438514 -20.5427633295, 115.4127353495 -20.5452928345, 115.385113062 -20.639998793, 115.3797120475 -20.645457921000002, 115.3784562285 -20.6468579085, 115.37611154150001 -20.649782592, 115.3750258125 -20.651303403, 115.37344981550001 -20.6537379845, 115.3694316765 -20.6608007295, 115.36665099700001 -20.666751199, 115.3655106275 -20.66968541, 115.363200214 -20.671780387000002, 115.360795229 -20.6741350855, 115.35855799800001 -20.676632012, 115.35517182000001 -20.6806349495, 115.353043286 -20.683485189, 115.3520591 -20.684960730500002, 115.35112995600001 -20.6864675185, 115.3486413495 -20.6909284975, 115.3448181935 -20.69150024, 115.34104916700001 -20.692331519, 115.33735412600001 -20.693417820500002, 115.334046642 -20.694611718, 115.32931616900001 -20.696654599000002, 115.3275495325 -20.697520769, 115.3241254675 
-20.6994361665, 115.3208628055 -20.7015866805, 115.3179355055 -20.703796413, 115.31532513500001 -20.706020409, 115.3120536035 -20.7080372235, 115.3089799385 -20.710230416999998, 115.3061116225 -20.712580583, 115.30349668950001 -20.7150167555, 115.301066483 -20.7176167455, 115.299924252 -20.7189743495, 115.29774487350001 -20.721844550500002, 115.296714077 -20.723356574, 115.294828268 -20.726479855, 115.293184673 -20.729722017, 115.29179176000001 -20.733066465500002, 115.2906566465 -20.7364961065, 115.2897851725 -20.739993328, 115.289054406 -20.7435169675, 115.2883869305 -20.7460682655, 115.287753765 -20.7491771535, 115.28724491850001 -20.7527625275, 115.2869868635 -20.756284835, 115.2869567875 -20.758050216, 115.287042088 -20.760999819000002, 115.287421761 -20.764830096, 115.284903918 -20.767629257, 115.28261018500001 -20.770593956, 115.2807637595 -20.7733402995, 115.27863752500001 -20.7762863505, 115.2767386855 -20.779366767, 115.2750767675 -20.7825660275, 115.2717152635 -20.7883105365, 115.270227195 -20.7911435895, 115.2682956515 -20.795380885, 115.26659610200001 -20.7998445465, 115.261659112 -20.808074160500002, 115.259192807 -20.812768517000002, 115.2577890535 -20.8158810125, 115.2571444635 -20.817507810000002, 115.2546777205 -20.8248112215, 115.2531780815 -20.829900534500002, 115.251357571 -20.8377489375, 115.2508539075 -20.8409888795, 115.2504895645 -20.844441812, 115.250241182 -20.847914354500002, 115.25013474800001 -20.852809325, 115.25022888149999 -20.8560089, 115.250392876 -20.8582121205, 115.250201616 -20.8627012565, 115.250324256 -20.86743487, 115.250596327 -20.870846825, 115.2514846275 -20.877272115500002, 115.252239703 -20.881299843, 115.25305993100001 -20.884705545, 115.254072806 -20.8879911265, 115.25537213300001 -20.891370984, 115.25698908300001 -20.894816923500002, 115.257899612 -20.8964946885, 115.259812942 -20.899624019, 115.26184639350001 -20.902498364, 115.260460671 -20.9054051395, 115.2592738735 -20.9083888935, 115.25818613700001 
-20.9117119375, 115.257345031 -20.915098899, 115.25675701600001 -20.918533054, 115.2565581275 -20.9202610835, 115.256346245 -20.923306868, 115.256293247 -20.9264348665, 115.25634909200001 -20.928249809, 115.25661032250001 -20.931790783, 115.25530427950001 -20.935886516500002, 115.25455851150001 -20.9388783735, 115.2540094055 -20.9419074155, 115.2535695075 -20.945324199, 115.249463586 -20.946953309, 115.24475650950001 -20.94928664, 115.2414122335 -20.9512799225, 115.239801379 -20.952363079, 115.2367912605 -20.9546163605, 115.2340331015 -20.956985452, 115.232721109 -20.958237495000002, 115.2302430145 -20.9608646245, 115.2278498555 -20.963682859000002, 115.226854829 -20.964982928, 115.2249459155 -20.9677441825, 115.223185849 -20.9706931565, 115.22238584200001 -20.972208214000002, 115.2208808375 -20.975402424000002, 115.2195508505 -20.9787832435, 115.21848370000001 -20.982245148500002, 115.218049861 -20.9840012055, 115.2173849405 -20.9875510595, 115.2169931495 -20.991135823, 115.216900184 -20.9929356695, 115.21689368700001 -20.996357337, 115.2169666505 -20.997975865, 115.21709524 -20.999591359, 115.2175186035 -21.0028064, 115.0859783705 -21.1297786155, 114.98520387800001 -21.226837256, 114.8624929585 -21.3485602815, 114.72990674500001 -21.4797609125, 114.7267585835 -21.482160214500002, 114.49875961800001 -21.5475179395, 114.494870835 -21.5480097805, 114.4916247805 -21.548600559500002, 114.48684401050001 -21.549834491000002, 114.3868530635 -21.571822037, 114.38481388150001 -21.571943064, 114.38130999100001 -21.5723161165, 114.3777919385 -21.57289389, 114.374160006 -21.573710924, 114.431709994 -21.609489998, 114.45819001400001 -21.7068500035, 114.42869998500001 -21.7475599935, 114.359387215 -21.909562773, 114.276641934 -22.174241126000002, 114.405848065 -22.261288880000002, 114.414016108 -22.2704641585, 114.4272273195 -22.277566105000002, 114.5588610655 -22.2777971515, 114.5586571035 -22.504629666, 114.636845761 -22.504691197, 114.63673629750001 -22.668239448, 
114.753994993 -22.668239004, 114.753995504 -22.7053589395, 114.7504502225 -22.716301153, 114.74191780000001 -22.729351941, 114.72468640550001 -22.762813205500002, 114.7014847425 -22.80077219, 114.692168665 -22.824007154, 114.68172309500001 -22.856871627, 114.6786876455 -22.8884672035, 114.671929488 -22.9061727765, 114.709336659 -22.906173239, 114.709336659 -22.9329997375, 114.661675397 -22.933029245, 114.655656182 -22.9487890805, 114.6447713345 -22.968800549, 114.59770411000001 -22.9688006415, 114.5977046575 -23.0182921375, 114.479395346 -23.018293136500002, 114.47938585600001 -23.073805901500002, 114.5450173455 -23.073724020500002, 114.545156593 -23.129541314, 114.6731146795 -23.129382547000002, 114.6733411255 -23.220085568000002, 114.63206046500001 -23.2200347855, 114.6320122485 -23.3310205035, 114.631929649 -23.331020448, 114.6319053035 -23.389711217000002, 114.7373841715 -23.3897158975, 114.737210468 -23.452349627, 115.04598105000001 -23.4523437625, 115.1245656595 -23.452278772, 115.19554447200001 -23.452335345, 115.1955434865 -23.3920611055, 115.478007825 -23.392059440500002, 115.4781186755 -23.484731213, 115.525725808 -23.484697321000002, 115.52572686650001 -23.5754168255, 115.6047008215 -23.5754152715, 115.6048361635 -23.67534785, 115.62692746 -23.690297977500002, 115.68953966950001 -23.7357518485, 115.793236498 -23.838567115500002, 115.8307956915 -23.880689081, 115.8718476065 -23.8878664705))

[image: lga_452]

Thanks a lot

jonahnio commented 4 years ago

poly_error.txt

isaacbrodsky commented 4 years ago

Is there perhaps another error message in the executor log which indicates why the user-defined function wasn't executed?

dfellis commented 4 years ago

@isaacbrodsky this line in the posted txt file looks relevant:

Caused by: java.lang.NegativeArraySizeException
    at com.uber.h3core.H3Core.polyfill(H3Core.java:689)

isaacbrodsky commented 4 years ago

Oops, I missed that. Thanks for pointing it out.

The line that caused the exception reads:

        int sz = h3Api.maxPolyfillSize(verts, holeSizes, holeVerts, res);

        long[] results = new long[sz]; // <-- line 689

        h3Api.polyfill(verts, holeSizes, holeVerts, res, results);

I'm pretty sure this is happening because the number of possible cells returned by the polyfill exceeds 2^31. Unfortunately the library does not have very good support for polyfilling very large polygons, both because of the polyfill algorithm issue @dfellis mentioned, and the library not correctly handling cases like this.
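To illustrate how a size estimate can end up negative, here is a small standalone sketch. It assumes the estimate is a k-ring size, 3k(k+1)+1, computed in 32-bit arithmetic; whether the overflow happens in exactly this way inside the library is an assumption, and the class and method names are invented. It also shows a guard that rejects sizes which cannot be a Java array length.

```java
// Demonstrates 32-bit wrap-around of a size estimate and a defensive guard.
// That the library's estimate overflows in exactly this way is an assumption.
class SizeOverflow {
    // k-ring size in 32-bit arithmetic: wraps silently once it passes 2^31 - 1.
    static int kRingSizeInt(int k) {
        return 3 * k * (k + 1) + 1;
    }

    // The same formula in 64-bit arithmetic.
    static long kRingSizeLong(long k) {
        return 3 * k * (k + 1) + 1;
    }

    // Rejects sizes that cannot be used as a Java array length.
    static long[] allocate(long size) {
        if (size < 0 || size > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("size does not fit a Java array: " + size);
        }
        return new long[(int) size];
    }

    public static void main(String[] args) {
        int k = 30000;
        System.out.println("64-bit size: " + kRingSizeLong(k));
        System.out.println("32-bit size: " + kRingSizeInt(k));
        try {
            long[] results = new long[kRingSizeInt(k)];
            System.out.println(results.length);
        } catch (NegativeArraySizeException e) {
            System.out.println("caught NegativeArraySizeException");
        }
    }
}
```

A guard like `allocate` would turn the opaque exception into a message saying the polygon is too large for the chosen resolution.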

I'd suggest processing large polygons in Spark by first splitting the polygon into smaller, manageable chunks (e.g. with something like https://gis.stackexchange.com/questions/189976/jts-split-arbitrary-polygon-by-a-line - split each polygon into tenth degree by tenth degree chunks) and then performing the polyfill. Attempting to polyfill large geometries in a single go in Spark may not work very well because each executor would need to have a very large amount of memory, which generally won't scale or work as well.
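The tiling half of that suggestion can be sketched with the standard library alone. The class below is hypothetical: it only cuts a bounding box into tiles of at most `step` degrees per side. Intersecting each tile with the polygon (for instance with JTS `Geometry.intersection`) and polyfilling the pieces are left to a geometry library and are not shown.

```java
import java.util.ArrayList;
import java.util.List;

// Cuts a bounding box into a grid of tiles. Intersecting each tile with the
// actual polygon is assumed to be done elsewhere by a geometry library.
class BBoxTiler {
    // Returns tiles as {minX, minY, maxX, maxY}, each at most step wide and tall.
    static List<double[]> tiles(double minX, double minY,
                                double maxX, double maxY, double step) {
        if (step <= 0 || maxX < minX || maxY < minY) {
            throw new IllegalArgumentException("invalid bounding box or step");
        }
        int nx = Math.max(1, (int) Math.ceil((maxX - minX) / step));
        int ny = Math.max(1, (int) Math.ceil((maxY - minY) / step));
        List<double[]> out = new ArrayList<>();
        for (int i = 0; i < nx; i++) {
            for (int j = 0; j < ny; j++) {
                double x0 = minX + i * step;
                double y0 = minY + j * step;
                // Clamp the last row and column to the box edge.
                out.add(new double[] {
                        x0, y0, Math.min(x0 + step, maxX), Math.min(y0 + step, maxY)});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // A 2 x 1 degree box cut into half-degree tiles.
        List<double[]> grid = tiles(0, 0, 2, 1, 0.5);
        System.out.println(grid.size() + " tiles");
    }
}
```

Each tile can then be handled as its own Spark row, so no single executor has to hold the cells for the whole polygon. Collecting the results into a set removes any cells reported by two neighbouring tiles.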

dfellis commented 4 years ago

That means it needed more than 2 billion hexagons to test for membership in the geofence! That's wild. @isaacbrodsky is that a limit for arrays in Java in general? The request isn't completely absurd: that limit corresponds to approximately 16GB of contiguous memory, which is possible on modern servers where RAM can be over 100GB, so the limit is restrictive (like in this situation) when it doesn't need to be.

isaacbrodsky commented 4 years ago

@dfellis the length of a Java array is represented by a signed, 32-bit integer, so it cannot exceed 2,147,483,647. I'm pretty sure it can be worked around inside the Java library by not using Java arrays directly.
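One general way to avoid a single Java array is to spread the values over several smaller arrays and index them with a `long`. The class below is a hypothetical sketch of that technique only; it says nothing about how the H3 Java bindings would pass such a structure to native code.

```java
// A long-indexed container backed by several ordinary arrays.
// Hypothetical sketch; not part of the H3 Java library.
class ChunkedLongArray {
    private final long[][] chunks;
    private final long size;
    private final int chunkSize;

    ChunkedLongArray(long size, int chunkSize) {
        if (size < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("size must be >= 0 and chunkSize > 0");
        }
        this.size = size;
        this.chunkSize = chunkSize;
        // Number of chunks, rounded up; fails loudly if even that overflows an int.
        int n = Math.toIntExact((size + chunkSize - 1) / chunkSize);
        chunks = new long[n][];
        for (int i = 0; i < n; i++) {
            long remaining = size - (long) i * chunkSize;
            chunks[i] = new long[(int) Math.min(chunkSize, remaining)];
        }
    }

    long size() {
        return size;
    }

    long get(long index) {
        checkIndex(index);
        return chunks[(int) (index / chunkSize)][(int) (index % chunkSize)];
    }

    void set(long index, long value) {
        checkIndex(index);
        chunks[(int) (index / chunkSize)][(int) (index % chunkSize)] = value;
    }

    private void checkIndex(long index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + " out of range");
        }
    }

    public static void main(String[] args) {
        // Small sizes for demonstration; real use would pick a large chunk size.
        ChunkedLongArray cells = new ChunkedLongArray(10, 4);
        cells.set(9, 42L);
        System.out.println(cells.get(9) + " of " + cells.size());
    }
}
```

The total memory needed is unchanged, so this removes the 2^31 length limit but not the cost of holding billions of candidate cells.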

jonahnio commented 4 years ago

Hi Isaac and David, thanks so much for the help. You're right, the polygon is too big for res = 12. I used res = 10 and it ran successfully.

Thanks