SpatioTemporal / STARE

SpatioTemporal Adaptive Resolution Encoding, a unifying index for geo-located data.
Other
10 stars 7 forks source link

NonConvex hull flawed at boundaries #105

Open NiklasPhabian opened 3 years ago

NiklasPhabian commented 3 years ago

Consider the canton Tilarán in Costa Rica.

When we lookup the SIDs for the nonconvex hull (ring) at level 12, we get the following: image

However, we should get the following. image

It would seem like the sids/trixels on the boundary are treated differently and are not getting dissolved to sids/trixels inside the hull.

In pystare, we can achieve a workaround by throwing all the SIDs into a compressed range and expanding the intervals. But it seems like STARE should have produced the compressed range when looking up the SIDs for the nonconvex hull.

s_range = pystare.to_compressed_range(sids)
expanded = pystare.expand_intervals(s_range, -1, multi_resolution=True)

latitudes and longitudes for the boundary nodes of Tilarán of the above example are:

lon =
array([-85.05302284, -85.04985684, -85.048693  , -85.04581243,
       -85.03830776, -85.02631278, -85.01450956, -85.00872611,
       -84.99967414, -84.97818575, -84.96099141, -84.94492526,
       -84.93301499, -84.91281083, -84.89230342, -84.87637997,
       -84.86385434, -84.83998031, -84.81881745, -84.80287169,
       -84.78683681, -84.78713106, -84.78621694, -84.7858112 ,
       -84.78353255, -84.78040227, -84.7794882 , -84.77995189,
       -84.78271657, -84.79163924, -84.80622044, -84.82970213,
       -84.84403364, -84.85562289, -84.87043601, -84.88543644,
       -84.89975903, -84.91236046, -84.91586976, -84.92411915,
       -84.92997835, -84.94509027, -84.95679537, -84.97966613,
       -84.9962763 , -85.02011016, -85.02806078, -85.03598901,
       -85.03522654, -85.03063366, -85.0318376 , -85.02786452,
       -85.03998442, -85.05302284])

lat = 
array([10.53049262, 10.55936978, 10.58398848, 10.63505855, 10.63615554,
       10.63887553, 10.64879706, 10.65794272, 10.63450562, 10.60838865,
       10.59715617, 10.58199974, 10.56463154, 10.56456913, 10.56423019,
       10.55536106, 10.54880169, 10.52366129, 10.50953484, 10.49977391,
       10.49750424, 10.48580355, 10.47353652, 10.47357221, 10.46301754,
       10.44553789, 10.43562525, 10.41881895, 10.39774964, 10.3936116 ,
       10.38571898, 10.37309086, 10.36760165, 10.36099331, 10.36239345,
       10.37098167, 10.37659123, 10.38644135, 10.39384797, 10.39752225,
       10.39077118, 10.38136249, 10.37894119, 10.37712186, 10.37816975,
       10.37903033, 10.39470858, 10.42088795, 10.43799754, 10.45821507,
       10.48127308, 10.50238247, 10.51614325, 10.53049262])
NiklasPhabian commented 3 years ago

the wrong set of SIDs is:

array([3391927950747107338, 3391927538430246923, 3391928775380828171,
       3391929050258735115, 3391929325136642059, 3391929462575595531,
       3392919435357454347, 3392919985113268235, 3392920259991175179,
       3392920397430128651, 3392920672308035595, 3392920947185942539,
       3392921084624896011, 3392921222063849483, 3391924102456410124,
       3391927435351031820, 3391927675869200396, 3391927744588677132,
       3391927778948415500, 3391927813308153868, 3391927882027630604,
       3391927916387368972, 3391928500502921228, 3391928534862659596,
       3391928569222397964, 3391928603582136332, 3391928637941874700,
       3391928672301613068, 3391928706661351436, 3391928741021089804,
       3391928912819781644, 3391928947179520012, 3391928981539258380,
       3391929015898996748, 3391929187697688588, 3391929222057426956,
       3391929256417165324, 3391929290776903692, 3391929600014549004,
       3391929634374287372, 3391929703093764108, 3391929737453502476,
       3391929771813240844, 3391929806172979212, 3391929840532717580,
       3391930012331409420, 3391930081050886156, 3391930115410624524,
       3391930149770362892, 3391930218489839628, 3391930252849577996,
       3391930424648269836, 3391930459008008204, 3391930527727484940,
       3391930596446961676, 3391931524159897612, 3391931592879374348,
       3391931627239112716, 3391945817811058700, 3391945852170797068,
       3391945920890273804, 3392917511212105740, 3392917545571844108,
       3392917614291320844, 3392919229199024140, 3392919297918500876,
       3392919332278239244, 3392919400997715980, 3392919572796407820,
       3392919607156146188, 3392919675875622924, 3392919778954838028,
       3392919847674314764, 3392919882034053132, 3392919916393791500,
       3392919950753529868, 3392920122552221708, 3392920156911960076,
       3392920191271698444, 3392920225631436812, 3392920534869082124,
       3392920569228820492, 3392920603588558860, 3392920637948297228,
       3392920809746989068, 3392920844106727436, 3392920878466465804,
       3392920912826204172, 3392924658037686284, 3392924692397424652,
       3392924726757163020, 3392924761116901388, 3392924864196116492,
       3392925139074023436, 3392928231450476556, 3392929090443935756,
       3392929193523150860, 3392929227882889228, 3392929262242627596,
       3392929296602365964, 3392929468401057804, 3392929537120534540,
       3392929571480272908, 3392929743278964748])

The correct set of sids is

array([3391924102456410124, 3391927435351031820, 3391927538430246923,
       3391927675869200396, 3391927744588677132, 3391927778948415500,
       3391927813308153868, 3391927882027630604, 3391927916387368972,
       3391927950747107338, 3391928500502921226, 3391929050258735114,
       3391929600014549004, 3391929634374287372, 3391929703093764108,
       3391929737453502475, 3391930012331409420, 3391930081050886156,
       3391930115410624524, 3391930149770362892, 3391930218489839628,
       3391930252849577996, 3391930424648269836, 3391930459008008204,
       3391930527727484940, 3391930596446961676, 3391931524159897612,
       3391931592879374348, 3391931627239112716, 3391945817811058700,
       3391945852170797068, 3391945920890273804, 3392917511212105740,
       3392917545571844108, 3392917614291320844, 3392919229199024140,
       3392919297918500876, 3392919332278239244, 3392919400997715980,
       3392919435357454347, 3392919572796407820, 3392919607156146188,
       3392919675875622924, 3392919778954838028, 3392919847674314763,
       3392919985113268235, 3392920122552221707, 3392920259991175178,
       3392920809746989066, 3392924658037686283, 3392924864196116492,
       3392925139074023436, 3392928231450476556, 3392929090443935756,
       3392929193523150859, 3392929468401057804, 3392929537120534540,
       3392929571480272908, 3392929743278964748])
NiklasPhabian commented 3 years ago

this does not just affect NonConvexHull, but also ConvexHull

kuoks commented 3 years ago

I believe the boundary trixels are indeed special! We have trixels for the interior of a region and trixels for the exterior. These boundary trixels do not belong to either group, because only part of them is in the interior. Therefore, I believe it should stay this way so one could quickly outline the region using these boundary trixels.

Sent from my iPhone

On Oct 20, 2021, at 5:33 PM, Niklas Griessbaum @.***> wrote:

 this does not just affect NonConvexHull, but also ConvexHull

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or unsubscribe. Triage notifications on the go with GitHub Mobile for iOS or Android.

NiklasPhabian commented 3 years ago

I believe the boundary trixels are indeed special!

You are right. But this is an orthogonal discussion. Yes; we need methods to get the boundary SIDs. But if we ask STARE for the cover of a ring/hull, STARE should give us back the most compact SID cover. We are wasting space AND CPU cycles if we represent a cover by more SIDs than minimally needed.

I would assume that the current behavior is legacy-induced. HTM produced 3 sets of SIDs for a hull (we didn't have rings back then):

I'd assume that we get the boundary and exterior for free when we calculate the interior which causes this behavior. But this does not make it intuitive for the user.

These boundary trixels do not belong to either group, because only part of them is in the interior.

Yes and no. If we try to find the minimum set of SIDs at a given level that cover the ring, then they do very much belong to that set. If we talk about SIDs that are contained within the ring, then we should not return the SIDs on the boundary at all. All we do right now though is return them at the wrong (i.e. not minimal) resolution.

one could quickly outline the region using these boundary trixels.

Well. But we cannot. They are indistinguishable hiding in the set. Only once plotted it becomes obvious that those are indeed special.

kuoks commented 3 years ago

I had a talk with Mike about this. I agree with you that if one wants strictly the cover/boundary/interior they should get the cover/boundary/interior. Mike has a solution. He should be able to address this shortly.

On Oct 22, 2021, at 3:46 PM, Niklas Griessbaum @.***> wrote:

I believe the boundary trixels are indeed special! You are right. But this is an orthogonal discussion. Yes; we need methods to get the boundary SIDs.

But if we ask STARE for the cover of a ring/hull, STARE should give us back the most compact SID cover. We are wasting space AND CPU cycles if we represent a cover by more SIDs than minimally needed.

I would assume that the current behavior is legacy-induced. HTM produced 3 sets of SIDs for a hull (we didn't have rings back then):

the interior the boundary the exterior I'd assume that we get the boundary and exterior for free when we calculate the interior which causes this behavior. But this does not make it intuitive for the user. These boundary trixels do not belong to either group, because only part of them is in the interior.

Yes and no. If we try to find the minimum set of SIDs at a given level that cover the ring, then they do very much belong to that set. If we talk about SIDs that are contained within the ring, then we should not return the SIDs on the boundary at all. All we do right now though is return them at the wrong (i.e. not minimal) resolution.

one could quickly outline the region using these boundary trixels.

Well. But we cannot. They are indistinguishable hiding in the set. Only once plotted this becomes obvious.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/SpatioTemporal/STARE/issues/105#issuecomment-949916977, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABPQLX2D43DKSIF3N6EEPZTUIG5SNANCNFSM5GMSHMMA. Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

michaelleerilee commented 3 years ago

I think Kuo is on to the root cause here. There's a low-level function in the libray that determines whether a trixel is interior, exterior, or on the boundary. I believe that's leaking through here. In one of the functions, the code returns the union of the boundary and the interior. Maybe we should expose this lower-level control, i.e. return the interior, exterior, and boundary as requested.