turf-junkyard / turf-simplify

simplify geographic shapes
11 stars 5 forks source link

Guarantee that a [valid ]Polygon is always returned #6

Closed nevi-me closed 8 years ago

nevi-me commented 9 years ago

I had a look at https://github.com/mourner/simplify-js to see if I should raise the issue there, but I think this would be the correct place to log it.

In some instances, a GeoJSON Polygon with too little coordinates on the ring (<4) is returned, resulting in an error. It would be great if we detect this at simplifying, and adjust the tolerance so that a valid GeoJSON Polygon's still returned.

Example, simplifying the below Polygon will return an invalid Polygon, but will return a triangle (valid Polygon) only at a tolerance of 0.0052.

I realise this has to do with the size of the polygon, and that the developer can choose not to simplify if the combination of turf.area and Polygon.coordinates[n].length are the right mix, but I don't think I can come up with the right mix (I tried it before logging the issue).

What I would suggest is at the least checking if an invalid Polygon is returned (coordinates < 4), and adjusting tolerance, or just returning the input Feature as is.

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {},
      "geometry": {
        "type": "Polygon",
        "coordinates": [
          [
            [
              26.148429528000065,
              -28.29755210099995
            ],
            [
              26.148582685000065,
              -28.29778390599995
            ],
            [
              26.149207731000047,
              -28.29773837299996
            ],
            [
              26.14925541100007,
              -28.297771688999944
            ],
            [
              26.149255844000038,
              -28.297773261999964
            ],
            [
              26.149276505000046,
              -28.29784835099997
            ],
            [
              26.14928482700003,
              -28.29787859399994
            ],
            [
              26.14928916200006,
              -28.29800647199994
            ],
            [
              26.14931069800008,
              -28.298641791999955
            ],
            [
              26.149339971000074,
              -28.298641232999955
            ],
            [
              26.151298488000066,
              -28.29860385099994
            ],
            [
              26.151290002000053,
              -28.298628995999934
            ],
            [
              26.151417002000073,
              -28.299308003999954
            ],
            [
              26.15159000400007,
              -28.299739003999946
            ],
            [
              26.151951998000072,
              -28.30051100299994
            ],
            [
              26.15206407200003,
              -28.30076885099993
            ],
            [
              26.152066543000046,
              -28.30077453499996
            ],
            [
              26.151987021000025,
              -28.300799009999935
            ],
            [
              26.149896693000073,
              -28.301442350999935
            ],
            [
              26.150354333000053,
              -28.30260575099993
            ],
            [
              26.14914131000006,
              -28.302975170999957
            ],
            [
              26.14836387300005,
              -28.302853868999932
            ],
            [
              26.147575408000023,
              -28.30269948399996
            ],
            [
              26.146257624000043,
              -28.302462392999928
            ],
            [
              26.14557943400007,
              -28.302181192999967
            ],
            [
              26.145492669000078,
              -28.302154609999945
            ],
            [
              26.144921243000056,
              -28.303395982999973
            ],
            [
              26.14482272200007,
              -28.30455853999996
            ],
            [
              26.14431040900007,
              -28.30451913099995
            ],
            [
              26.14429070400007,
              -28.304144747999942
            ],
            [
              26.143837504000032,
              -28.304144747999942
            ],
            [
              26.143613499000026,
              -28.304592757999956
            ],
            [
              26.14346312200007,
              -28.304893512999968
            ],
            [
              26.143260178000048,
              -28.304893512999968
            ],
            [
              26.143246374000057,
              -28.304893512999968
            ],
            [
              26.143147852000027,
              -28.304893512999968
            ],
            [
              26.14295080900007,
              -28.304834399999947
            ],
            [
              26.14200500000004,
              -28.30449942699994
            ],
            [
              26.14198529600003,
              -28.304420608999976
            ],
            [
              26.141525339000054,
              -28.304298579999966
            ],
            [
              26.141019783000047,
              -28.30416445299994
            ],
            [
              26.141118305000077,
              -28.304637356999933
            ],
            [
              26.140940966000073,
              -28.30512996599998
            ],
            [
              26.140376789000072,
              -28.306172836999963
            ],
            [
              26.140476282000066,
              -28.30621363399996
            ],
            [
              26.14041675800007,
              -28.306326533999936
            ],
            [
              26.140146555000058,
              -28.30640398099996
            ],
            [
              26.140073975000064,
              -28.306410747999962
            ],
            [
              26.137315367000042,
              -28.305189078999945
            ],
            [
              26.136645419000047,
              -28.304854104999947
            ],
            [
              26.135719315000074,
              -28.30451913099995
            ],
            [
              26.135515376000058,
              -28.304330879999952
            ],
            [
              26.13546315800005,
              -28.304282678999982
            ],
            [
              26.13558800000004,
              -28.30419999999998
            ],
            [
              26.137463000000025,
              -28.30242899999996
            ],
            [
              26.13794500000006,
              -28.30202799999995
            ],
            [
              26.13796479100006,
              -28.30201049699997
            ],
            [
              26.13798299700005,
              -28.302025000999947
            ],
            [
              26.139450004000025,
              -28.30074499999995
            ],
            [
              26.141302000000053,
              -28.29914199999996
            ],
            [
              26.141913997000074,
              -28.29862600399997
            ],
            [
              26.14212216900006,
              -28.29845037299998
            ],
            [
              26.144304360000035,
              -28.296499429999983
            ],
            [
              26.144799071000023,
              -28.29614006399993
            ],
            [
              26.145209090000037,
              -28.295759748999956
            ],
            [
              26.145465732000048,
              -28.295507246999932
            ],
            [
              26.14575028200005,
              -28.295352539999953
            ],
            [
              26.14589208800004,
              -28.295275441999934
            ],
            [
              26.146584820000044,
              -28.295135245999973
            ],
            [
              26.146587504000024,
              -28.295134702999974
            ],
            [
              26.146827588000065,
              -28.295606591999956
            ],
            [
              26.14685742000006,
              -28.29565372899998
            ],
            [
              26.14691261200005,
              -28.29574093599996
            ],
            [
              26.147077344000024,
              -28.296001226999977
            ],
            [
              26.147117344000037,
              -28.296041226999932
            ],
            [
              26.147907966000048,
              -28.29696016899993
            ],
            [
              26.147913396000035,
              -28.296966331999954
            ],
            [
              26.148429528000065,
              -28.29755210099995
            ]
          ]
        ]
      }
    }
  ]
}
nevi-me commented 9 years ago

I just cloned the repo, and noticed that work has partly been done for this on a32e8a15f7fc5916c2e580893010b67a6c9dadbf. It solves the case of a valid Polygon always being returned, but I would like it if we still explore the option of safeguarding against over-simplification if the Polygon's area is too small to warrant useful simplifying, or if the tolerance is suboptimal for the polygon.

@frankrowe what's your view on this? I see it relate to Turfjs/turf#247, which I also left a comment on.

frankrowe commented 9 years ago

I think the best case would be to return the most simplified polygon possible (i.e. a triangle) if the tolerance is high enough. This is what QGIS and others do iirc.

My fix just makes sure a valid polygon by the geojson spec is returned, not that it's necessarily a triangle.

The only fix I can think of right now is to check if the ring is simplified to two points, and if it is, reduce tolerance until the ring is simplified to three points. You could then return a valid triangle. Seems like an inefficient fix though.

frankrowe commented 9 years ago

So something like this works: 084cc96b21ea5fae42ef937f4c6c7ff073826ff4

It reduces your polygon to a triangle https://github.com/frankrowe/turf-simplify/blob/oversimplification-fix/test/fixtures/out/simple_out.geojson

It just reduces tolerance by 1 percent until a valid triangle is made. Not sure if best solution, thoughts?

nevi-me commented 9 years ago

:+1: It's a simple solution which will work, and it won't get stuck in an infinite loop as long as a valid polygon's supplied. @morganherlocker I noticed that we don't check the validity of the GeoJSON object supplied, so for example with @frankrowe 's fix, if I were to supply the below Feature, simplify gets stuck in an infinite loop. Any suggestions on what we could do? I haven't been following Turf for much, so I don't know if the onus rests on the developer to not shoot themselves in the foot.

{"type":"Feature","geometry":{"type":"Polygon","coordinates":[[[26.148429528000065,-28.29755210099995],[26.13546315800005,-28.304282678999982]]]},"properties":{}}
frankrowe commented 9 years ago

We could just check that the original polygon has at least 4 vertices. It would be nice if Turf validated features though. Could use something like geojsonhint to do that.

nevi-me commented 9 years ago

Do you know what expected behavior is in the rest of Turf? Do we throw an error on invalid inputs, or return the input without applying any function to it?

I use https://Github.com/rwt-to/GeoJSON-Tools internally on my projects to validate GeoJSON and various sundry things. I can add the basic checks type and coordinates in a separate PR if we want to keep turf-simplify with as few dependencies as possible.

morganherlocker commented 9 years ago

Do you know what expected behavior is in the rest of Turf? Do we throw an error on invalid inputs, or return the input without applying any function to it?

Generally, Turf assumes that the inputs are valid. This is important for code in hot loops, and performance intensive applications where millions or billions of features are being manipulated. It is a tradeoff between convenience and the range of practical uses for the library.

If I need to check if a feature is valid, I typically check for a specific issue, or run the feature through geojson-hint. The overhead of this validation is more than you sometimes want inside of an algorithm, but it can be worth it when the input comes from an unpredictable source. Also, using validation in an "as needed" fashion allows people to avoid checks against features they already know are valid.

It just reduces tolerance by 1 percent until a valid triangle is made. Not sure if best solution, thoughts?

My only concern is that this might negatively affect performance in unpredictable ways, but I'm not sure if I can think of an alternative. simplify-js is already very fast, so it might not matter in this case. Let's definitely run benchmarks (node bench.js) to make sure non of the changes have too large of an effect.

As far as functionality goes, I think this is great. Returning triangles (as opposed to the original features, or undefined) is probably the most intuitive option.

frankrowe commented 9 years ago

I added a check to my branch to check each ring has at least 4 points, so the loop won't get stuck on invalid polygons. 3127708ccf19195d17527f1a6b4d7f53fc1600fb

With any kind of normal tolerance (< 1) benchmarks are normal. High tolerances cause ops/sec to drop, but even with a tolerance of over 10000, I was able to get 1000+ ops/sec.

nevi-me commented 9 years ago

Generally, Turf assumes that the inputs are valid. This is important for code in hot loops, and performance intensive applications where millions or billions of features are being manipulated. It is a tradeoff between convenience and the range of practical uses for the library.

I think we can continue with assuming that inputs are valid, users can do their validations externally if they use untrusted data.

What @frankrowe has added to immediately return if length < 4 will solve the performance problem to an extent, I'll check the ops/sec before and after to see, but if we bench with the current test polygons, we'll never execute the reduction of tolerance as we'd already be returning valid polygons, so perhaps we can bench on the specific tests that @frankrowe will be adding?

nevi-me commented 8 years ago

closed by merge of #7