stactools-packages / sentinel2

stactools package for Sentinel-2

Prune very large polygons #113

Closed matthewhanson closed 12 months ago

matthewhanson commented 1 year ago

Some generated Items have unusually large polygons.

As an example, here is a tileInfo with a large tileDataGeometry (over 3000 points):

https://roda.sentinel-hub.com/sentinel-s2-l2a/tiles/37/W/EP/2019/3/16/0/tileInfo.json

But it is not actually a complex-looking polygon:

image

Here are some additional examples:

https://roda.sentinel-hub.com/sentinel-s2-l2a/tiles/7/V/EF/2019/3/16/0/tileInfo.json

https://roda.sentinel-hub.com/sentinel-s2-l2a/tiles/27/W/VP/2019/3/26/0/tileInfo.json

https://roda.sentinel-hub.com/sentinel-s2-l2a/tiles/27/W/VP/2019/3/16/0/tileInfo.json
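
For anyone who wants to check other tiles, a rough sketch of counting the points in a tileInfo geometry might look like the following. It assumes the tileDataGeometry value is a GeoJSON-style Polygon or MultiPolygon; the `count_vertices` helper and the inline sample document are hypothetical, and the coordinates are made up rather than taken from a real tile.

```python
import json


def count_vertices(geometry):
    """Count coordinate pairs in a GeoJSON-style Polygon or MultiPolygon."""
    if geometry["type"] == "Polygon":
        polygons = [geometry["coordinates"]]
    elif geometry["type"] == "MultiPolygon":
        polygons = geometry["coordinates"]
    else:
        raise ValueError(f"unsupported geometry type: {geometry['type']}")
    # Each polygon is a list of rings; each ring is a list of [x, y] pairs.
    return sum(len(ring) for polygon in polygons for ring in polygon)


# Hypothetical, heavily shortened stand-in for a tileInfo.json document.
# A real affected tile would have thousands of pairs in the ring.
tile_info = json.loads(
    """
    {
      "tileDataGeometry": {
        "type": "Polygon",
        "coordinates": [
          [[500000.0, 7300000.0], [540000.0, 7300000.0],
           [545000.0, 7400000.0], [541000.0, 7400000.0],
           [500000.0, 7300000.0]]
        ]
      }
    }
    """
)

vertex_count = count_vertices(tile_info["tileDataGeometry"])
print(vertex_count)
```

Running the same helper over a downloaded tileInfo.json (loaded with `json.load`) would give the point count for that tile.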

matthewhanson commented 1 year ago

There are currently 59291 scenes affected by this problem, which makes it the largest source of failures when running against the entire archive.

gadomski commented 1 year ago

There's a tolerance argument that can be used to simplify the output geometry: https://github.com/stactools-packages/sentinel2/blob/e38be03164cee8f1111e93ea4711943d799e74fb/src/stactools/sentinel2/stac.py#L100

In local testing, I was able to get down to a simple five-point geometry via --tolerance on the command line:

$ stac sentinel2 create-item . . --tolerance 0.01 2>/dev/null
$ jq .geometry.coordinates S2B_OPER_MSI_L2A_TL_SGS__20190316T131240_A010572_T37WEP.json 
[
  [
    [
      40.539615644930294,
      65.72605561889523
    ],
    [
      41.39255666409265,
      65.71506827539704
    ],
    [
      41.487571441091816,
      66.69921264081667
    ],
    [
      41.41469787851801,
      66.69998805472711
    ],
    [
      40.539615644930294,
      65.72605561889523
    ]
  ]
]


So I think this issue is invalid, since the simplification is already supported, but @matthewhanson can you confirm?

matthewhanson commented 1 year ago

The default is 0.0001:

https://github.com/stactools-packages/sentinel2/blob/e38be03164cee8f1111e93ea4711943d799e74fb/src/stactools/sentinel2/constants.py#L31C4-L31C41

Should we change the default?

gadomski commented 1 year ago

I guess so, I don't know what the "right" tolerance is for the simplification 🤷🏼 ... if the geometries are mostly/entirely straight lines, then 0.01 might be a good answer?
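
To make the straight-line point concrete, here is a minimal sketch of how the tolerance changes the point count on a ring whose edges are nearly straight. It is not the package's implementation: it uses a small pure-Python Douglas-Peucker-style routine as a stand-in, and the `noisy_ring` and `simplify_ring` helpers, the square footprint, and the 0.001-degree wobble are all assumptions chosen for illustration. The tolerance is taken to be in the same units as the coordinates (degrees here).

```python
import math


def point_segment_distance(p, a, b):
    """Distance from p to segment a-b (distance to a if a == b)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify_ring(points, tolerance):
    """Douglas-Peucker-style simplification, iterative to avoid deep recursion."""
    keep = [False] * len(points)
    keep[0] = keep[-1] = True
    stack = [(0, len(points) - 1)]
    while stack:
        start, end = stack.pop()
        max_dist, max_idx = 0.0, None
        for i in range(start + 1, end):
            d = point_segment_distance(points[i], points[start], points[end])
            if d > max_dist:
                max_dist, max_idx = d, i
        # Keep the farthest point only if it deviates more than the tolerance.
        if max_idx is not None and max_dist > tolerance:
            keep[max_idx] = True
            stack.append((start, max_idx))
            stack.append((max_idx, end))
    return [p for p, kept in zip(points, keep) if kept]


def noisy_ring(corners, points_per_edge, amplitude):
    """Densify each edge and add a small wobble that vanishes at the corners."""
    ring = []
    for k in range(len(corners)):
        (x0, y0), (x1, y1) = corners[k], corners[(k + 1) % len(corners)]
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy)
        nx, ny = -dy / length, dx / length  # unit normal to the edge
        for i in range(points_per_edge):
            t = i / points_per_edge
            offset = amplitude * math.sin(math.pi * t) * math.sin(40.0 * math.pi * t)
            ring.append((x0 + t * dx + offset * nx, y0 + t * dy + offset * ny))
    ring.append(ring[0])  # close the ring
    return ring


# Hypothetical square footprint in degrees, not a real tile.
corners = [(40.0, 65.0), (41.0, 65.0), (41.0, 66.0), (40.0, 66.0)]
ring = noisy_ring(corners, points_per_edge=800, amplitude=0.001)

fine = simplify_ring(ring, 0.0001)   # the current default mentioned above
coarse = simplify_ring(ring, 0.01)   # the value used in the local test

print(len(ring), len(fine), len(coarse))
```

In this synthetic case the wobble is larger than 0.0001 but smaller than 0.01, so the default tolerance keeps many of the intermediate points while 0.01 collapses the ring to its four corners plus the closing point. Whether real tiles behave the same way depends on how far their edges actually stray from straight lines.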

philvarner commented 12 months ago

https://github.com/stactools-packages/sentinel2/pull/129