Closed HansVRP closed 2 weeks ago
This issue is related to Cropsar. Specifically when there are multiple load_collections in one process graph.
j-24051737210441be92441beedd8ab5ce (60 polygons) j-240517e1cd064f9f88642efee3e8343a (6 polygons)
when a multipolygon job is ran, and one or more have an invalid geometry, or no data available, the entire job fails
as discussed, there are actually two independent and unrelated issues here:
NoDataAvailable
,
In terms of the feature collection, I am not sure if it is indeed the best default behavior.
If you do have a feature collection consisting out op 9 valid and 1 invalid polygon. Will it start processing the valid ones until it hits the invalid one and then fail the entire job?
With a feature collection, the processing of the geometries typically happens in parallel, and the failure will/should even happen before actual processing is done, so it's not that you lose results that already have been calculated
Note that for the 'empty datacube' issue, the error is thrown by the sentinelhub data loader. This was also an issue for cropsar 2, until someone realized that it did not occur on CDSE, and that actually everything was running faster there anyway. So maybe question is also if we can perhaps run cropsar1 on cdse as well?
@HansVRP do we have a job id or stack trace for the case where there's actually an invalid polygon? (The two mentioned above seem to be about sentinelhub empty datacube issue.)
It does not seem to be a good practice to patch over the sentinel error. As for where to run it, we can discuss it with Dennis and Jurgen. Maybe only the credits will be an issue. Could we point them towards the aggregator such that they can use the terrascope credits for cdse?
@Pratichhya probably has some useful job-ids
This was the one that failed: j-240511b94de2467eb8c0555a17cf49ea
Stack trace indicates that it's also coming from the sentinelhub specific part, so also for multipolygon error, testing on cdse can be relevant:
py4j.protocol.Py4JJavaError: An error occurred while calling o1681.datacube_seq.
: org.locationtech.jts.geom.TopologyException: found non-noded intersection between LINESTRING ( 658020.0000000001 5299279.999999999, 658050.0 5299279.999999999 ) and LINESTRING ( 658040.0000000001 5299270.0, 658040.0000000002 5299280.0 ) [ (658040.0000000002, 5299279.999999999, NaN) ]
at org.locationtech.jts.noding.FastNodingValidator.checkValid(FastNodingValidator.java:140)
at org.locationtech.jts.geomgraph.EdgeNodingValidator.checkValid(EdgeNodingValidator.java:81)
at org.locationtech.jts.geomgraph.EdgeNodingValidator.checkValid(EdgeNodingValidator.java:46)
at org.locationtech.jts.operation.overlay.OverlayOp.computeOverlay(OverlayOp.java:231)
at org.locationtech.jts.operation.overlay.OverlayOp.getResultGeometry(OverlayOp.java:183)
at org.locationtech.jts.operation.overlay.OverlayOp.overlayOp(OverlayOp.java:86)
at org.locationtech.jts.operation.overlay.snap.SnapIfNeededOverlayOp.getResultGeometry(SnapIfNeededOverlayOp.java:75)
at org.locationtech.jts.operation.overlay.snap.SnapIfNeededOverlayOp.overlayOp(SnapIfNeededOverlayOp.java:37)
at org.locationtech.jts.geom.Geometry.union(Geometry.java:1388)
at org.locationtech.jts.operation.union.CascadedPolygonUnion.unionActual(CascadedPolygonUnion.java:386)
at org.locationtech.jts.operation.union.CascadedPolygonUnion.unionOptimized(CascadedPolygonUnion.java:309)
at org.locationtech.jts.operation.union.CascadedPolygonUnion.unionSafe(CascadedPolygonUnion.java:290)
at org.locationtech.jts.operation.union.CascadedPolygonUnion.binaryUnion(CascadedPolygonUnion.java:221)
at org.locationtech.jts.operation.union.CascadedPolygonUnion.binaryUnion(CascadedPolygonUnion.java:226)
at org.locationtech.jts.operation.union.CascadedPolygonUnion.binaryUnion(CascadedPolygonUnion.java:202)
at org.locationtech.jts.operation.union.CascadedPolygonUnion.unionTree(CascadedPolygonUnion.java:148)
at org.locationtech.jts.operation.union.CascadedPolygonUnion.reduceToGeometries(CascadedPolygonUnion.java:261)
at org.locationtech.jts.operation.union.CascadedPolygonUnion.unionTree(CascadedPolygonUnion.java:146)
at org.locationtech.jts.operation.union.CascadedPolygonUnion.reduceToGeometries(CascadedPolygonUnion.java:261)
at org.locationtech.jts.operation.union.CascadedPolygonUnion.unionTree(CascadedPolygonUnion.java:146)
at org.locationtech.jts.operation.union.CascadedPolygonUnion.reduceToGeometries(CascadedPolygonUnion.java:261)
at org.locationtech.jts.operation.union.CascadedPolygonUnion.unionTree(CascadedPolygonUnion.java:146)
at org.locationtech.jts.operation.union.CascadedPolygonUnion.union(CascadedPolygonUnion.java:136)
at org.locationtech.jts.operation.union.CascadedPolygonUnion.union(CascadedPolygonUnion.java:68)
at org.locationtech.jts.operation.union.UnaryUnionOp.union(UnaryUnionOp.java:216)
at org.locationtech.jts.operation.union.UnaryUnionOp.union(UnaryUnionOp.java:113)
at org.locationtech.jts.geom.Geometry.union(Geometry.java:1477)
at org.openeo.geotrellissentinelhub.package$.dissolve(package.scala:147)
at org.openeo.geotrellissentinelhub.package$.simplify(package.scala:150)
at org.openeo.geotrellissentinelhub.PyramidFactory.datacube_seq(PyramidFactory.scala:336)
In the UDF inside CropSAR will apply a negative buffer (-10) on polygons that gives polygons that caues issues.
To avoid this, it would be nice to apply the negative buffering on the original raster using a convolutions. Negative buffering on polygons can be tricky.
Another way would be to apply negative buffering by converting the polygon to a Signed Distance Field first and then take an isoline on threshold -10
In the meanwhile, I'll see if it is possible clean the polygons, or give a more clear error.
@EmileSonneveld concerning negative buffering: I also remember discussions about this under internal ticket MKTP-256 you might want to check that one too for more background/context
@HansVRP Is it possible to change the negative buffering in the UDF to 10.01
or 9.99
?
A negative buffer with 10
(and 20
) cause invalid polygons, as the input mesh exists out of squares that are exactly 10 wide.
In the following screenshot,
Minimal example that triggers the same error:
https://openeo-dev.vito.be/ shows now a more clear error. Soon it will be on openeo.vito.be
/ openeo.cloud
too:
openeo.rest.OpenEoApiError: [400] InvalidGeometry: UDF returned invalid polygons. This could be due to the input or the code. Invalid index(es): [0] (ref: r-240624f31f1b48b0b08d8c259a9cb8dd)
Currently when a multipolygon job is ran, and one or more have an invalid geometry, or no data available, the entire job fails.
It would be better to catch these errors and ensure the user does get the output from the valid polygons.
In addition the user has to get a sign which polygons succeeded and which did not