opendatacube / datacube-wps

Web Processing Service running on opendatacube
Other
7 stars 3 forks source link

WIT produces empty parquet files #112

Closed benjimin closed 3 years ago

benjimin commented 3 years ago

WIT appears to be outputting many parquet files that contain only column headers, and no data rows.

Appears to be related to geopolygon transformations. Possibly a reprojection (CRS84 vs EPSG:4326 vs EPSG:3577) bug in ODC core 1.8.3 virtual products?

One example:

HTTP request (click to expand): lon/lat truncate to 153,-27

``` WIT geometry {"type":"FeatureCollection","features":[{"type":"Feature","geometry":{"type":"Polygon","coordinates":[[[153.11419609243018,-27.943968459307015],[153.11418026037686,-27.944067533788655],[153.11408534501948,-27.944406032766558],[153.11397845480127,-27.94468108683979],[153.1138239109341,-27.944940319235872],[153.11370525374073,-27.945352865469335],[153.113336165801,-27.945437775060576],[153.1132236059476,-27.945463789331665],[153.11317980662525,-27.94550832416196],[153.1129655100924,-27.945577724857877],[153.11282442246426,-27.945766041125466],[153.11239593946235,-27.94599905368535],[153.1119911634455,-27.946126290657798],[153.1119317826017,-27.946284974974514],[153.11192010168477,-27.946507071635224],[153.11149156061563,-27.946687204997378],[153.11102704273299,-27.946624126735532],[153.11053846910764,-27.946328401696828],[153.11044309876198,-27.94623329836489],[153.11047870675142,-27.946116937244014],[153.11057374420713,-27.945884197673944],[153.1109547231141,-27.945778136373114],[153.1113952028862,-27.945629721464883],[153.11160958731736,-27.945650697526997],[153.1118240138829,-27.945713978434124],[153.11206203879928,-27.94556572478906],[153.1126691552966,-27.945332565960793],[153.11274055431346,-27.945279629112335],[153.11301422759016,-27.9450573169761],[153.11326380724807,-27.944570632039923],[153.11331075758784,-27.94391490425093],[153.11315463293863,-27.943491047733573],[153.11321614717187,-27.943519905290685],[153.11337635932207,-27.943595059907967],[153.11352000937842,-27.943922192497535],[153.113470915394,-27.94412804096992],[153.1134180931629,-27.944349522425618],[153.1132241428111,-27.944867829941973],[153.11307092097988,-27.945177027266926],[153.11315295435244,-27.94531331521011],[153.11333720815253,-27.94531316294408],[153.11361354298518,-27.945267483690223],[153.1137974353714,-27.94492189963588],[153.11396462034557,-27.944526169823927],[153.11397956430238,-27.9444907969641],[153.11418545012623,-27.94400345630012],[153.11419609243018,-27.943968459307015]]]}}]} start {"type":"object","properties":{"timestamp":{"type":"string","format":"date-time","date-time":"2010-01-01T00:00"}}} end {"type":"object","properties":{"timestamp":{"type":"string","format":"date-time","date-time":"2011-01-01T00:00"}}} ```

Logs of WIT.process_data(data, parameters) seem to show the method call is invoked with the expected vector input, supplied in parameters['feature']:

Geometry({'type': 'Polygon', 'coordinates': (((153.11419609243018, -27.943968459307015), (153.11418026037686, -27.944067533788655), (153.11408534501948, -27.944406032766558), (153.11397845480127, -27.94468108683979), (153.1138239109341, -27.944940319235872), (153.11370525374073, -27.945352865469335), (153.113336165801, -27.945437775060576), (153.1132236059476, -27.945463789331665), (153.11317980662525, -27.94550832416196), (153.1129655100924, -27.945577724857877), (153.11282442246426, -27.945766041125466), (153.11239593946235, -27.94599905368535), (153.1119911634455, -27.946126290657798), (153.1119317826017, -27.946284974974514), (153.11192010168477, -27.946507071635224), (153.11149156061563, -27.946687204997378), (153.11102704273299, -27.946624126735532), (153.11053846910764, -27.946328401696828), (153.11044309876198, -27.94623329836489), (153.11047870675142, -27.946116937244014), (153.11057374420713, -27.945884197673944), (153.1109547231141, -27.945778136373114), (153.1113952028862, -27.945629721464883), (153.11160958731736, -27.945650697526997), (153.1118240138829, -27.945713978434124), (153.11206203879928, -27.94556572478906), (153.1126691552966, -27.945332565960793), (153.11274055431346, -27.945279629112335), (153.11301422759016, -27.9450573169761), (153.11326380724807, -27.944570632039923), (153.11331075758784, -27.94391490425093), (153.11315463293863, -27.943491047733573), (153.11321614717187, -27.943519905290685), (153.11337635932207, -27.943595059907967), (153.11352000937842, -27.943922192497535), (153.113470915394, -27.94412804096992), (153.1134180931629, -27.944349522425618), (153.1132241428111, -27.944867829941973), (153.11307092097988, -27.945177027266926), (153.11315295435244, -27.94531331521011), (153.11333720815253, -27.94531316294408), (153.11361354298518, -27.945267483690223), (153.1137974353714, -27.94492189963588), (153.11396462034557, -27.944526169823927), (153.11397956430238, -27.9444907969641), (153.11418545012623, -27.94400345630012), (153.11419609243018, -27.943968459307015)),)}, CRS('urn:ogc:def:crs:OGC:1.3:CRS84'))

But WIT.process_data has simultaneously been supplied an xarray of raster data with the wrong spatial extent, at least according to data.geobox: lon/lat truncate to 168,-28

GeoBox(Geometry({'type': 'Polygon', 'coordinates': (((168.58911176419144, -28.039700908122082), (168.58957458617093, -28.042830673835773), (168.59369002835174, -28.04235150968161), (168.59322709650263, -28.03922180675456), (168.58911176419144, -28.039700908122082)),)}, CRS('EPSG:4326')))

Consequently, after invoking a (via wrapper) rasterio.features.geometry_mask (to overlay the vector geometry feature onto the raster defined by the xarray geobox), WIT.process_data finds no pixels (and logs the polygon area as zero), and ultimately returns an empty pandas.DataFrame (with column headings but no index rows).

h/t @emmaai

benjimin commented 3 years ago

This appears to be resolved by 23c085e2ad86b (alas at the expense of #111). Ideally would still narrow this down (and feed into a regression test, to ensure compatibility with the next stable release of datacube-core).