Open-EO / openeo-geopyspark-driver

OpenEO driver for GeoPySpark (Geotrellis)
Apache License 2.0

Temporal extent is null for vectorcube STAC items #852

Closed VincentVerelst closed 2 weeks ago

VincentVerelst commented 1 month ago

When downloading vector cube assets such as CSV or Parquet from OpenEO, the temporal interval of the associated STAC items is null. Here is an example (run on the Terrascope backend): item-example.json

As a result, libraries like pystac don't consider these valid STAC items and throw errors.

bossie commented 3 weeks ago

From the item spec re: datetime:

null is allowed, but requires start_datetime and end_datetime from common metadata to be set.
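To make the rule quoted above concrete, here is a minimal sketch in plain Python that checks it on an Item represented as a dict. The function name `has_valid_temporal_info` and both sample items are hypothetical and only for illustration; they are not taken from item-example.json, and this is not how pystac itself implements the check.

```python
from typing import Any, Dict


def has_valid_temporal_info(item: Dict[str, Any]) -> bool:
    """Check the STAC Item rule: datetime may be null only if both
    start_datetime and end_datetime are set."""
    props = item.get("properties") or {}
    if props.get("datetime") is not None:
        return True
    # datetime is null (or missing): both range fields must be non-null.
    return (
        props.get("start_datetime") is not None
        and props.get("end_datetime") is not None
    )


# Hypothetical items, made up for this sketch.
broken_item = {"properties": {"datetime": None}}
fixed_item = {
    "properties": {
        "datetime": None,
        "start_datetime": "1970-01-01T00:00:00Z",
        "end_datetime": "2070-01-01T00:00:00Z",
    }
}

print(has_valid_temporal_info(broken_item))
print(has_valid_temporal_info(fixed_item))
```

An Item shaped like `broken_item` is the case reported in this issue: all three temporal fields are null.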

bossie commented 3 weeks ago

@VincentVerelst just wondering: are you able to open this timeseries.parquet file? I'm getting this error:

shaded.org.apache.avro.SchemaParseException: Illegal character in: S1-SIGMA0-VV

It's probably the two dashes it's complaining about.
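For context on the guess above: the Avro specification restricts names to a leading letter or underscore followed by letters, digits, or underscores, so a name containing `-` fails that check. Below is a small sketch of the rule in plain Python. The helper names `is_valid_avro_name` and `sanitize_column_name` are hypothetical, and replacing illegal characters with underscores is just one possible renaming scheme, not something the driver is known to do.

```python
import re

# Avro name rule: starts with [A-Za-z_], then only [A-Za-z0-9_].
AVRO_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_avro_name(name: str) -> bool:
    """Return True if the whole string is a legal Avro name."""
    return AVRO_NAME.fullmatch(name) is not None


def sanitize_column_name(name: str) -> str:
    """Hypothetical helper: replace illegal characters with underscores."""
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", name)
    # Prefix an underscore if the first character is still not legal.
    if not re.match(r"[A-Za-z_]", cleaned):
        cleaned = "_" + cleaned
    return cleaned


column = "S1-SIGMA0-VV"
print(is_valid_avro_name(column))
print(sanitize_column_name(column))
```

Readers that do not map column names onto Avro names would not hit this restriction.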

bossie commented 3 weeks ago

No temporal bounds (e.g. load_stac/temporal_extent) anywhere in the process graph
⬇️
no temporal_extent source constraints in the dry run
⬇️
start_datetime and end_datetime are both None for these results
⬇️
datetime is None as well. 💥

In other words, it is irrelevant whether or not the result is a vector cube.

@VincentVerelst this can be worked around by passing an arbitrarily large temporal_extent to one of those load_stac processes e.g. ["1970-01-01", "2070-01-01"]
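A rough sketch of that workaround, applied to a process graph held as a plain Python dict. It assumes the flat openEO process graph layout (node id mapped to `process_id` and `arguments`) and does not descend into nested callbacks. The helper `add_temporal_extent`, the node ids, and the example URL are hypothetical. For simplicity it patches every `load_stac` node that lacks a `temporal_extent`, although the comment above suggests one is enough.

```python
import copy
from typing import Any, Dict, List

# The arbitrarily large extent suggested in the workaround.
WIDE_EXTENT = ["1970-01-01", "2070-01-01"]


def add_temporal_extent(
    process_graph: Dict[str, Dict[str, Any]],
    extent: List[str] = WIDE_EXTENT,
) -> Dict[str, Dict[str, Any]]:
    """Return a copy in which load_stac nodes without a temporal_extent get one."""
    patched = copy.deepcopy(process_graph)
    for node in patched.values():
        if node.get("process_id") == "load_stac":
            args = node.setdefault("arguments", {})
            if args.get("temporal_extent") is None:
                args["temporal_extent"] = list(extent)
    return patched


# Hypothetical two-node graph; the URL is a placeholder.
graph = {
    "load1": {
        "process_id": "load_stac",
        "arguments": {"url": "https://example.com/collection.json"},
    },
    "save1": {
        "process_id": "save_result",
        "arguments": {"data": {"from_node": "load1"}, "format": "Parquet"},
        "result": True,
    },
}

patched = add_temporal_extent(graph)
print(patched["load1"]["arguments"]["temporal_extent"])
```

The input graph is left untouched, and a `load_stac` node that already has a `temporal_extent` keeps it.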

VincentVerelst commented 3 weeks ago

Regarding the question about opening timeseries.parquet: what are you using to open the file? For me it works with GeoPandas (which itself uses PyArrow), but I think that is because it ignores the schema validation.

bossie commented 2 weeks ago

I was using the Java Parquet CLI; works with GeoPandas though 👍

bossie commented 2 weeks ago

GeoTiff items, for example, provide a datetime property in their asset metadata regardless of the temporal_extent source constraints in the process graph; in the same way, this change will populate the start_datetime and end_datetime properties.

bossie commented 2 weeks ago

Fixed on https://openeo-dev.vito.be; the workaround is no longer necessary @VincentVerelst.