Open-EO / openeo-geotrellis-extensions

Java/Scala extensions for Geotrellis, for use with OpenEO GeoPySpark backend.
Apache License 2.0
5 stars 3 forks source link

add proj extension metadata to batch job results #72

Closed jdries closed 1 year ago

jdries commented 2 years ago

The proj extension provides relevant information that can help in constructing datacubes, but also tells us the pixel size of an output result: https://github.com/stac-extensions/projection

This looks like information that we have when writing a given output. No need to write properties that are exceptionally difficult for some reason.

jdries commented 1 year ago

the easiest approach seems to be using rasterio/gdal to retrieve metadata from every asset and add it to asset level metadata. This avoids having to add this through custom code in every code path, and reduces the test surface a lot.

This could happen in the batch job specific code path, right after write_assets has generated the files.

jdries commented 1 year ago

Asset metadata is added here: https://github.com/Open-EO/openeo-geopyspark-driver/blob/d7aa3755f3bd5dfbd4d19db9de20d66586b66648/openeogeotrellis/deploy/batch_job.py#L195

So before that, we need to loop over all assets, and if it is a raster asset, do gdalinfo to retrieve proj metadata.

If proj metadata is the same for all assets, it can be set at item level, otherwise it would be set at band level.

Example of retrieving gdal info in python below, this is using gdal 3.6, which already returns some stac metadata. Therefore I propose to first do basic metadata only, and then do gdal upgrade and augment metadata further.

Fields to include: proj:epsg proj:shape (pixel size of the asset) proj:bbox (bounding box in asset CRS)

>>> from osgeo import gdal
>>> info = gdal.Info("/home/driesj/data/ccn/creodias/Cropclass-Classification-E354N250-32.88.tif",options=gdal.InfoOptions(format="json"))
Warning 1: /home/driesj/data/ccn/creodias/Cropclass-Classification-E354N250-32.88.tif: TIFFReadDirectory:Sum of Photometric type-related color channels and ExtraSamples doesn't match SamplesPerPixel. Defining non-color channels as ExtraSamples.
Warning 1: TIFFReadDirectory:Sum of Photometric type-related color channels and ExtraSamples doesn't match SamplesPerPixel. Defining non-color channels as ExtraSamples.
Warning 1: TIFFReadDirectory:Sum of Photometric type-related color channels and ExtraSamples doesn't match SamplesPerPixel. Defining non-color channels as ExtraSamples.
Warning 1: TIFFReadDirectory:Sum of Photometric type-related color channels and ExtraSamples doesn't match SamplesPerPixel. Defining non-color channels as ExtraSamples.
Warning 1: TIFFReadDirectory:Sum of Photometric type-related color channels and ExtraSamples doesn't match SamplesPerPixel. Defining non-color channels as ExtraSamples.
Warning 1: TIFFReadDirectory:Sum of Photometric type-related color channels and ExtraSamples doesn't match SamplesPerPixel. Defining non-color channels as ExtraSamples.
Warning 1: TIFFReadDirectory:Sum of Photometric type-related color channels and ExtraSamples doesn't match SamplesPerPixel. Defining non-color channels as ExtraSamples.
Warning 1: TIFFReadDirectory:Sum of Photometric type-related color channels and ExtraSamples doesn't match SamplesPerPixel. Defining non-color channels as ExtraSamples.
Warning 1: TIFFReadDirectory:Sum of Photometric type-related color channels and ExtraSamples doesn't match SamplesPerPixel. Defining non-color channels as ExtraSamples.
Warning 1: TIFFReadDirectory:Sum of Photometric type-related color channels and ExtraSamples doesn't match SamplesPerPixel. Defining non-color channels as ExtraSamples.
Warning 1: TIFFReadDirectory:Sum of Photometric type-related color channels and ExtraSamples doesn't match SamplesPerPixel. Defining non-color channels as ExtraSamples.
>>> info
{'description': '/home/driesj/data/ccn/creodias/Cropclass-Classification-E354N250-32.88.tif', 'driverShortName': 'GTiff', 'driverLongName': 'GeoTIFF', 'files': ['/home/driesj/data/ccn/creodias/Cropclass-Classification-E354N250-32.88.tif', '/home/driesj/data/ccn/creodias/Cropclass-Classification-E354N250-32.88.tif.aux.xml'], 'size': [2000, 2000], 'coordinateSystem': {'wkt': 'PROJCRS["ETRS89-extended / LAEA Europe",\n    BASEGEOGCRS["ETRS89",\n        ENSEMBLE["European Terrestrial Reference System 1989 ensemble",\n            MEMBER["European Terrestrial Reference Frame 1989"],\n            MEMBER["European Terrestrial Reference Frame 1990"],\n            MEMBER["European Terrestrial Reference Frame 1991"],\n            MEMBER["European Terrestrial Reference Frame 1992"],\n            MEMBER["European Terrestrial Reference Frame 1993"],\n            MEMBER["European Terrestrial Reference Frame 1994"],\n            MEMBER["European Terrestrial Reference Frame 1996"],\n            MEMBER["European Terrestrial Reference Frame 1997"],\n            MEMBER["European Terrestrial Reference Frame 2000"],\n            MEMBER["European Terrestrial Reference Frame 2005"],\n            MEMBER["European Terrestrial Reference Frame 2014"],\n            ELLIPSOID["GRS 1980",6378137,298.257222101,\n                LENGTHUNIT["metre",1]],\n            ENSEMBLEACCURACY[0.1]],\n        PRIMEM["Greenwich",0,\n            ANGLEUNIT["degree",0.0174532925199433]],\n        ID["EPSG",4258]],\n    CONVERSION["Europe Equal Area 2001",\n        METHOD["Lambert Azimuthal Equal Area",\n            ID["EPSG",9820]],\n        PARAMETER["Latitude of natural origin",52,\n            ANGLEUNIT["degree",0.0174532925199433],\n            ID["EPSG",8801]],\n        PARAMETER["Longitude of natural origin",10,\n            ANGLEUNIT["degree",0.0174532925199433],\n            ID["EPSG",8802]],\n        PARAMETER["False easting",4321000,\n            LENGTHUNIT["metre",1],\n            ID["EPSG",8806]],\n        PARAMETER["False northing",3210000,\n            LENGTHUNIT["metre",1],\n            ID["EPSG",8807]]],\n    CS[Cartesian,2],\n        AXIS["northing (Y)",north,\n            ORDER[1],\n            LENGTHUNIT["metre",1]],\n        AXIS["easting (X)",east,\n            ORDER[2],\n            LENGTHUNIT["metre",1]],\n    USAGE[\n        SCOPE["Statistical analysis."],\n        AREA["Europe - European Union (EU) countries and candidates. Europe - onshore and offshore: Albania; Andorra; Austria; Belgium; Bosnia and Herzegovina; Bulgaria; Croatia; Cyprus; Czechia; Denmark; Estonia; Faroe Islands; Finland; France; Germany; Gibraltar; Greece; Hungary; Iceland; Ireland; Italy; Kosovo; Latvia; Liechtenstein; Lithuania; Luxembourg; Malta; Monaco; Montenegro; Netherlands; North Macedonia; Norway including Svalbard and Jan Mayen; Poland; Portugal including Madeira and Azores; Romania; San Marino; Serbia; Slovakia; Slovenia; Spain including Canary Islands; Sweden; Switzerland; Türkiye (Turkey); United Kingdom (UK) including Channel Islands and Isle of Man; Vatican City State."],\n        BBOX[24.6,-35.58,84.73,44.83]],\n    ID["EPSG",3035]]', 'dataAxisToSRSAxisMapping': [2, 1]}, 'geoTransform': [3540000.0, 10.0, 0.0, 2540000.0, 0.0, -10.0], 'metadata': {'': {'AREA_OR_POINT': 'Area', 'PROCESSING_SOFTWARE': '0.6.7a1'}, 'IMAGE_STRUCTURE': {'COMPRESSION': 'DEFLATE', 'INTERLEAVE': 'BAND'}}, 'cornerCoordinates': {'upperLeft': [3540000.0, 2540000.0], 'lowerLeft': [3540000.0, 2520000.0], 'lowerRight': [3560000.0, 2520000.0], 'upperRight': [3560000.0, 2540000.0], 'center': [3550000.0, 2530000.0]}, 'wgs84Extent': {'type': 'Polygon', 'coordinates': [[[-0.0119309, 45.4992529], [0.0206735, 45.3208707], [0.2737012, 45.3446338], [0.2418987, 45.5231053], [-0.0119309, 45.4992529]]]}, 'bands': [{'band': 1, 'block': [256, 256], 'type': 'UInt16', 'colorInterpretation': 'Gray', 'min': 0.0, 'max': 255.0, 'minimum': 0.0, 'maximum': 255.0, 'mean': 169.028, 'stdDev': 116.035, 'noDataValue': 65535.0, 'overviews': [{'size': [1000, 1000]}, {'size': [500, 500]}, {'size': [250, 250]}], 'metadata': {'': {'DESCRIPTION': 'croptype', 'STATISTICS_APPROXIMATE': 'YES', 'STATISTICS_MAXIMUM': '255', 'STATISTICS_MEAN': '169.02752', 'STATISTICS_MINIMUM': '0', 'STATISTICS_STDDEV': '116.03492389212', 'STATISTICS_VALID_PERCENT': '100'}}}, {'band': 2, 'block': [256, 256], 'type': 'UInt16', 'colorInterpretation': 'Undefined', 'min': 0.0, 'max': 100.0, 'minimum': 0.0, 'maximum': 100.0, 'mean': 19.885, 'stdDev': 32.63, 'noDataValue': 65535.0, 'overviews': [{'size': [1000, 1000]}, {'size': [500, 500]}, {'size': [250, 250]}], 'metadata': {'': {'DESCRIPTION': 'probability', 'STATISTICS_APPROXIMATE': 'YES', 'STATISTICS_MAXIMUM': '100', 'STATISTICS_MEAN': '19.884752', 'STATISTICS_MINIMUM': '0', 'STATISTICS_STDDEV': '32.629901101574', 'STATISTICS_VALID_PERCENT': '100'}}}], 'stac': {'proj:shape': [2000, 2000], 'proj:wkt2': 'PROJCRS["ETRS89-extended / LAEA Europe",\n    BASEGEOGCRS["ETRS89",\n        ENSEMBLE["European Terrestrial Reference System 1989 ensemble",\n            MEMBER["European Terrestrial Reference Frame 1989"],\n            MEMBER["European Terrestrial Reference Frame 1990"],\n            MEMBER["European Terrestrial Reference Frame 1991"],\n            MEMBER["European Terrestrial Reference Frame 1992"],\n            MEMBER["European Terrestrial Reference Frame 1993"],\n            MEMBER["European Terrestrial Reference Frame 1994"],\n            MEMBER["European Terrestrial Reference Frame 1996"],\n            MEMBER["European Terrestrial Reference Frame 1997"],\n            MEMBER["European Terrestrial Reference Frame 2000"],\n            MEMBER["European Terrestrial Reference Frame 2005"],\n            MEMBER["European Terrestrial Reference Frame 2014"],\n            ELLIPSOID["GRS 1980",6378137,298.257222101,\n                LENGTHUNIT["metre",1]],\n            ENSEMBLEACCURACY[0.1]],\n        PRIMEM["Greenwich",0,\n            ANGLEUNIT["degree",0.0174532925199433]],\n        ID["EPSG",4258]],\n    CONVERSION["Europe Equal Area 2001",\n        METHOD["Lambert Azimuthal Equal Area",\n            ID["EPSG",9820]],\n        PARAMETER["Latitude of natural origin",52,\n            ANGLEUNIT["degree",0.0174532925199433],\n            ID["EPSG",8801]],\n        PARAMETER["Longitude of natural origin",10,\n            ANGLEUNIT["degree",0.0174532925199433],\n            ID["EPSG",8802]],\n        PARAMETER["False easting",4321000,\n            LENGTHUNIT["metre",1],\n            ID["EPSG",8806]],\n        PARAMETER["False northing",3210000,\n            LENGTHUNIT["metre",1],\n            ID["EPSG",8807]]],\n    CS[Cartesian,2],\n        AXIS["northing (Y)",north,\n            ORDER[1],\n            LENGTHUNIT["metre",1]],\n        AXIS["easting (X)",east,\n            ORDER[2],\n            LENGTHUNIT["metre",1]],\n    USAGE[\n        SCOPE["Statistical analysis."],\n        AREA["Europe - European Union (EU) countries and candidates. Europe - onshore and offshore: Albania; Andorra; Austria; Belgium; Bosnia and Herzegovina; Bulgaria; Croatia; Cyprus; Czechia; Denmark; Estonia; Faroe Islands; Finland; France; Germany; Gibraltar; Greece; Hungary; Iceland; Ireland; Italy; Kosovo; Latvia; Liechtenstein; Lithuania; Luxembourg; Malta; Monaco; Montenegro; Netherlands; North Macedonia; Norway including Svalbard and Jan Mayen; Poland; Portugal including Madeira and Azores; Romania; San Marino; Serbia; Slovakia; Slovenia; Spain including Canary Islands; Sweden; Switzerland; Türkiye (Turkey); United Kingdom (UK) including Channel Islands and Isle of Man; Vatican City State."],\n        BBOX[24.6,-35.58,84.73,44.83]],\n    ID["EPSG",3035]]', 'proj:epsg': 3035, 'proj:projjson': {'$schema': 'https://proj.org/schemas/v0.5/projjson.schema.json', 'type': 'ProjectedCRS', 'name': 'ETRS89-extended / LAEA Europe', 'base_crs': {'name': 'ETRS89', 'datum_ensemble': {'name': 'European Terrestrial Reference System 1989 ensemble', 'members': [{'name': 'European Terrestrial Reference Frame 1989', 'id': {'authority': 'EPSG', 'code': 1178}}, {'name': 'European Terrestrial Reference Frame 1990', 'id': {'authority': 'EPSG', 'code': 1179}}, {'name': 'European Terrestrial Reference Frame 1991', 'id': {'authority': 'EPSG', 'code': 1180}}, {'name': 'European Terrestrial Reference Frame 1992', 'id': {'authority': 'EPSG', 'code': 1181}}, {'name': 'European Terrestrial Reference Frame 1993', 'id': {'authority': 'EPSG', 'code': 1182}}, {'name': 'European Terrestrial Reference Frame 1994', 'id': {'authority': 'EPSG', 'code': 1183}}, {'name': 'European Terrestrial Reference Frame 1996', 'id': {'authority': 'EPSG', 'code': 1184}}, {'name': 'European Terrestrial Reference Frame 1997', 'id': {'authority': 'EPSG', 'code': 1185}}, {'name': 'European Terrestrial Reference Frame 2000', 'id': {'authority': 'EPSG', 'code': 1186}}, {'name': 'European Terrestrial Reference Frame 2005', 'id': {'authority': 'EPSG', 'code': 1204}}, {'name': 'European Terrestrial Reference Frame 2014', 'id': {'authority': 'EPSG', 'code': 1206}}], 'ellipsoid': {'name': 'GRS 1980', 'semi_major_axis': 6378137, 'inverse_flattening': 298.257222101}, 'accuracy': '0.1', 'id': {'authority': 'EPSG', 'code': 6258}}, 'coordinate_system': {'subtype': 'ellipsoidal', 'axis': [{'name': 'Geodetic latitude', 'abbreviation': 'Lat', 'direction': 'north', 'unit': 'degree'}, {'name': 'Geodetic longitude', 'abbreviation': 'Lon', 'direction': 'east', 'unit': 'degree'}]}, 'id': {'authority': 'EPSG', 'code': 4258}}, 'conversion': {'name': 'Europe Equal Area 2001', 'method': {'name': 'Lambert Azimuthal Equal Area', 'id': {'authority': 'EPSG', 'code': 9820}}, 'parameters': [{'name': 'Latitude of natural origin', 'value': 52, 'unit': 'degree', 'id': {'authority': 'EPSG', 'code': 8801}}, {'name': 'Longitude of natural origin', 'value': 10, 'unit': 'degree', 'id': {'authority': 'EPSG', 'code': 8802}}, {'name': 'False easting', 'value': 4321000, 'unit': 'metre', 'id': {'authority': 'EPSG', 'code': 8806}}, {'name': 'False northing', 'value': 3210000, 'unit': 'metre', 'id': {'authority': 'EPSG', 'code': 8807}}]}, 'coordinate_system': {'subtype': 'Cartesian', 'axis': [{'name': 'Northing', 'abbreviation': 'Y', 'direction': 'north', 'unit': 'metre'}, {'name': 'Easting', 'abbreviation': 'X', 'direction': 'east', 'unit': 'metre'}]}, 'scope': 'Statistical analysis.', 'area': 'Europe - European Union (EU) countries and candidates. Europe - onshore and offshore: Albania; Andorra; Austria; Belgium; Bosnia and Herzegovina; Bulgaria; Croatia; Cyprus; Czechia; Denmark; Estonia; Faroe Islands; Finland; France; Germany; Gibraltar; Greece; Hungary; Iceland; Ireland; Italy; Kosovo; Latvia; Liechtenstein; Lithuania; Luxembourg; Malta; Monaco; Montenegro; Netherlands; North Macedonia; Norway including Svalbard and Jan Mayen; Poland; Portugal including Madeira and Azores; Romania; San Marino; Serbia; Slovakia; Slovenia; Spain including Canary Islands; Sweden; Switzerland; Türkiye (Turkey); United Kingdom (UK) including Channel Islands and Isle of Man; Vatican City State.', 'bbox': {'south_latitude': 24.6, 'west_longitude': -35.58, 'north_latitude': 84.73, 'east_longitude': 44.83}, 'id': {'authority': 'EPSG', 'code': 3035}}, 'proj:transform': [3540000.0, 10.0, 0.0, 2540000.0, 0.0, -10.0], 'raster:bands': [{'data_type': 'uint16', 'stats': {'minimum': 0.0, 'maximum': 255.0, 'mean': 169.028, 'stddev': 116.035}, 'nodata': None}, {'data_type': 'uint16', 'stats': {'minimum': 0.0, 'maximum': 100.0, 'mean': 19.885, 'stddev': 32.63}, 'nodata': None}], 'eo:bands': [{'name': 'b1', 'description': 'Gray'}, {'name': 'b2', 'description': 'Undefined'}]}}
>>> info['stac']['proj:epsg']
3035
jdries commented 1 year ago

Doesn't seem to work yet in all cases after deployment, it could be that a relative asset path is tried, but should first be resolved to absolute directory.

j-a0d82f65376e4d468c7335438334f5b0

Could not get projection extension metadata, gdal.Info failed for following asset: croptype_openEO_platform_E324N170_2020-01-01Z.tif. Either file does not exist or else it is probably not a raster. Exception from GDAL: croptype_openEO_platform_E324N170_2020-01-01Z.tif: No such file or directory

jdries commented 1 year ago

Found and fixed something on a branch, still to be checked. It seems that for current STAC version, metadata was not yet compliant and most of it was not passed on at all.

We may want to consider that the batch job should store metadata in proper stac format, using versioning to allow different stac versions. This avoids conversion issues later on.

https://github.com/Open-EO/openeo-python-driver/pull/201

jdries commented 1 year ago

tests seem to indicate that it works so closing this one for now