Open-EO / openeo-geotrellis-extensions

Java/Scala extensions for Geotrellis, for use with OpenEO GeoPySpark backend.
Apache License 2.0

allow setting tiff scale + offset + custom metadata #317

Open jdries opened 1 month ago

jdries commented 1 month ago

Add a format option to set TIFF metadata. Some of the metadata items map to regular TIFF tags; others will have to be encoded as GDAL band metadata.

The options in GeoTrellis are limited, but very basic TIFF tools are available on Linux, e.g. tiffset (https://linux.die.net/man/1/tiffset). The idea is to set metadata tags in place, without touching the rest of the file, so that a full rewrite is avoided. Ideally this is done right after GeoTrellis writes the TIFF.

The format option will have to be passed through via save_result.
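As a rough illustration of what passing this through save_result could look like, the sketch below builds a `save_result` process-graph node by hand. The `process_id`, `data`, `format` and `options` fields follow the openEO process definition; the option names `file_metadata` and `bands_metadata` are purely hypothetical placeholders, since this issue has not settled on names yet.

```python
import json


def save_result_node(data_node_id, file_metadata=None, bands_metadata=None):
    """Build an openEO save_result node for GeoTIFF output.

    The option keys "file_metadata" and "bands_metadata" are hypothetical;
    they only illustrate where user-provided values could travel.
    """
    options = {}
    if file_metadata:
        # Dataset-level key/value pairs (copyright, license, ...).
        options["file_metadata"] = dict(file_metadata)
    if bands_metadata:
        # Per-band values such as scale and offset, keyed by band name.
        options["bands_metadata"] = {
            band: dict(values) for band, values in bands_metadata.items()
        }
    return {
        "process_id": "save_result",
        "arguments": {
            "data": {"from_node": data_node_id},
            "format": "GTiff",
            "options": options,
        },
        "result": True,
    }


node = save_result_node(
    "loadcollection1",
    file_metadata={"product_tile": "29TNE"},
    bands_metadata={"B02": {"SCALE": 1.23, "OFFSET": 4.56}},
)
print(json.dumps(node, indent=2))
```

Keeping dataset-level and per-band values in separate keys mirrors the split between regular tags and band metadata described above, but other layouts would work equally well.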

VictorVerhaert commented 1 month ago

example of custom metadata: https://github.com/VITO-RS-Vegetation/lcfm-production/issues/18

jdries commented 1 month ago

For tiffset, we would need to add libtiff-tools to the container.
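A minimal sketch of how the backend could call tiffset once it is present in the container, assuming the call is made from Python after the file has been written. The helper names are hypothetical; the only external pieces used are the `tiffset -s <tag> <value> <file>` invocation and tag 42112, the TIFF tag in which GDAL keeps its metadata XML.

```python
import shutil
import subprocess

GDAL_METADATA_TAG = 42112  # TIFF tag that holds GDAL's metadata XML


def tiffset_command(tiff_path, tag, value):
    """Return the argv list for `tiffset -s <tag> <value> <file>`."""
    # Passing a list (no shell) avoids quoting problems with XML values.
    return ["tiffset", "-s", str(tag), value, str(tiff_path)]


def set_tiff_tag(tiff_path, tag, value):
    """Set a single tag in place; fail loudly if tiffset is unavailable."""
    if shutil.which("tiffset") is None:
        raise RuntimeError("tiffset not found; is libtiff-tools installed?")
    subprocess.run(tiffset_command(tiff_path, tag, value), check=True)


cmd = tiffset_command("out.tif", GDAL_METADATA_TAG, "<GDALMetadata></GDALMetadata>")
print(cmd)
```

The explicit `shutil.which` check gives a clearer error than a failed subprocess call when the package is missing from the image.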

JorisCod commented 2 weeks ago

The current metadata, visible through gdalinfo, is:

    Metadata:
      PROCESSING_SOFTWARE=0.39.0a1
      AREA_OR_POINT=Area

both can stay, but what we are looking for is:

    Metadata:
      AREA_OR_POINT=Area
      bands=['s2-B02-p10', 's2-B02-p25']
      copyright=LCFM project 2020 / Contains modified Copernicus Sentinel data (2020) processed by LCFM consortium
      creation_time=2024-07-19 00:35:04.693848
      license=CC-BY 4.0 - https://creativecommons.org/licenses/by/4.0/
      product_crs=EPSG:32629
      product_grid=Sentinel-2 UTM tiling grid
      product_tile=29TNE
      product_type=LSF monthly median composite for band B02
      reference=TODO
      time_end=2020-02-29T23:59:59Z
      time_start=2020-02-01T00:00:00Z
      title=LCFM Monthly Land Surface Features (LSF-MONTHLY) product at 10m resolution for year 2020
      version=v002-satio

So this would be something quite flexible. We are setting the metadata through rasterio (python) and passing a dictionary:

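        # Excerpt; needs `import rasterio`. metadata (dict), bands_names (list)
        # and final_vrt_fn come from the surrounding code, which is not shown.
        if metadata:
            with rasterio.open(final_vrt_fn, 'r+') as dst:
                # Write dataset-level tags.
                dst.update_tags(**metadata)

        if bands_names:
            with rasterio.open(final_vrt_fn, 'r+') as dst:
                for i, b in enumerate(bands_names):
                    # rasterio band indexes are 1-based.
                    dst.set_band_description(i + 1, b)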

Additionally, there is a difference in how we set the band description and how OpenEO does it.

OpenEO:

    Band 2 Block=64x64 Type=Int16, ColorInterp=Undefined
      NoData Value=-32768
      Overviews: 1033x1542, 517x771, 259x386, 130x193, 65x97, 33x49
      Metadata:
        DESCRIPTION=B02_P25

We:

    Band 17 Block=1024x1024 Type=UInt16, ColorInterp=Undefined
      Description = s2-B08-p25
      NoData Value=65535
      Offset: 0.0031999999191612, Scale:8.16687767161827e-06

The description in OpenEO is at metadata level, which means the band names are not displayed in the symbology in QGIS.

Relatedly, as you see in the output, there's an offset and scale on every band. In principle, the offset and scale can be different per band.
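To make the per-band case concrete, here is a small sketch that generates a `<GDALMetadata>` XML document of the kind passed to tiffset later in this thread, with a separate scale and offset for every band. The function name and input layout are hypothetical. The zero-based `sample` index and the `role="scale"` / `role="offset"` attributes mirror that later example; whether GDAL picks everything up as intended should be checked with gdalinfo.

```python
import xml.etree.ElementTree as ET

# Items that carry a special role attribute, as in the tiffset example.
ROLES = {"SCALE": "scale", "OFFSET": "offset"}


def gdal_metadata_xml(dataset_tags, band_tags):
    """Serialize dataset-level and per-band items as GDALMetadata XML.

    dataset_tags: dict of name -> value for the whole file.
    band_tags: list with one dict per band, in band order.
    """
    root = ET.Element("GDALMetadata")
    for name, value in dataset_tags.items():
        item = ET.SubElement(root, "Item", {"name": name})
        item.text = str(value)
    for sample, tags in enumerate(band_tags):  # sample index is zero-based
        for name, value in tags.items():
            attrs = {"name": name, "sample": str(sample)}
            if name in ROLES:
                attrs["role"] = ROLES[name]
            item = ET.SubElement(root, "Item", attrs)
            item.text = str(value)
    return ET.tostring(root, encoding="unicode")


xml_text = gdal_metadata_xml(
    {"product_tile": "29TNE"},
    [
        {"DESCRIPTION": "s2-B02-p10", "SCALE": 8.16687767161827e-06, "OFFSET": 0.0032},
        {"DESCRIPTION": "s2-B02-p25", "SCALE": 1.0, "OFFSET": 0.0},
    ],
)
print(xml_text)
```

Because each band has its own dict, differing scale and offset values per band need no special handling.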

JorisCod commented 1 week ago

Just checking whether this issue is already planned? If it's too extensive, can the scaling and offset be done first and the metadata in a separate issue?

JorisCod commented 1 week ago

We set the scale and offset through gdal_translate: https://gdal.org/en/latest/programs/gdal_translate.html#cmdoption-gdal_translate-a_scale but there are likely other options.

bossie commented 22 hours ago

Regardless of how we put them in the GeoTiff, will the user also provide values for scale/offset in the format options?

bossie commented 22 hours ago

I could get this to work:

tiffset -s 42112 '<GDALMetadata>
  <Item name="PROCESSING_SOFTWARE">0.40.1a1</Item>
  <Item name="DESCRIPTION" sample="0">red</Item>
  <Item name="SCALE" sample="0" role="scale">1.23</Item>
  <Item name="OFFSET" sample="0" role="offset">4.56</Item>
</GDALMetadata>' test_load_stac_datacube_parameters.tif

where: