cogeotiff / rio-cogeo

Cloud Optimized GeoTIFF creation and validation plugin for rasterio
https://cogeotiff.github.io/rio-cogeo/
BSD 3-Clause "New" or "Revised" License
308 stars, 42 forks

GeoTIFF with invalidated optimizations shows up as valid #258

Closed: mplough-kobold closed this issue 1 year ago

mplough-kobold commented 1 year ago

Steps to reproduce with rio-cogeo==3.5.1:

Use the files in input and output data.zip or get/generate them as follows.

Obtain lcc-datum.tif from https://download.osgeo.org/geotiff/samples/made_up/.

Generate a slightly invalid COG lcc-datum-cog.tif as follows. Note that while the COG driver does not support the creation option TILED, this does run to completion and generates a GeoTIFF.

Package versions: dask==2023.4.0 distributed==2023.4.0 rioxarray==0.14.0 xarray==2023.3.0

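import rioxarray  # registers the .rio accessor and the rasterio backend for xarray
import xarray as xr
from dask.distributed import Client, LocalCluster, Lock

# Start a local Dask cluster so the raster is written from multiple workers.
cluster = LocalCluster(n_workers=4)
client = Client(cluster)

# Open the source GeoTIFF lazily in 128x128 chunks.
chunks = {"x": 128, "y": 128}
datum = xr.open_dataarray("lcc-datum.tif", chunks=chunks)
# Write with the COG driver; tiled=True is not a COG creation option, but the write still completes.
datum.rio.to_raster("lcc-datum-cog.tif", driver="COG", tiled=True, lock=Lock("rio", client=client))

client.close()
cluster.close()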

Run rio cogeo validate lcc-datum-cog.tif and view the output:

$ rio cogeo validate lcc-datum-cog.tif
WARNING:rasterio._env:CPLE_AppDefined in lcc-datum-cog.tif: This file used to have optimizations in its layout, but those have been, at least partly, invalidated by later changes
WARNING:rasterio._env:CPLE_AppDefined in /absolute/path/to/lcc-datum-cog.tif: This file used to have optimizations in its layout, but those have been, at least partly, invalidated by later changes
/absolute/path/to/lcc-datum-cog.tif is a valid cloud optimized GeoTIFF

Is this file a valid COG? Based on the warnings it looks like there are issues in the file's layout that rio-cogeo does not detect.
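One way to avoid missing such warnings is to capture them programmatically and treat them as a failure. The following is a minimal sketch using only the standard library. It assumes the warnings arrive on the `rasterio._env` logger, as the `WARNING:rasterio._env:` prefix in the output above suggests. `WarningCollector`, `layout_warnings`, and `INVALIDATED_MARKER` are hypothetical names introduced for illustration and are not part of rio-cogeo or rasterio.

```python
import logging

# Substring taken from the GDAL warning quoted in the output above.
INVALIDATED_MARKER = "optimizations in its layout"


class WarningCollector(logging.Handler):
    """Logging handler that stores WARNING-and-above messages in a list."""

    def __init__(self):
        super().__init__(level=logging.WARNING)
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


def layout_warnings(run, logger_name="rasterio._env"):
    """Call run() and return the layout warnings logged while it executed.

    logger_name is an assumption based on the log prefix shown in the report.
    """
    logger = logging.getLogger(logger_name)
    collector = WarningCollector()
    logger.addHandler(collector)
    try:
        run()
    finally:
        # Always detach the handler so repeated calls do not accumulate handlers.
        logger.removeHandler(collector)
    return [m for m in collector.messages if INVALIDATED_MARKER in m]
```

With rio-cogeo installed, `run` could be something like `lambda: cog_validate("lcc-datum-cog.tif")` with `cog_validate` imported from `rio_cogeo.cogeo`. A non-empty result would then flag the file even though validation itself reports it as valid. Whether the Python API emits the same warnings as the CLI is an assumption to verify.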

vincentsarago commented 1 year ago

@mplough-kobold COG is a CreateCopy-only driver (https://gdal.org/drivers/raster/cog.html#driver-capabilities). When you use rio.to_raster, it creates the COG in Create mode, which in theory is not the best way to create a COG, but rasterio allows it.

If the validate step passes, I believe the file is fine, but to be 💯 sure I wouldn't use the to_raster method to create a COG. Sadly, rioxarray does not expose the copy method from rasterio.