corteva / rioxarray

geospatial xarray extension powered by rasterio
https://corteva.github.io/rioxarray

Writing a large tiff without specifying BIGTIFF="YES" silently fails writing some blocks #709

Open alessioarena opened 11 months ago

alessioarena commented 11 months ago

Code Sample, a copy-pastable example if possible

import dask.array as da
import numpy as np  # needed for np.linspace below
import rioxarray as rio
import xarray as xr

size = (30_000, 60_000)

data = xr.DataArray(
    data = da.random.random(size), 
    coords={'y':np.linspace(0, size[0]*10, size[0]), 'x':np.linspace(0, size[1]*10, size[1])},
    dims=('y', 'x'),
)
data = data.rio.set_crs(3857)

data[::100, ::100].plot()
# you should get something like the image in Expected Output

data.rio.to_raster('test.tif', COMPRESS="DEFLATE")

rio.open_rasterio('test.tif', chunks='auto', parallel=True, lock=False).isel(band=0)[::100, ::100].plot()
# you should get a partial image, like the one in Problem description

Problem description

I came across this issue recently, and it seems to be linked to using COMPRESS="DEFLATE".

When running the code above, saving the image succeeds with no error or warning raised. However, when the image is opened again, it is only partially written. [image: Untitled]

If I perform the exact same operation using rasterio, I instead get this error: https://gis.stackexchange.com/questions/368251/error-occurred-while-writing-dirty-block-from-gdalrasterbandirasterio As the post explains, it is caused by not specifying BIGTIFF="YES".

Expected Output

Either a correctly saved image, or an error being raised. [image: Untitled]

Environment Information

Python version : 3.10.12
Platform : Linux
xarray : 2023.10.1
pandas : 2.1.1
dask : 2023.10.0
numpy : 1.23.4
rasterio : 1.3.9
rioxarray : 0.15.0
geopandas : 0.14.0
shapely : 2.0.2
zarr : 2.16.1
matplotlib : 3.8.0
cartopy : 0.22.0
nbic_utils : 2.0.0
xrutils : 2.0.0

Installation method

pypi

snowman2 commented 11 months ago

This is likely due to using a dask array when writing as it uses a different writing mechanism. Do you run into this issue with a numpy array?

pfuhe1 commented 9 months ago

I have also had this issue. The silent failure seems to be related to using dask.

RichardScottOZ commented 7 months ago

I think I have seen this with rasterio, too... it will just write 4GB worth and the rest is empty.
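
One quick way to check whether an output file was written as classic TIFF (32-bit offsets, hence the 4GB ceiling) or as BigTIFF is to read the first four bytes: a byte-order mark (`II` or `MM`) followed by version 42 for classic TIFF or 43 for BigTIFF. Below is a minimal stdlib-only sketch; the helper name `tiff_flavor` is made up for illustration and is not part of rasterio or rioxarray.

```python
import struct


def tiff_flavor(path):
    """Return 'classic', 'bigtiff', or 'not-tiff' based on the file header."""
    with open(path, "rb") as f:
        header = f.read(4)
    if len(header) < 4:
        return "not-tiff"

    # Bytes 0-1: byte order ("II" = little-endian, "MM" = big-endian).
    byte_order = header[:2]
    if byte_order == b"II":
        fmt = "<H"
    elif byte_order == b"MM":
        fmt = ">H"
    else:
        return "not-tiff"

    # Bytes 2-3: version number (42 = classic TIFF, 43 = BigTIFF).
    (version,) = struct.unpack(fmt, header[2:4])
    return {42: "classic", 43: "bigtiff"}.get(version, "not-tiff")
```

If `tiff_flavor("test.tif")` reports `classic` for a file that should hold more than 4GB of data, the rest of the data most likely never made it to disk.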

snowman2 commented 7 months ago

I am guessing this is related: https://github.com/corteva/rioxarray/issues/220. See also: https://corteva.github.io/rioxarray/latest/examples/dask_read_write.html

snowman2 commented 5 months ago

From: https://gdal.org/drivers/raster/gtiff.html

Default: BIGTIFF=IF_NEEDED Description: "will only create a BigTIFF if it is clearly needed (in the uncompressed case, and image larger than 4GB. So no effect when using a compression)."

In your example, COMPRESS="DEFLATE". So, you need to set BIGTIFF=YES for it to work successfully. For a more explicit error message, the change would likely need to happen in GDAL.
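
As a rough sketch of that workaround: estimate the uncompressed size, and when compression is requested and the estimate exceeds the 4GB classic TIFF limit, pass BIGTIFF="YES" explicitly. The helper `gtiff_options` below is hypothetical (it is not part of rioxarray, rasterio, or GDAL); it only builds keyword arguments to pass to `to_raster` in the same way the original example passes COMPRESS="DEFLATE".

```python
CLASSIC_TIFF_LIMIT = 2**32  # classic TIFF uses 32-bit offsets: 4 GiB


def gtiff_options(shape, itemsize, compress=None):
    """Build GTiff creation options, forcing BigTIFF for large compressed output.

    shape:    array shape, e.g. (rows, cols) or (bands, rows, cols)
    itemsize: bytes per element (8 for float64)
    compress: compression name such as "DEFLATE", or None for no compression
    """
    n_bytes = itemsize
    for dim in shape:
        n_bytes *= dim

    options = {}
    if compress is not None:
        options["COMPRESS"] = compress
        # With compression, the default BIGTIFF=IF_NEEDED has no effect,
        # so use the uncompressed size as an estimate and decide here.
        if n_bytes > CLASSIC_TIFF_LIMIT:
            options["BIGTIFF"] = "YES"
    # Without compression, the GDAL default (IF_NEEDED) is left in place.
    return options


# The array from the example: 30_000 x 60_000 float64 values, about 14.4 GB.
opts = gtiff_options((30_000, 60_000), itemsize=8, compress="DEFLATE")
print(opts)

# Assumed usage with the DataArray from the original example:
# data.rio.to_raster("test.tif", **opts)
```

The same GDAL driver page also lists BIGTIFF=IF_SAFER, a heuristic that creates a BigTIFF when the result might exceed 4GB, which may be an alternative to always forcing YES.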