corteva / geocube

Tool to convert geopandas vector data into rasterized xarray data.
https://corteva.github.io/geocube
BSD 3-Clause "New" or "Revised" License

Strange behaviour with vectorize #165

Closed matteodefelice closed 6 months ago

matteodefelice commented 8 months ago

test.zip

Today I discovered a strange behaviour when applying vectorize to a NetCDF file. As expected, the GeoDataFrame and the DataArray have the same extent. However, when I sort the y coordinate of the NetCDF using sortby, the extents no longer match. Here is an example using the attached test.nc:

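# %%
import geocube.vector
import rioxarray
# %%
# Open the attached file and record the CRS on the DataArray.
df = rioxarray.open_rasterio("test.nc")
df.rio.write_crs('EPSG:4326', inplace=True)
# Sort the y coordinate in ascending order; the mismatch appears only after this step.
df = df.sortby("y")
print(df.rio.bounds())
# %%
# Vectorize the raster and compare its extent with the bounds printed above.
v = geocube.vector.vectorize(df)
print(v.geometry.total_bounds)
# %%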

Output

The extent (the bounds) of the NetCDF is:

(3.358333333333316, 53.550000000000004, 7.233333333333324, 50.75833333333334)

while the vectorized GeoDataFrame is:

[ 3.35833333 47.95        7.23333333 50.75833333]

They should be the same.

Environment Information

geocube v0.5.0

GDAL deps:
         fiona: 1.9.5
   GDAL[fiona]: 3.8.4
      rasterio: 1.3.9
GDAL[rasterio]: 3.8.4

Python deps:
       appdirs: 1.4.4
         click: 8.1.7
     geopandas: 0.14.3
       odc_geo: 0.4.3
     rioxarray: 0.15.1
        pyproj: 3.6.1
        xarray: 2024.2.0

System:
        python: 3.12.2 | packaged by conda-forge | (main, Feb 16 2024, 20:42:31) [MSC v.1937 64 bit (AMD64)]
    executable: C:\Users\matte\miniconda3\envs\geocube\python.exe
       machine: Windows-11-10.0.22631-SP0

Conda environment information (if you installed with conda):

Environment (conda list):

```
# Name                     Version             Build                   Channel
affine                     2.4.0               pyhd8ed1ab_0            conda-forge
appdirs                    1.4.4               pyh9f0ad1d_0            conda-forge
asttokens                  2.4.1               pyhd8ed1ab_0            conda-forge
attrs                      23.2.0              pyh71513ae_0            conda-forge
aws-c-auth                 0.7.16              hec1de76_6              conda-forge
aws-c-cal                  0.6.10              hd481e46_1              conda-forge
aws-c-common               0.9.13              hcfcfb64_0              conda-forge
aws-c-compression          0.2.18              hd481e46_1              conda-forge
aws-c-event-stream         0.4.2               h0f06f08_4              conda-forge
aws-c-http                 0.8.1               hdb5aac5_5              conda-forge
aws-c-io                   0.14.5              h08270f9_1              conda-forge
aws-c-mqtt                 0.10.2              hfea8755_4              conda-forge
aws-c-s3                   0.5.2               h4b2095a_0              conda-forge
aws-c-sdkutils             0.1.15              hd481e46_1              conda-forge
aws-checksums              0.1.18              hd481e46_1              conda-forge
aws-crt-cpp                0.26.2              h8492d2a_7              conda-forge
aws-sdk-cpp                1.11.267            h93f5800_1              conda-forge
azure-core-cpp             1.11.1              h249a519_1              conda-forge
azure-storage-blobs-cpp    12.10.0             h91493d7_1              conda-forge
azure-storage-common-cpp   12.5.0              h91493d7_4              conda-forge
blosc                      1.21.5              hdccc3a2_0              conda-forge
branca                     0.7.1               pyhd8ed1ab_0            conda-forge
brotli                     1.1.0               hcfcfb64_1              conda-forge
brotli-bin                 1.1.0               hcfcfb64_1              conda-forge
brotli-python              1.1.0               py312h53d5487_1         conda-forge
bzip2                      1.0.8               hcfcfb64_5              conda-forge
c-ares                     1.27.0              hcfcfb64_0              conda-forge
ca-certificates            2024.2.2            h56e8100_0              conda-forge
cachetools                 5.3.3               pyhd8ed1ab_0            conda-forge
cairo                      1.18.0              h1fef639_0              conda-forge
certifi                    2024.2.2            pyhd8ed1ab_0            conda-forge
cfitsio                    4.3.1               h9b0cee5_0              conda-forge
charset-normalizer         3.3.2               pyhd8ed1ab_0            conda-forge
click                      8.1.7               win_pyh7428d3b_0        conda-forge
click-plugins              1.1.1               py_0                    conda-forge
cligj                      0.7.2               pyhd8ed1ab_1            conda-forge
colorama                   0.4.6               pyhd8ed1ab_0            conda-forge
comm                       0.2.1               pyhd8ed1ab_0            conda-forge
contourpy                  1.2.0               py312h0d7def4_0         conda-forge
cycler                     0.12.1              pyhd8ed1ab_0            conda-forge
debugpy                    1.8.1               py312h53d5487_0         conda-forge
decorator                  5.1.1               pyhd8ed1ab_0            conda-forge
exceptiongroup             1.2.0               pyhd8ed1ab_2            conda-forge
executing                  2.0.1               pyhd8ed1ab_0            conda-forge
expat                      2.5.0               h63175ca_1              conda-forge
fiona                      1.9.5               py312h95cbb4d_3         conda-forge
folium                     0.16.0              pyhd8ed1ab_0            conda-forge
font-ttf-dejavu-sans-mono  2.37                hab24e00_0              conda-forge
font-ttf-inconsolata       3.000               h77eed37_0              conda-forge
font-ttf-source-code-pro   2.038               h77eed37_0              conda-forge
font-ttf-ubuntu            0.83                h77eed37_1              conda-forge
fontconfig                 2.14.2              hbde0cde_0              conda-forge
fonts-conda-ecosystem      1                   0                       conda-forge
fonts-conda-forge          1                   0                       conda-forge
fonttools                  4.49.0              py312he70551f_0         conda-forge
freetype                   2.12.1              hdaf720e_2              conda-forge
freexl                     2.0.0               h8276f4a_0              conda-forge
gdal                       3.8.4               py312h36e25a9_0         conda-forge
geocube                    0.5.0               pyhd8ed1ab_0            conda-forge
geopandas                  0.14.3              pyhd8ed1ab_0            conda-forge
geopandas-base             0.14.3              pyha770c72_0            conda-forge
geos                       3.12.1              h1537add_0              conda-forge
geotiff                    1.7.1               hbf5ca3a_15             conda-forge
hdf4                       4.2.15              h5557f11_7              conda-forge
hdf5                       1.14.3              nompi_h73e8ff5_100      conda-forge
icu                        73.2                h63175ca_0              conda-forge
idna                       3.6                 pyhd8ed1ab_0            conda-forge
importlib-metadata         7.0.1               pyha770c72_0            conda-forge
importlib_metadata         7.0.1               hd8ed1ab_0              conda-forge
intel-openmp               2024.0.0            h57928b3_49841          conda-forge
ipykernel                  6.29.3              pyha63f2e9_0            conda-forge
ipython                    8.22.2              pyh7428d3b_0            conda-forge
jedi                       0.19.1              pyhd8ed1ab_0            conda-forge
jinja2                     3.1.3               pyhd8ed1ab_0            conda-forge
joblib                     1.3.2               pyhd8ed1ab_0            conda-forge
jupyter_client             8.6.0               pyhd8ed1ab_0            conda-forge
jupyter_core               5.7.1               py312h2e8e312_0         conda-forge
kealib                     1.5.3               hd248416_0              conda-forge
kiwisolver                 1.4.5               py312h0d7def4_1         conda-forge
krb5                       1.21.2              heb0366b_0              conda-forge
lcms2                      2.16                h67d730c_0              conda-forge
lerc                       4.0.0               h63175ca_0              conda-forge
libabseil                  20240116.1          cxx17_h63175ca_2        conda-forge
libaec                     1.1.2               h63175ca_1              conda-forge
libarchive                 3.7.2               h313118b_1              conda-forge
libblas                    3.9.0               21_win64_mkl            conda-forge
libboost-headers           1.84.0              h57928b3_1              conda-forge
libbrotlicommon            1.1.0               hcfcfb64_1              conda-forge
libbrotlidec               1.1.0               hcfcfb64_1              conda-forge
libbrotlienc               1.1.0               hcfcfb64_1              conda-forge
libcblas                   3.9.0               21_win64_mkl            conda-forge
libcrc32c                  1.1.2               h0e60522_0              conda-forge
libcurl                    8.5.0               hd5e4a3a_0              conda-forge
libdeflate                 1.19                hcfcfb64_0              conda-forge
libexpat                   2.5.0               h63175ca_1              conda-forge
libffi                     3.4.2               h8ffe710_5              conda-forge
libgdal                    3.8.4               h7c2897a_0              conda-forge
libglib                    2.78.4              h55e6270_1              conda-forge
libgoogle-cloud            2.21.0              h2b62511_2              conda-forge
libgoogle-cloud-storage    2.21.0              hb581fae_2              conda-forge
libgrpc                    1.61.1              h864d0f4_1              conda-forge
libhwloc                   2.9.3               default_haede6df_1009   conda-forge
libiconv                   1.17                hcfcfb64_2              conda-forge
libjpeg-turbo              3.0.0               hcfcfb64_1              conda-forge
libkml                     1.3.0               haf3e7a6_1018           conda-forge
liblapack                  3.9.0               21_win64_mkl            conda-forge
libnetcdf                  4.9.2               nompi_h07c049d_113      conda-forge
libpng                     1.6.43              h19919ed_0              conda-forge
libpq                      16.2                hdb24f17_0              conda-forge
libprotobuf                4.25.2              h503648d_1              conda-forge
libre2-11                  2023.09.01          hf8d8778_2              conda-forge
librttopo                  1.1.0               h94c4f80_15             conda-forge
libsodium                  1.0.18              h8d14728_1              conda-forge
libspatialindex            1.9.3               h39d44d4_4              conda-forge
libspatialite              5.1.0               hf2f0abc_4              conda-forge
libsqlite                  3.45.1              hcfcfb64_0              conda-forge
libssh2                    1.11.0              h7dfc565_0              conda-forge
libtiff                    4.6.0               h6e2ebb7_2              conda-forge
libwebp-base               1.3.2               hcfcfb64_0              conda-forge
libxcb                     1.15                hcd874cb_0              conda-forge
libxml2                    2.12.5              hc3477c8_0              conda-forge
libzip                     1.10.1              h1d365fa_3              conda-forge
libzlib                    1.2.13              hcfcfb64_5              conda-forge
lz4-c                      1.9.4               hcfcfb64_0              conda-forge
lzo                        2.10                he774522_1000           conda-forge
m2w64-gcc-libgfortran      5.3.0               6                       conda-forge
m2w64-gcc-libs             5.3.0               7                       conda-forge
m2w64-gcc-libs-core        5.3.0               7                       conda-forge
m2w64-gmp                  6.1.0               2                       conda-forge
m2w64-libwinpthread-git    5.0.0.4634.697f757  2                       conda-forge
mapclassify                2.6.1               pyhd8ed1ab_0            conda-forge
markupsafe                 2.1.5               py312he70551f_0         conda-forge
matplotlib-base            3.8.3               py312h26ecaf7_0         conda-forge
matplotlib-inline          0.1.6               pyhd8ed1ab_0            conda-forge
minizip                    4.0.4               h5bed578_0              conda-forge
mkl                        2024.0.0            h66d3029_49657          conda-forge
msys2-conda-epoch          20160418            1                       conda-forge
munkres                    1.1.4               pyh9f0ad1d_0            conda-forge
nest-asyncio               1.6.0               pyhd8ed1ab_0            conda-forge
networkx                   3.2.1               pyhd8ed1ab_0            conda-forge
numpy                      1.26.4              py312h8753938_0         conda-forge
odc-geo                    0.4.3               pyhd8ed1ab_0            conda-forge
openjpeg                   2.5.2               h3d672ee_0              conda-forge
openssl                    3.2.1               hcfcfb64_0              conda-forge
packaging                  23.2                pyhd8ed1ab_0            conda-forge
pandas                     2.2.1               py312h2ab9e98_0         conda-forge
parso                      0.8.3               pyhd8ed1ab_0            conda-forge
pcre2                      10.42               h17e33f8_0              conda-forge
pickleshare                0.7.5               py_1003                 conda-forge
pillow                     10.2.0              py312he768995_0         conda-forge
pip                        24.0                pyhd8ed1ab_0            conda-forge
pixman                     0.43.4              h63175ca_0              conda-forge
platformdirs               4.2.0               pyhd8ed1ab_0            conda-forge
poppler                    24.02.0             hc2f3c52_0              conda-forge
poppler-data               0.4.12              hd8ed1ab_0              conda-forge
postgresql                 16.2                h1beaf6b_0              conda-forge
proj                       9.3.1               he13c7e8_0              conda-forge
prompt-toolkit             3.0.42              pyha770c72_0            conda-forge
psutil                     5.9.8               py312he70551f_0         conda-forge
pthread-stubs              0.4                 hcd874cb_1001           conda-forge
pthreads-win32             2.9.1               hfa6e2cd_3              conda-forge
pure_eval                  0.2.2               pyhd8ed1ab_0            conda-forge
pygments                   2.17.2              pyhd8ed1ab_0            conda-forge
pyparsing                  3.1.1               pyhd8ed1ab_0            conda-forge
pyproj                     3.6.1               py312hc725b1e_5         conda-forge
pysocks                    1.7.1               pyh0701188_6            conda-forge
python                     3.12.2              h2628c8c_0_cpython      conda-forge
python-dateutil            2.9.0               pyhd8ed1ab_0            conda-forge
python-tzdata              2024.1              pyhd8ed1ab_0            conda-forge
python_abi                 3.12                4_cp312                 conda-forge
pytz                       2024.1              pyhd8ed1ab_0            conda-forge
pywin32                    306                 py312h53d5487_2         conda-forge
pyzmq                      25.1.2              py312h1ac6f91_0         conda-forge
rasterio                   1.3.9               py312hc028deb_2         conda-forge
re2                        2023.09.01          hd3b24a8_2              conda-forge
requests                   2.31.0              pyhd8ed1ab_0            conda-forge
rioxarray                  0.15.1              pyhd8ed1ab_0            conda-forge
rtree                      1.2.0               py312h72b5f30_0         conda-forge
scikit-learn               1.4.1.post1         py312hcacafb1_0         conda-forge
scipy                      1.12.0              py312h8753938_2         conda-forge
setuptools                 69.1.1              pyhd8ed1ab_0            conda-forge
shapely                    2.0.3               py312h7d70906_0         conda-forge
six                        1.16.0              pyh6c4a22f_0            conda-forge
snappy                     1.1.10              hfb803bf_0              conda-forge
snuggs                     1.4.7               py_0                    conda-forge
sqlite                     3.45.1              hcfcfb64_0              conda-forge
stack_data                 0.6.2               pyhd8ed1ab_0            conda-forge
tbb                        2021.11.0           h91493d7_1              conda-forge
threadpoolctl              3.3.0               pyhc1e730c_0            conda-forge
tiledb                     2.20.1              h14acc3a_2              conda-forge
tk                         8.6.13              h5226925_1              conda-forge
tornado                    6.4                 py312he70551f_0         conda-forge
traitlets                  5.14.1              pyhd8ed1ab_0            conda-forge
typing_extensions          4.10.0              pyha770c72_0            conda-forge
tzdata                     2024a               h0c530f3_0              conda-forge
ucrt                       10.0.22621.0        h57928b3_0              conda-forge
uriparser                  0.9.7               h1537add_1              conda-forge
urllib3                    2.2.1               pyhd8ed1ab_0            conda-forge
vc                         14.3                hcf57466_18             conda-forge
vc14_runtime               14.38.33130         h82b7239_18             conda-forge
vs2015_runtime             14.38.33130         hcb4865c_18             conda-forge
wcwidth                    0.2.13              pyhd8ed1ab_0            conda-forge
wheel                      0.42.0              pyhd8ed1ab_0            conda-forge
win_inet_pton              1.1.0               pyhd8ed1ab_6            conda-forge
xarray                     2024.2.0            pyhd8ed1ab_0            conda-forge
xerces-c                   3.2.5               h63175ca_0              conda-forge
xorg-libxau                1.0.11              hcd874cb_0              conda-forge
xorg-libxdmcp              1.1.3               hcd874cb_0              conda-forge
xyzservices                2023.10.1           pyhd8ed1ab_0            conda-forge
xz                         5.2.6               h8d14728_0              conda-forge
zeromq                     4.3.5               h63175ca_1              conda-forge
zipp                       3.17.0              pyhd8ed1ab_0            conda-forge
zlib                       1.2.13              hcfcfb64_5              conda-forge
zstd                       1.5.5               h12be248_0              conda-forge
```

snowman2 commented 6 months ago

Deleted last comment as I learned some more after some digging.

Your file has a cached transform:

netcdf test {
dimensions:
    band = 1 ;
    x = 465 ;
    y = 337 ;
variables:
...

    int spatial_ref ;
...
        spatial_ref:GeoTransform = "3.358333333333316 0.008333333333333349 0.0 53.558333333333344 0.0 -0.008333333333333345" ;

Because of this, the bounds and transform are not updated correctly after the sortby operation: the resolution is still pulled from the cached transform rather than from the re-ordered coordinates.
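
The numbers reported in this issue can be reproduced with plain arithmetic. The sketch below is a simplified model, not rioxarray's or geocube's actual code. It assumes that the outer edges are derived from the first and last pixel-centre coordinates plus or minus half the (signed) resolution, and that the vectorized extent starts at the top edge and steps by the signed resolution once per row. The helper names `y_bounds` and `y_extent_from_origin` are made up for illustration.

```python
# Values taken from the GeoTransform and dimensions in the header above.
TOP_EDGE = 53.558333333333344
RES_Y_CACHED = -0.008333333333333345  # negative: rows run north to south
N_ROWS = 337

# Pixel-centre y coordinates as stored in the file (descending).
y_desc = [TOP_EDGE + (i + 0.5) * RES_Y_CACHED for i in range(N_ROWS)]
# After sortby("y") the coordinates are ascending.
y_asc = sorted(y_desc)


def y_bounds(y, res_y):
    """Return the (miny, maxy) slots of the bounds for a signed resolution."""
    top = y[0] - res_y / 2.0
    bottom = y[-1] + res_y / 2.0
    # Assumption: the two values are swapped only when the resolution is positive.
    return (top, bottom) if res_y > 0 else (bottom, top)


def y_extent_from_origin(y, res_y):
    """Return (min, max) y covered by len(y) rows starting at the first edge."""
    origin = y[0] - res_y / 2.0
    far_edge = origin + res_y * len(y)
    return min(origin, far_edge), max(origin, far_edge)


# Sorted coordinates combined with the stale (cached) resolution.
stale_bounds = y_bounds(y_asc, RES_Y_CACHED)
stale_extent = y_extent_from_origin(y_asc, RES_Y_CACHED)

# Sorted coordinates with the resolution recalculated from the coordinates.
res_y_recalc = y_asc[1] - y_asc[0]
fixed_bounds = y_bounds(y_asc, res_y_recalc)
fixed_extent = y_extent_from_origin(y_asc, res_y_recalc)

print("stale bounds:", stale_bounds, "stale extent:", stale_extent)
print("fixed bounds:", fixed_bounds, "fixed extent:", fixed_extent)
```

Under these assumptions the stale case gives roughly 53.55 and 50.758 in the miny/maxy slots and a vectorized extent reaching down to about 47.95, which are the mismatched values from the original report. Once the resolution is recalculated, both results cover about 50.75 to 53.558.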

To address this, you can update the cached transform (or delete it):

df = df.sortby("y")
df.rio.write_transform(df.rio.transform(recalc=True), inplace=True)

And everything works as expected:

df.rio.bounds()
(3.358333333333316, 50.75, 7.233333333333324, 53.558333333333344)
v = geocube.vector.vectorize(df)
v.geometry.total_bounds
[ 3.35833333, 50.75      ,  7.23333333, 53.55833333]
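
As a cross-check, the corrected bounds agree with what the GeoTransform stored in the file implies. The following is a minimal sketch that decodes that string using the GDAL coefficient order (x origin, pixel width, row rotation, y origin, column rotation, pixel height) and the `x`/`y` dimension sizes from the header. `bounds_from_geotransform` is a hypothetical helper, not part of rioxarray or geocube, and it only handles unrotated transforms.

```python
# GDAL-style GeoTransform string copied from the spatial_ref variable above.
GEO_TRANSFORM = (
    "3.358333333333316 0.008333333333333349 0.0 "
    "53.558333333333344 0.0 -0.008333333333333345"
)
WIDTH, HEIGHT = 465, 337  # sizes of the x and y dimensions in the header


def bounds_from_geotransform(gt_string, width, height):
    """Return (minx, miny, maxx, maxy) for an unrotated GDAL GeoTransform."""
    x0, dx, rot_x, y0, rot_y, dy = (float(v) for v in gt_string.split())
    if rot_x != 0.0 or rot_y != 0.0:
        raise ValueError("rotated transforms are not handled in this sketch")
    # Opposite corner of the raster, measured from the origin corner.
    x1 = x0 + dx * width
    y1 = y0 + dy * height
    return min(x0, x1), min(y0, y1), max(x0, x1), max(y0, y1)


bounds = bounds_from_geotransform(GEO_TRANSFORM, WIDTH, HEIGHT)
print(bounds)
```

The printed tuple should match the `df.rio.bounds()` result above to within floating-point rounding.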