isciences / exactextract

Fast and accurate raster zonal statistics
Apache License 2.0
255 stars 33 forks source link

Strange behavior in computation of 'frac', 'values', and 'coverage' stats with Python binding #133

Closed LL94eGeos closed 3 weeks ago

LL94eGeos commented 1 month ago

When testing the frac, values, and coverage statistics with Python binding of exaxtextract (version 0.2.0.dev0), I got a strange result.

As expected, for each statistic an array is returned, with shape equal to the number of cells falling within a given geometry, but the entries are all the same, and corresponding to the information of frac/value/coverage for the bottom and rightmost pixel.

I attach here the geometry and the raster that I used for test, which are also shown in the following screenshot from QGIS.

img_1

Below you can find the code (rioxarray version: 0.17.0; geopandas version: 1.0.1).

import rioxarray
import geopandas as gpd
import exactextract

raster = rioxarray.open_rasterio('raster.tif')
gdf = gpd.read_file('parcel.geojson')

output = exactextract.exact_extract(raster, gdf, ['frac', 'coverage', 'values'], output = 'geojson')

The output is the following:

[{'type': 'Feature',
  'properties': {'frac': array([0.00092178, 0.00092178, 0.00092178, 0.00092178, 0.00092178,
          0.00092178, 0.00092178, 0.00092178, 0.00092178, 0.00092178,
          0.00092178, 0.00092178, 0.00092178, 0.00092178, 0.00092178,
          0.00092178, 0.00092178, 0.00092178, 0.00092178, 0.00092178,
          0.00092178, 0.00092178, 0.00092178, 0.00092178, 0.00092178,
          0.00092178, 0.00092178, 0.00092178, 0.00092178, 0.00092178,
          0.00092178, 0.00092178, 0.00092178, 0.00092178, 0.00092178,
          0.00092178, 0.00092178, 0.00092178, 0.00092178, 0.00092178,
          0.00092178]),
   'values': array([26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26,
          26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26,
          26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26], dtype=int32),
   'coverage': array([0.02431371, 0.02431371, 0.02431371, 0.02431371, 0.02431371,
          0.02431371, 0.02431371, 0.02431371, 0.02431371, 0.02431371,
          0.02431371, 0.02431371, 0.02431371, 0.02431371, 0.02431371,
          0.02431371, 0.02431371, 0.02431371, 0.02431371, 0.02431371,
          0.02431371, 0.02431371, 0.02431371, 0.02431371, 0.02431371,
          0.02431371, 0.02431371, 0.02431371, 0.02431371, 0.02431371,
          0.02431371, 0.02431371, 0.02431371, 0.02431371, 0.02431371,
          0.02431371, 0.02431371, 0.02431371, 0.02431371, 0.02431371,
          0.02431371, 0.02431371, 0.02431371, 0.02431371, 0.02431371])}}]

As visible from the following image, the value "26" is exactly the value of the bottom and rightmost pixel.

img_2
JakubCha commented 1 month ago

parcel.geojson is missing from the attached zip file

LL94eGeos commented 1 month ago

parcel.geojson is missing from the attached zip file

Fixed.

dbaston commented 1 month ago

The results I get are

>>> output
[{'type': 'Feature', 'properties': {'values': array([ -58,  -17,   -6,  -80,  -17,  139,  179,  -53,  -64, -161,  -78,
        138,  171,  -64,  -85, -176,  -71,  131,  163,  -32, -106, -156,
        -78,   36,  182,  -12,  -14, -113, -135,  -26,   68,  -41,  -38,
        -56,  -47,  -32,  128, -103,  -84, -100,  -24,   -3,  144,  -21,
         26], dtype=int32), 'coverage': array([4.16147448e-02, 1.87041685e-01, 1.23196803e-02, 6.96246147e-01,
       9.85401034e-01, 5.87531209e-01, 8.49397331e-02, 3.23682645e-04,
       5.85081637e-01, 1.00000000e+00, 1.00000000e+00, 1.00000000e+00,
       6.18010402e-01, 4.51863140e-01, 1.00000000e+00, 1.00000000e+00,
       1.00000000e+00, 8.15022111e-01, 4.30761576e-02, 2.98847765e-01,
       9.90083456e-01, 1.00000000e+00, 1.00000000e+00, 9.19579089e-01,
       1.25353336e-01, 1.75752923e-01, 9.48460698e-01, 1.00000000e+00,
       1.00000000e+00, 9.81208801e-01, 2.50557840e-01, 8.79403278e-02,
       8.72784257e-01, 1.00000000e+00, 1.00000000e+00, 1.00000000e+00,
       4.18600798e-01, 6.73679188e-02, 9.23885964e-03, 2.42256448e-01,
       8.93045366e-01, 6.05346203e-01, 8.88071663e-05, 3.57572585e-01,
       2.43137069e-02]), 'frac': array([3.38571387e-02, 3.50263676e-04, 3.79120032e-02, 2.55405275e-03,
       3.79120032e-02, 4.75239609e-03, 3.93126537e-02, 4.67063759e-04,
       7.58240064e-02, 1.63310342e-03, 2.34300123e-02, 1.22714575e-05,
       3.79120032e-02, 2.22744851e-02, 1.35562930e-02, 2.63960861e-02,
       3.48630853e-02, 2.29498872e-02, 3.22023543e-03, 3.79120032e-02,
       3.75360471e-02, 4.44496521e-02, 3.79120032e-02, 3.79120032e-02,
       1.57769834e-03, 3.08991209e-02, 4.92419206e-02, 9.21781333e-04,
       3.36685757e-06, 3.30889995e-02, 3.79120032e-02, 3.79120032e-02,
       6.66314537e-03, 9.18442721e-03, 3.33399399e-03, 3.59580450e-02,
       9.49914962e-03, 1.58699948e-02, 3.79120032e-02, 3.79120032e-02,
       3.71995912e-02])}}]

Can you try installing exactextract from git, or from test.pypi.org, e.g. python3 -m pip install -U --index-url https://test.pypi.org/simple/ exactextract --extra-index-url https://pypi.org/simple/ -v ?

LL94eGeos commented 1 month ago

Ok, thanks, the solution pip3 install -U --index-url https://test.pypi.org/simple/ exactextract --extra-index-url https://pypi.org/simple/ -v worked for me.

When I first tried this command, I faced some permission issues, whereas I managed to install the package via pip install exactextract. Today, I added the argument --target and a different directory than the site-packages after pip3 install and it I managed to install the latest version (0.2.0.dev199). Thanks a lot!

dbaston commented 1 month ago

I've just published 0.2.0.dev199 to the regular PyPI.

dbaston commented 3 weeks ago

I can reproduce this locally on latest master when running with NumPy 2.

The problem seems to be in the following method:

https://github.com/isciences/exactextract/blob/1e3ce2b38a638d15fcb3eb3d11485bc3604998d1/python/src/pybindings/feature_bindings.cpp#L116-L125

when using NumPy 2, the statement x[i] = value.data[i] assigns value.data[i] to all indices in x.