xarray-contrib / xvec

Vector data cubes for Xarray
https://xvec.readthedocs.io
MIT License

ENH: support exactextract as a method in zonal_stats #68

Closed masawdah closed 2 months ago

masawdah commented 2 months ago

This is an initial implementation for exactextract.

exactextract expects the statistical input to be an object of a Python class named 'operation' defined within the package. It utilizes a built-in method to construct these objects if the input is a string or callable. However, this built-in method does not support all callables (such as np.mean, ...), and strings like 'quantile' should follow this pattern: 'quantile(q=0.33)'. I've added a check function for the statistical input to use when the method is exactextract, but I'm unsure if this is the best option or if it should simply be added to the documentation.
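As a rough illustration of the kind of check described above, here is a minimal sketch of a helper that turns a bare 'quantile' string into the 'quantile(q=...)' form. The name `normalize_stat` and its `q` argument are hypothetical and are not part of xvec or exactextract; anything that is not the bare string 'quantile' is passed through unchanged so that exactextract can decide whether it accepts it.

```python
def normalize_stat(stat, q=None):
    """Hypothetical helper: format a stat name the way exactextract expects.

    Only the bare string "quantile" is rewritten; every other input
    (other strings, callables, ready-made objects) is returned untouched.
    """
    if stat == "quantile":
        if q is None:
            raise ValueError("'quantile' needs a value, e.g. 'quantile(q=0.33)'")
        if not 0 <= q <= 1:
            raise ValueError("q must lie between 0 and 1")
        return f"quantile(q={q})"
    return stat


# Example: build the list of stats to hand over to the exactextract method.
stats = [normalize_stat("mean"), normalize_stat("quantile", q=0.33)]
```

Whether such rewriting belongs in the library or only in the documentation is exactly the open question raised in this comment.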

martinfleis commented 2 months ago

Thanks for this! Regarding the stats, I think it is fine to document the differences when using exactextract. Later we can try to parse it ourselves but I am fine with proper documentation for now.

martinfleis commented 2 months ago

Can you include tests and that documentation?

masawdah commented 2 months ago

Yes, I'll take care of that.

masawdah commented 2 months ago

I've just added a test for exactextract. Since exactextract performs statistics based on pixel fractions, the results differ from those of rasterization and iteration. I've maintained the same test functions and data, but added an if-statement specifically for the exactextract method. This introduces only one condition and does not significantly increase the code's complexity.

To avoid complexity, we could create a separate test function for exactextract and use different data. However, I believe this isn't necessary given that we're only adding one condition.
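To illustrate why fraction-weighted statistics differ from the other methods, here is a toy sketch in plain Python. It assumes unit-sized cells and an axis-aligned rectangle as the zone, and it uses a "cell centre inside the zone" rule as a stand-in for rasterization; neither function is taken from xvec or exactextract.

```python
def overlap(a0, a1, b0, b1):
    """Length of the overlap between intervals [a0, a1] and [b0, b1]."""
    return max(0.0, min(a1, b1) - max(a0, b0))


def zonal_means(grid, xmin, ymin, xmax, ymax):
    """Return (fraction-weighted mean, centre-based mean) for a rectangle.

    grid[row][col] is a unit cell covering x in [col, col + 1] and
    y in [row, row + 1]. The rectangle must overlap at least one cell.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    centre_values = []
    for row, values in enumerate(grid):
        for col, value in enumerate(values):
            # Share of this cell's area that lies inside the rectangle.
            frac = overlap(col, col + 1, xmin, xmax) * overlap(row, row + 1, ymin, ymax)
            weighted_sum += frac * value
            weight_total += frac
            # Centre-based rule: the cell counts fully or not at all.
            if xmin <= col + 0.5 <= xmax and ymin <= row + 0.5 <= ymax:
                centre_values.append(value)
    centre_mean = sum(centre_values) / len(centre_values) if centre_values else float("nan")
    return weighted_sum / weight_total, centre_mean


grid = [[1.0, 2.0], [3.0, 4.0]]
weighted, centred = zonal_means(grid, xmin=0.0, ymin=0.0, xmax=1.25, ymax=1.0)
```

Here the rectangle covers the first cell entirely and a quarter of the second one, so the weighted mean is pulled towards 2.0 while the centre-based mean only sees the first cell. This is the kind of difference the extra condition in the tests accounts for.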

The docs will follow soon :)

martinfleis commented 2 months ago

I am fine with the way you did the tests, keep it.

masawdah commented 2 months ago

Now it should be ready :)

martinfleis commented 2 months ago

I pushed my suggestions directly now.

I am wondering - I know that exactextract only supports DataArray but shall we make an effort and return a Dataset if the input is a Dataset? As shown in the notebook, we now return DataArray, which is inconsistent with the other methods. Can we try to convert it so the input type matches the output type?

masawdah commented 2 months ago

Yes, it is better to match the types. I'll add a part to handle it.

masawdah commented 2 months ago

I reengineered the code to match the input/output types and modified the pytests accordingly.
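For readers following along, matching the output type to the input type could look roughly like the sketch below. It is not the code from this PR: `aggregate_like_input` and `reducer` are hypothetical names, and the reducer is a plain mean standing in for the exactextract call. Only `Dataset.to_array` and `DataArray.to_dataset` are real xarray methods.

```python
import numpy as np
import xarray as xr


def aggregate_like_input(obj, reducer):
    """Hypothetical wrapper: run a DataArray-only reducer, return the input type."""
    is_dataset = isinstance(obj, xr.Dataset)
    # Stack the data variables along a new "variable" dimension if needed.
    arr = obj.to_array(dim="variable") if is_dataset else obj
    result = reducer(arr)
    if is_dataset:
        # Split the "variable" dimension back into separate data variables.
        result = result.to_dataset(dim="variable")
    return result


ds = xr.Dataset(
    {
        "a": (("y", "x"), np.arange(4.0).reshape(2, 2)),
        "b": (("y", "x"), np.ones((2, 2))),
    }
)
out_ds = aggregate_like_input(ds, lambda a: a.mean(dim=("y", "x")))
out_da = aggregate_like_input(ds["a"], lambda a: a.mean(dim=("y", "x")))
```

With this shape, a Dataset goes in and a Dataset comes out, while a DataArray stays a DataArray.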

martinfleis commented 2 months ago

This is great! One final request, can we pass through attributes as we do with rasterio-based zonal_stats? At that point I think we get the parity and I'd be happy to merge.

Thanks a ton for working on this!
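A minimal sketch of what passing attributes through could mean, assuming the statistics come back from the external library as a plain array: the values are wrapped in a new DataArray that carries the name and `attrs` of the source. The helper name `with_source_attrs` and the "geometry" dimension name are assumptions made for this illustration, not xvec's implementation.

```python
import numpy as np
import xarray as xr


def with_source_attrs(values, source, dim="geometry"):
    """Hypothetical helper: wrap raw stats and copy name/attrs from the source."""
    result = xr.DataArray(np.asarray(values), dims=[dim], name=source.name)
    # Copy rather than share the dict so later edits do not leak back.
    result.attrs = dict(source.attrs)
    return result


source = xr.DataArray(
    np.zeros((2, 2)),
    dims=("y", "x"),
    name="temperature",
    attrs={"units": "K"},
)
stats = with_source_attrs([1.0, 2.0], source)
```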