JiaweiZhuang / xESMF

Universal Regridder for Geospatial Data
http://xesmf.readthedocs.io/
MIT License

Can xESMF handle categorical data? #62

Open JSAnandEOS opened 4 years ago

JSAnandEOS commented 4 years ago

I have some land cover data which I would like to regrid to a coarser resolution. Each cell in the new grid should be assigned the most common value among the original grid cells it overlaps. Normally I'd use a GIS program to do this, but the machine I'm on doesn't allow me to install one.

Looking at the website, the closest option I can see is to use "nearest_neighbor" after converting my data to integers. Would this give the result I'm looking for?

JiaweiZhuang commented 4 years ago

The easiest approach is indeed to use nearest_neighbor to regrid integer indices. Currently only float64 is used as the dtype of scipy.sparse.coo_matrix, so you need to manually cast the result back to integer. Allowing an integer dtype also seems like a reasonable feature to add.
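To illustrate the float64 round trip described above, here is a minimal sketch that uses only NumPy and SciPy rather than xESMF itself. The array `fine_classes`, the index array `nearest_fine_index`, and the grid sizes are all hypothetical; in practice the weights would come from the regridder, not be built by hand.

```python
import numpy as np
from scipy import sparse

# Hypothetical 1-D example: six fine cells holding integer land-cover classes.
fine_classes = np.array([3, 3, 7, 7, 7, 1], dtype=np.int32)

# Assumed nearest-neighbor mapping: each of three coarse cells picks one fine cell.
nearest_fine_index = np.array([0, 2, 5])
n_coarse, n_fine = nearest_fine_index.size, fine_classes.size

# Nearest-neighbor weights: a single 1.0 per row, stored as float64.
weights = sparse.coo_matrix(
    (np.ones(n_coarse), (np.arange(n_coarse), nearest_fine_index)),
    shape=(n_coarse, n_fine),
)

# Applying float64 weights to integer data yields a float64 result.
regridded = weights.dot(fine_classes)

# Cast back to the original integer dtype; rounding first guards against
# values such as 6.999999 being truncated to 6.
coarse_classes = np.rint(regridded).astype(fine_classes.dtype)
```

Note that nearest-neighbor regridding copies the value of the single closest source cell, so it does not compute the most common value over all overlapping cells; the two agree only where the nearest cell happens to hold the majority class.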

Alternatively, you can convert the categorical indices to fractions, using something similar to sklearn's OneHotEncoder, and then regrid the fractions (each as a variable) using the conservative method.
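The one-hot idea can be sketched with plain NumPy. This is an illustration under stated assumptions, not xESMF code: the array `fine` and the coarsening `factor` are made up, and a block mean stands in for conservative regridding, which it matches only for equal-area cells on a regular grid with an integer coarsening factor. Taking the class with the largest regridded fraction then gives the most common value per coarse cell.

```python
import numpy as np

# Hypothetical 4x4 land-cover map with three classes.
fine = np.array([
    [1, 1, 2, 2],
    [1, 3, 2, 2],
    [3, 3, 1, 2],
    [3, 3, 1, 1],
])
factor = 2  # assumed coarsening factor: each coarse cell covers 2x2 fine cells

classes = np.unique(fine)

# One-hot encode: one 0/1 layer per class, shape (n_classes, ny, nx).
one_hot = (fine[None, :, :] == classes[:, None, None]).astype(np.float64)

# Stand-in for conservative regridding: the mean over each block gives the
# area fraction of every class inside each coarse cell.
ny, nx = fine.shape
fractions = one_hot.reshape(
    classes.size, ny // factor, factor, nx // factor, factor
).mean(axis=(2, 4))

# Most common class per coarse cell: the layer with the largest fraction.
majority = classes[fractions.argmax(axis=0)]
```

With a real regridder, each one-hot layer would be regridded as its own variable before the `argmax` step. Ties are resolved here in favor of the class that sorts first, because `argmax` returns the first maximum.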