pysal / tobler

Spatial interpolation, Dasymetric Mapping, & Change of Support
https://pysal.org/tobler
BSD 3-Clause "New" or "Revised" License

ENH: support categorical variables in area_interpolate #135

Closed martinfleis closed 3 years ago

martinfleis commented 3 years ago

Hi,

We needed to transfer categorical data (land cover), so I extended the existing area_interpolate to support categorical variables. For each target polygon, it returns the share of the polygon's area covered by each unique category. See the example below.

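import geopandas
from libpysal.examples import load_example
from tobler.area_weighted import area_interpolate

# Load the two Sacramento example datasets (source and target polygons)
sac1 = load_example("Sacramento1")
sac2 = load_example("Sacramento2")
sac1 = geopandas.read_file(sac1.get_path("sacramentot2.shp"))
sac2 = geopandas.read_file(sac2.get_path("SacramentoMSA2.shp"))

# Assign a dummy category to each source polygon by cycling through the list
categories = ["cat", "dog", "donkey", "wombat", "capybara"]
sac1["animal"] = (categories * ((len(sac1) // len(categories)) + 1))[
    : len(sac1)
]

# Interpolate the categorical column onto the target polygons
res = area_interpolate(
    source_df=sac1,
    target_df=sac2,
    categorical_variables=["animal"],
)
print(res.head())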

   animal_cat  animal_dog  animal_donkey  animal_wombat  animal_capybara  \
0    0.431909    0.000000       0.000000       0.000062         0.000630   
1    0.069708    0.000000       0.000000       0.000000         0.000000   
2    0.630183    0.000000       0.000000       0.000000         0.354106   
3    0.462047    0.378258       0.158367       0.000597         0.000000   
4    0.992120    0.000000       0.000000       0.000000         0.006820   

                                            geometry  
0  POLYGON ((-120.14554 39.22748, -120.14743 39.2...  
1  POLYGON ((-120.37896 39.31638, -120.37917 39.3...  
2  POLYGON ((-120.60887 39.31545, -120.58559 39.3...  
3  POLYGON ((-120.03947 39.23825, -120.03950 39.2...  
4  POLYGON ((-120.65622 39.30815, -120.65456 39.3... 

@darribas and I thought it would be good to add it to tobler. Since it uses a lot of existing machinery, I just added a keyword to area_interpolate but can turn it into an independent function if that is preferable.
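
To make the idea of a per-category area ratio concrete, here is a minimal, dependency-free sketch. It is not the code in this PR: it assumes axis-aligned rectangles as a stand-in for real polygons, and the helper names (`rect_area`, `rect_intersection_area`, `categorical_area_share`) are hypothetical. It also assumes the ratio is taken against the target polygon's full area, which would explain why a target that is only partly covered by source polygons can have shares that sum to less than 1.

```python
from collections import defaultdict

# Rectangles are (xmin, ymin, xmax, ymax): an illustrative stand-in for polygons.


def rect_area(r):
    """Area of an axis-aligned rectangle."""
    return max(0.0, r[2] - r[0]) * max(0.0, r[3] - r[1])


def rect_intersection_area(a, b):
    """Area of the overlap between two axis-aligned rectangles (0 if disjoint)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, width) * max(0.0, height)


def categorical_area_share(sources, targets):
    """Share of each target's area covered by each source category.

    sources: list of (rect, category) pairs
    targets: list of rects
    Returns one dict per target mapping category -> covered area / target area.
    """
    categories = sorted({category for _, category in sources})
    result = []
    for target in targets:
        covered = defaultdict(float)
        for rect, category in sources:
            # Accumulate the overlap area per category
            covered[category] += rect_intersection_area(rect, target)
        target_area = rect_area(target)
        result.append({c: covered[c] / target_area for c in categories})
    return result


sources = [
    ((0, 0, 2, 2), "cat"),
    ((2, 0, 4, 2), "dog"),
]
targets = [
    (1, 0, 3, 2),  # overlaps both source rectangles equally
    (3, 0, 5, 2),  # overlaps "dog" only; half of it lies outside all sources
]

shares = categorical_area_share(sources, targets)
for row in shares:
    print(row)
```

In this toy setup the first target is split evenly between the two categories, while the second is only half covered, so its shares do not add up to 1.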