pysal / tobler

Spatial interpolation, Dasymetric Mapping, & Change of Support
https://pysal.org/tobler
BSD 3-Clause "New" or "Revised" License

BUG: changing the index of target_df breaks area_interpolate for categorical variables #149

Closed by martinfleis 3 years ago

martinfleis commented 3 years ago

When the index of target_df is not the default RangeIndex (0 to n-1), area_interpolate returns wrong results for categorical variables. pandas aligns on index labels in div, so the rows are misaligned and most of the output comes back as NaN.

Fix is on the way.
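
The alignment behaviour described above can be reproduced with plain pandas, independent of tobler. This is only a minimal sketch: `num` and `den` are hypothetical stand-ins for the intermediate objects inside area_interpolate, not its actual variable names.

```python
import pandas as pd

# Hypothetical stand-ins (assumptions, not tobler internals):
# `num` mimics per-target category areas built on a fresh 0..n-1 index,
# `den` mimics target areas that carry the user's custom index.
num = pd.DataFrame({"animal_cat": [1.0, 2.0, 3.0]})
den = pd.Series([2.0, 4.0, 6.0], index=[0, 13, 26])

# DataFrame.div aligns on index labels; only label 0 exists on both sides,
# so every other row ends up as NaN.
aligned = num.div(den, axis=0)

# Dividing by the underlying NumPy array is positional and ignores labels.
positional = num.div(den.to_numpy(), axis=0)

print(aligned)
print(positional)
```

Dividing by the raw values is one way to sidestep alignment. Whether the actual fix takes that route is not stated in this report.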

import geopandas
from libpysal.examples import load_example
from tobler.area_weighted import area_interpolate

sac1 = load_example("Sacramento1")
sac2 = load_example("Sacramento2")
sac1 = geopandas.read_file(sac1.get_path("sacramentot2.shp"))
sac2 = geopandas.read_file(sac2.get_path("SacramentoMSA2.shp"))
categories = ["cat", "dog", "donkey", "wombat", "capybara"]
sac1["animal"] = (categories * ((len(sac1) // len(categories)) + 1))[
    : len(sac1)
]

# giving the target a non-default index breaks area_interpolate
# for categorical variables
sac2.index = sac2.index * 13

area = area_interpolate(
    source_df=sac1,
    target_df=sac2,
    categorical_variables=["animal"],
)
area.head()
   animal_cat  animal_dog  animal_donkey  animal_wombat  animal_capybara  \
0    0.431909         0.0            0.0       0.000062          0.00063   
1         NaN         NaN            NaN            NaN              NaN   
2         NaN         NaN            NaN            NaN              NaN   
3         NaN         NaN            NaN            NaN              NaN   
4         NaN         NaN            NaN            NaN              NaN   

                                            geometry  
0  POLYGON ((-120.14554 39.22748, -120.14743 39.2...  
1  POLYGON ((-120.37896 39.31638, -120.37917 39.3...  
2  POLYGON ((-120.60887 39.31545, -120.58559 39.3...  
3  POLYGON ((-120.03947 39.23825, -120.03950 39.2...  
4  POLYGON ((-120.65622 39.30815, -120.65456 39.3...
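
Until a fix is released, one possible workaround is to hand the function a target with a default index and restore the original labels afterwards. The helper below is a hypothetical sketch, not part of tobler. It assumes the wrapped function returns exactly one row per target row, in the same order.

```python
import pandas as pd


def call_with_default_index(target_df, interpolate):
    """Run `interpolate` on a copy of `target_df` with a 0..n-1 index.

    Hypothetical helper (not a tobler API). Assumes `interpolate` returns
    one row per target row, in the same order as the input.
    """
    original_index = target_df.index
    # reset_index(drop=True) returns a new frame with a default RangeIndex
    result = interpolate(target_df.reset_index(drop=True)).copy()
    # Restore the caller's labels positionally
    result.index = original_index
    return result


# Usage with the reproducer above (assumes sac1, sac2 and area_interpolate
# are already defined as in the report):
# area = call_with_default_index(
#     sac2,
#     lambda t: area_interpolate(
#         source_df=sac1, target_df=t, categorical_variables=["animal"]
#     ),
# )
```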