When the index of target_df is not the default RangeIndex (0 to n-1), area_interpolate returns wrong results for categorical variables. Pandas aligns on index labels in div, so the rows are misaligned and the unmatched ones come back as NaN.
A fix is on the way.
import geopandas
from libpysal.examples import load_example
from tobler.area_weighted import area_interpolate
sac1 = load_example("Sacramento1")
sac2 = load_example("Sacramento2")
sac1 = geopandas.read_file(sac1.get_path("sacramentot2.shp"))
sac2 = geopandas.read_file(sac2.get_path("SacramentoMSA2.shp"))
categories = ["cat", "dog", "donkey", "wombat", "capybara"]
sac1["animal"] = (categories * ((len(sac1) // len(categories)) + 1))[
    : len(sac1)
]
# changing the index of the target breaks area_interpolate for categorical variables
sac2.index = sac2.index * 13
area = area_interpolate(
    source_df=sac1,
    target_df=sac2,
    categorical_variables=["animal"],
)
   animal_cat  animal_dog  animal_donkey  animal_wombat  animal_capybara  \
0    0.431909         0.0            0.0       0.000062          0.00063
1         NaN         NaN            NaN            NaN              NaN
2         NaN         NaN            NaN            NaN              NaN
3         NaN         NaN            NaN            NaN              NaN
4         NaN         NaN            NaN            NaN              NaN

                                            geometry
0  POLYGON ((-120.14554 39.22748, -120.14743 39.2...
1  POLYGON ((-120.37896 39.31638, -120.37917 39.3...
2  POLYGON ((-120.60887 39.31545, -120.58559 39.3...
3  POLYGON ((-120.03947 39.23825, -120.03950 39.2...
4  POLYGON ((-120.65622 39.30815, -120.65456 39.3...
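
The misalignment described above can be reproduced with plain pandas, without tobler. The sketch below is not tobler's actual code: cat_area and target_area are hypothetical stand-ins for a table computed positionally (default index) and a divisor that carries the target's custom index. It only illustrates how div matches rows by label, and one possible way to divide by position instead.

```python
import pandas as pd

# Hypothetical stand-in: per-category values built positionally,
# so the frame carries the default RangeIndex 0, 1, 2.
cat_area = pd.DataFrame({"animal_cat": [2.0, 1.0, 3.0]})

# Hypothetical stand-in: divisor indexed like the target frame
# after `index * 13`, i.e. labels 0, 13, 26.
target_area = pd.Series([4.0, 2.0, 6.0], index=[0, 13, 26])

# div aligns on index labels, not on position. Only label 0 exists on
# both sides, so every other row of the result is NaN.
misaligned = cat_area.div(target_area, axis=0)
print(misaligned)

# Dividing by the raw NumPy values skips label alignment; the target
# index is attached afterwards (an assumed workaround, not the real fix).
fixed = cat_area.div(target_area.to_numpy(), axis=0)
fixed.index = target_area.index
print(fixed)
```

Only the row whose label appears in both indexes survives the first division, which matches the pattern in the output above: a value in row 0 and NaN elsewhere.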