Open sjsrey opened 1 month ago
As the current philosophy in mapclassify is to assume away NaNs, geopandas is doing the heavy lifting of dealing with NaNs for choropleths.
I've been exploring some approaches to handling NaNs in mapclassify. It isn't as simple as I initially thought, but it is certainly possible. Doing so fully would require discussions with @martinfleis in order to keep in sync with geopandas.
So this issue is a channel to flesh out the thinking on whether we should do this in mapclassify, or not.
I started looking at swapping in NumPy's NaN-aware functions (e.g. nanmean instead of mean) to see about making the classifiers agnostic to NaNs, but decided that would probably be more trouble than it's worth. It is probably best to let the classifiers operate, conceptually, on 'pure arrays', use pandas indices to keep track of where the real observations live, and then reinsert the results on the other side.
The idea would be that if a classifier is given an array with NaNs, the resulting y and yb attributes would also include NaNs in the appropriate places, but the classifier would ignore them when assigning bins.
If we went that route, I think it (a) would not induce any breaking behavior here in mc and (b) could probably drop in over at geopandas.
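To make the mask-and-reinsert idea concrete, here is a rough sketch. It is not mapclassify code: classify_ignoring_nans is a hypothetical helper, the quantile binning is just a stand-in for any classifier, and it uses only the standard library so it doesn't assume anything about mapclassify's internals. In practice the position tracking would more likely be a boolean mask or a pandas index.

```python
import math
from bisect import bisect_left
from statistics import quantiles


def classify_ignoring_nans(values, k=4):
    """Hypothetical helper: quantile-bin `values`, passing NaNs through untouched."""
    # Remember where the real observations live.
    valid_positions = [i for i, v in enumerate(values) if not math.isnan(v)]
    clean = [values[i] for i in valid_positions]

    # Classify the 'pure' array only. Upper bin edges are the k-1 interior
    # quantiles plus the maximum, so every valid value falls in some bin.
    edges = quantiles(clean, n=k, method="inclusive") + [max(clean)]
    clean_bins = [bisect_left(edges, v) for v in clean]

    # Reinsert: NaN positions stay NaN, valid positions get their bin id.
    yb = [math.nan] * len(values)
    for pos, b in zip(valid_positions, clean_bins):
        yb[pos] = b
    return yb, edges


y = [1.0, math.nan, 2.0, 3.0, math.nan, 4.0, 5.0, 6.0, 7.0, 8.0]
yb, edges = classify_ignoring_nans(y, k=4)
```

Here yb has the same length as y, with NaN wherever y had NaN, and the bin edges are computed from the eight valid observations only. Whether the output should carry NaN (forcing a float dtype) or some other sentinel is one of the things that would need agreeing on with geopandas.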
I'll have to take a dive into our plotting code to get a better understanding of how it could help geopandas. It's been a while since I touched that module.
Originally posted by @knaaptime in https://github.com/pysal/mapclassify/issues/211#issuecomment-2112833967