maxhully closed this issue 3 years ago
Part of the issue is that the overlaps might have topology errors, which make the union operation fail when we try to assign the overlapping area to one of the two geometries. A .buffer(0) call might help with that.
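For anyone who wants to see what .buffer(0) does, here is a minimal sketch using Shapely on a toy self-intersecting polygon. The bowtie shape is made up for illustration and is not taken from the Utah data.

```python
from shapely.geometry import Polygon

# Toy "bowtie" ring that touches itself at (1, 1): a classic topology error.
bowtie = Polygon([(0, 0), (0, 2), (1, 1), (2, 2), (2, 0), (1, 1), (0, 0)])
area_before = bowtie.area

# buffer(0) rebuilds the geometry and returns a valid result.
repaired = bowtie.buffer(0)
area_after = repaired.area

print("valid before:", bowtie.is_valid)
print("valid after: ", repaired.is_valid)
print("area before/after:", area_before, area_after)
```

One caveat: .buffer(0) can silently drop part of an invalid polygon, so comparing the area before and after is a cheap sanity check.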
We could also try approaches based on snapping one geometry's vertices to the other, with some threshold for how far we're willing to move a vertex/how small the gaps should be.
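A rough sketch of the snapping idea, using shapely.ops.snap on two toy squares separated by a thin gap. The geometries and the tolerance value are assumptions chosen for illustration; a real threshold would depend on the CRS and on how large the gaps actually are.

```python
from shapely.geometry import Polygon
from shapely.ops import snap

left = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
# The neighbour's left edge sits 0.01 units away from x = 1, leaving a sliver gap.
right = Polygon([(1.01, 0), (2, 0), (2, 1), (1.01, 1)])

# Hypothetical threshold: the farthest we are willing to move a vertex.
tolerance = 0.05

# Move vertices of `right` onto vertices of `left` when they are within tolerance.
snapped = snap(right, left, tolerance)

gap_before = left.distance(right)
gap_after = left.distance(snapped)
print("gap before:", gap_before)
print("gap after: ", gap_after)
```

Only the first argument is modified, so which geometry counts as the reference is a design choice: snapping precincts to blocks and snapping blocks to precincts will not generally give the same result.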
My team is having the same issues with Utah's shapefiles. Did you or anyone you know come up with a reasonable solution?
@maxhully It turns out that several of the Utah census blocks are properly contained in more than one precinct. It appears that the precinct boundaries were not accurately recorded.
Luckily, all the overlap seems to be in blocks that have no people in them. Removing unpopulated blocks from the dataframe before running maup.assign(blocks, precincts) solved a lot of problems for us. It's not clear yet if that will fix every problem with maup on the Utah data sets.
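In case it helps, the filtering step looks roughly like this. The column name TOTPOP and the helper drop_unpopulated are placeholders (use whatever your block file calls its population column), and the toy table stands in for the real blocks GeoDataFrame; the same boolean mask works on a GeoDataFrame.

```python
import pandas as pd


def drop_unpopulated(blocks, population_column="TOTPOP"):
    """Return only the rows whose population is strictly positive.

    `population_column` is a hypothetical name; rows with a missing
    population are dropped as well, because NaN > 0 is False.
    """
    return blocks[blocks[population_column] > 0]


# Toy stand-in for the census blocks table.
blocks = pd.DataFrame({"GEOID": ["a", "b", "c"], "TOTPOP": [12, 0, 7]})
populated = drop_unpopulated(blocks)
print(populated)

# With the real GeoDataFrames, the call from this thread would then be:
# assignment = maup.assign(populated, precincts)
```

The dropped blocks get no precinct in the resulting assignment, which should be harmless as long as nothing non-zero is aggregated from them.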
The absorb_by_shared_perimeter function performs a union on two geoseries, which only works if every polygon has holes to absorb (i.e. the two geoseries are the same length). In the plots below, inputs are on the left and outputs are on the right. It works when every polygon gets holes (see the two plots in the middle), but a polygon that doesn't get any holes is lost (see the top and bottom plots).
It might work better to iterate through the polygons, perform unions one at a time for those that have holes, and leave the rest alone.
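Something like the following is what I have in mind. This is a standalone sketch using plain Shapely, not the actual absorb_by_shared_perimeter code; the helper name absorb_one_at_a_time and the dict-based inputs are made up for illustration.

```python
from shapely.geometry import box
from shapely.ops import unary_union


def absorb_one_at_a_time(polygons, holes_by_index):
    """Union each polygon with its own holes; pass the others through unchanged.

    polygons: dict mapping an index to a polygon.
    holes_by_index: dict mapping an index to the list of holes assigned to it.
    Both structures are hypothetical stand-ins for the two geoseries.
    """
    result = {}
    for index, polygon in polygons.items():
        holes = holes_by_index.get(index, [])
        if holes:
            result[index] = unary_union([polygon, *holes])
        else:
            # No holes assigned: keep the polygon instead of losing it.
            result[index] = polygon
    return result


polygons = {"A": box(0, 0, 1, 1), "B": box(2, 0, 3, 1)}
holes_by_index = {"A": [box(1, 0, 1.5, 1)]}  # "B" has no holes

absorbed = absorb_one_at_a_time(polygons, holes_by_index)
print({index: geometry.area for index, geometry in absorbed.items()})
```

The output always has the same keys as the input, so a polygon without holes can no longer disappear. The trade-off is a Python-level loop instead of one vectorized union.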
Testing it out on MGGG-States' Utah shapefile had disastrous results. The first step is to diagnose what's going on.