mggg / maup

The geospatial toolkit for redistricting data.
https://maup.readthedocs.io/en/latest/
MIT License
67 stars 23 forks source link

Resolve_overlaps complains about differing CRSes despite setting them manually (and other problems) #21

Closed anieuwland closed 3 years ago

anieuwland commented 5 years ago

Thanks for the library! This seems to solve exactly the kind of issues have. Unfortunately I encounter issues. Could you give some pointers on how to use resolve_overlaps?

I have a shapefile of which I am trying to resolve overlaps. When call resolve_overlaps it complains that source and target geometries must have the same CRS. I manually set the CRS of the GeoSeries to epsg:28992 however and I don't know how to set them for the target.

Since the error mentions target and source CRSes being None and EPSG:28992, I also tried manually setting the CRS to None. That resolves the previous issue, but now ends with a NoneType object has no attribute '_geom'.

In summary:

I included code and errors below.

from maup import resolve_overlaps, close_gaps
import geopandas as gpd

gdf = gpd.read_file("/home/workworkwork/Downloads/for_simplification/segmentation_weerribben_largetest_vegetatietypen_redef_sm3_mf5.shp")

print("Resolving self-intersections and removing empty polygons")
polygons = gpd.GeoSeries([pp.buffer(0) for pp in polygons])
polygons = polygons[~polygons.is_empty]
polygons.crs = {'init': 'epsg:28992'} # or set to None
print("Resolving overlaps")
polygons = resolve_overlaps(polygons, relative_threshold=None)

# Crashes

Wrong CRSes:

Traceback (most recent call last):
  File "resolve.py", line 17, in <module>
    polygons = resolve_overlaps(polygons, relative_threshold=None)
  File "/home/workworkwork/.local/lib/python3.7/site-packages/maup/repair.py", line 98, in resolve_overlaps
    overlaps, with_overlaps_removed, relative_threshold=None
  File "/home/workworkwork/.local/lib/python3.7/site-packages/maup/crs.py", line 11, in wrapped
    geoms1.crs, geoms2.crs
TypeError: the source and target geometries must have the same CRS. None {'init': 'epsg:28992'}

NoneType has no _geom:

Traceback (most recent call last):
  File "resolve.py", line 17, in <module>
    polygons = resolve_overlaps(polygons, relative_threshold=None)
  File "/home/workworkwork/.local/lib/python3.7/site-packages/maup/repair.py", line 98, in resolve_overlaps
    overlaps, with_overlaps_removed, relative_threshold=None
  File "/home/workworkwork/.local/lib/python3.7/site-packages/maup/crs.py", line 14, in wrapped
    return f(*args, **kwargs)
  File "/home/workworkwork/.local/lib/python3.7/site-packages/maup/repair.py", line 117, in absorb_by_shared_perimeter
    assignment = assign_to_max(intersections(sources, targets, area_cutoff=None).length)
  File "/home/workworkwork/.local/lib/python3.7/site-packages/maup/crs.py", line 14, in wrapped
    return f(*args, **kwargs)
  File "/home/workworkwork/.local/lib/python3.7/site-packages/maup/intersections.py", line 33, in intersections
    reindexed_targets
  File "/home/workworkwork/.local/lib/python3.7/site-packages/maup/intersections.py", line 31, in <listcomp>
    (sources.index[j], targets.index[i], geometry)
  File "/home/workworkwork/.local/lib/python3.7/site-packages/maup/indexed_geometries.py", line 45, in enumerate_intersections
    for j, intersection in self.intersections(target).items():
  File "/home/workworkwork/.local/lib/python3.7/site-packages/maup/indexed_geometries.py", line 24, in intersections
    relevant_geometries = self.query(geometry)
  File "/home/workworkwork/.local/lib/python3.7/site-packages/maup/indexed_geometries.py", line 19, in query
    relevant_indices = [geom.index for geom in self.spatial_index.query(geometry)]
  File "/usr/lib64/python3.7/site-packages/shapely/strtree.py", line 60, in query
    lgeos.GEOSSTRtree_query(self._tree_handle, geom._geom, lgeos.GEOSQueryCallback(callback), None)
AttributeError: 'NoneType' object has no attribute '_geom'
anieuwland commented 5 years ago

After some experimenting I found that the CRS of the geometries passed to resolve_overlaps is lost before it goes to absorb_by_shared_perimeter.

I tried adding overlaps.crs = geometries.crs after line 80 in repair.py (overlaps = inters[inters.area > 0]) to restore it, and indeed the code does not complain about differing CRSes anymore, but now results in the error NoneType object has no attribute '_geom' again.

What could be the cause of this?

anieuwland commented 4 years ago

Apparently some of the geometries maup encounters while working on my shapefile are None. This is odd because I filter out invalid and empty geometries from my GeoSeries of only Polygons:

gdf = gpd.read_file("...")
gdf = gdf.explode()
polygons = gdf.geometry.buffer(0.00005)
polygons = gpd.GeoSeries([pp.buffer(0) for pp in polygons])
polygons = polygons[~polygons.is_empty]
polygons = polygons[polygons.is_valid]

I changed enumerate_intersections to continue if it encounters a NoneType geometry, as below:

def enumerate_intersections(self, targets):
    target_geometries = get_geometries(targets)
    for i, target in target_geometries.items():
        if target is None:
            continue
        for j, intersection in self.intersections(target).items():
            yield i, j, intersection

Which just leads to more problems:

TopologyException: side location conflict at 193055.49995380666 532458.09998086526
Traceback (most recent call last):
  File "resolve.py", line 18, in <module>
    polygons = resolve_overlaps(polygons, relative_threshold=None)
  File "/home/workworkwork/.local/lib/python3.7/site-packages/maup/repair.py", line 100, in resolve_overlaps
    overlaps, with_overlaps_removed, relative_threshold=None
  File "/home/workworkwork/.local/lib/python3.7/site-packages/maup/crs.py", line 14, in wrapped
    return f(*args, **kwargs)
  File "/home/workworkwork/.local/lib/python3.7/site-packages/maup/repair.py", line 119, in absorb_by_shared_perimeter
    assignment = assign_to_max(intersections(sources, targets, area_cutoff=None).length)
  File "/home/workworkwork/.local/lib/python3.7/site-packages/maup/crs.py", line 14, in wrapped
    return f(*args, **kwargs)
  File "/home/workworkwork/.local/lib/python3.7/site-packages/maup/intersections.py", line 33, in intersections
    reindexed_targets
  File "/home/workworkwork/.local/lib/python3.7/site-packages/maup/intersections.py", line 31, in <listcomp>
    (sources.index[j], targets.index[i], geometry)
  File "/home/workworkwork/.local/lib/python3.7/site-packages/maup/indexed_geometries.py", line 48, in enumerate_intersections
    for j, intersection in self.intersections(target).items():
  File "/home/workworkwork/.local/lib/python3.7/site-packages/maup/indexed_geometries.py", line 25, in intersections
    intersections = relevant_geometries.intersection(geometry.buffer(0))
  File "/usr/lib/python3.7/site-packages/geopandas/base.py", line 480, in intersection
    return _binary_geo("intersection", self, other)
  File "/usr/lib/python3.7/site-packages/geopandas/base.py", line 66, in _binary_geo
    geoms, index = _delegate_binary_method(op, this, other)
  File "/usr/lib/python3.7/site-packages/geopandas/base.py", line 57, in _delegate_binary_method
    data = getattr(a_this, op)(other, *args, **kwargs)
  File "/usr/lib/python3.7/site-packages/geopandas/array.py", line 614, in intersection
    return _binary_geo("intersection", self, other)
  File "/usr/lib/python3.7/site-packages/geopandas/array.py", line 233, in _binary_geo
    data[:] = [getattr(s, op)(right) if s and right else None for s in left.data]
  File "/usr/lib/python3.7/site-packages/geopandas/array.py", line 233, in <listcomp>
    data[:] = [getattr(s, op)(right) if s and right else None for s in left.data]
  File "/usr/lib64/python3.7/site-packages/shapely/geometry/base.py", line 620, in intersection
    return geom_factory(self.impl['intersection'](self, other))
  File "/usr/lib64/python3.7/site-packages/shapely/topology.py", line 70, in __call__
    self._check_topology(err, this, other)
  File "/usr/lib64/python3.7/site-packages/shapely/topology.py", line 39, in _check_topology
    raise err
shapely.errors.TopologicalError: This operation could not be performed. Reason: unknown
jswise commented 4 years ago

@Nimmerwoner , is your shapefile publicly available?

anieuwland commented 4 years ago

Thanks for responding @jswise. The shapefile is the following: vectorized_raster.zip. The script I wrote to experiment is included in full below.

The shapefile is result of vectorizing a raster to polygons. Subsequently we want to smooth the edges, but that introduces overlaps and gaps. It is for this reason we use maup.

The explicit removals of empty or invalid polygons is there to ensure this is not the cause of resolve_overlaps crashing. I am also buffering the shapes to fix self-intersecting lines.

Any advice would be appreciated!

from sys import argv
from maup import resolve_overlaps, close_gaps
import geopandas as gpd

print("Reading")
path = argv[1] if len(argv) > 1 else "./vectorized_raster.shp"
gdf = gpd.read_file(path)
gdf.crs = {'init': 'epsg:28992'}
print("Exploding multipolygons into polygons")
gdf = gdf.explode()

# helps against self-intersecting lines and Nones of which shapely cannot make polygons
polygons = gpd.GeoSeries([p.buffer(0.00005) for p in gdf.geometry])
print("Simplifying polygons")
polygons = polygons.simplify(1.0) # what we're actually trying to achieve and why we need to resolve overlaps and close gaps
print("Resolving self-intersections")
polygons = gpd.GeoSeries([pp.buffer(0) for pp in polygons]) # polygons.buffer(0) is not a sufficient replacement

polygons = polygons[~polygons.is_empty]
polygons = polygons[polygons.is_valid]
[print("Invalid:", p) for p in polygons if not p.is_valid]
[print("Empty:", p) for p in polygons if p.is_empty]

polygons.crs = None #{'init': 'epsg:28992'} # None
print("Resolving overlaps")
polygons = resolve_overlaps(polygons, relative_threshold=None)
print("Closing gaps")
polygons = close_gaps(polygons, relative_threshold=None)
gdf.geometry = polygons

gdf.to_file("resolved.shp")
nickeubank commented 4 years ago

Any progress on this? Running into the same problem...

Code:

precincts_no_overlaps = maup.resolve_overlaps(precincts)

Traceback:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-16-d540d10605d8> in <module>
----> 1 precincts_no_overlaps = maup.resolve_overlaps(precincts)

~/miniconda3/lib/python3.7/site-packages/maup/repair.py in resolve_overlaps(geometries, relative_threshold)
     96 
     97     return absorb_by_shared_perimeter(
---> 98         overlaps, with_overlaps_removed, relative_threshold=None
     99     )
    100 

~/miniconda3/lib/python3.7/site-packages/maup/crs.py in wrapped(*args, **kwargs)
      9             raise TypeError(
     10                 "the source and target geometries must have the same CRS. {} {}".format(
---> 11                     geoms1.crs, geoms2.crs
     12                 )
     13             )

TypeError: the source and target geometries must have the same CRS. None EPSG:3085

(the geodataframe is projection EPSG:3085)

And none empty:

image image

nickeubank commented 4 years ago

My shapefile if it's helpful nick_precincts.zip

InnovativeInventor commented 3 years ago

@nickeubank @anieuwland Once #26 is merged in, you should be able to do

precincts["geometry"] = maup.resolve_overlaps(precincts)

or, once #27 is merged in you can do

precincts["geometry"] = maup.autorepair(precincts)

which should also take care of any topology errors you encounter. I've tested it on both your shapefiles and it works. If you're feeling a bit impatient (after over a year or so), feel free to clone my repo and it should work as-is. Let me know how it goes!

nickeubank commented 3 years ago

AMAZING! Thanks so much!

InnovativeInventor commented 3 years ago

@anieuwland let me know if the most recent PRs resolved your issues (particularly the ones that @pjrule, @tylerjarvis, and I merged in). I added some end-to-end tests and fixed the failing tests, so everything should be good to go. If not, let me know, and I'll re-open this!

anieuwland commented 3 years ago

Hi @InnovativeInventor, I apologize for not responding. It has been a busy period for me and I've left the team working on this. I will be sure to forward this to my colleagues however. Thank you very much for your work.