VIDA-NYU / tile2net

Automated mapping of pedestrian networks from aerial imagery tiles
BSD 3-Clause "New" or "Revised" License

oregon 2022 source & update unary_multi #41

Closed dhodcz2 closed 9 months ago

dhodcz2 commented 10 months ago

Addresses issue #40.

# def unary_multi(gdf):
#   """
#   handles the errors with multipolygon
#   """
#   if gdf.unary_union.type == 'MultiPolygon':
#       gdf_uni = gpd.GeoDataFrame(geometry=gpd.GeoSeries([geom for geom in gdf.unary_union.geoms]))
#   else:
#       gdf_uni = gpd.GeoDataFrame(geometry=gpd.GeoSeries(gdf.unary_union))
#   return gdf_uni

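import geopandas as gpd
import shapely

# `logger` is assumed to be the module-level logger already defined in this file

def unary_multi(gdf: gpd.GeoDataFrame) -> gpd.GeoDataFrame:
    # repairs invalid geometries, then merges overlaps and splits multipart results
    loc = ~gdf.is_valid.values
    if loc.any():
        logger.warning(f'Number of invalid geometries: {loc.sum()} out of {len(gdf)}')
        # assign through the frame itself: the chained form `gdf.geometry.loc[loc] = ...`
        # may write to a temporary copy (e.g. under pandas copy-on-write) and be lost
        gdf.loc[loc, gdf.geometry.name] = shapely.make_valid(gdf.geometry.loc[loc])
    result = (
        gdf
        # dissolve overlapping geometries
        .dissolve()
        # explode multipart geometries
        .explode()
    )
    return result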

I ran it on the proximity, which included 5k tiles, with no issues. However, testing it with all 75k tiles, as in the issue, would likely take several hours of runtime on my end.
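
For a quick check that does not need the full tile set, the repair, dissolve, and explode steps can be tried on a handful of toy polygons. This is only a sketch under stated assumptions: `unary_multi_sketch` and the test geometries are hypothetical and not part of tile2net, the function copies its input instead of modifying it, and it assumes shapely >= 2.0 and a geopandas version that supports `explode(index_parts=...)`.

```python
import geopandas as gpd
import shapely
from shapely.geometry import Polygon, box


def unary_multi_sketch(gdf: gpd.GeoDataFrame) -> gpd.GeoDataFrame:
    """Hypothetical stand-in for unary_multi: repair, dissolve, explode."""
    # Repair every geometry one by one; valid inputs keep their shape.
    fixed = [shapely.make_valid(geom) for geom in gdf.geometry]
    repaired = gpd.GeoDataFrame(geometry=fixed, crs=gdf.crs)
    return (
        repaired
        # merge everything into a single (possibly multipart) geometry
        .dissolve()
        # split that geometry back into one row per polygon
        .explode(index_parts=False)
        .reset_index(drop=True)
    )


toy = gpd.GeoDataFrame(
    geometry=[
        box(0, 0, 2, 2),      # overlaps the next square
        box(1, 1, 3, 3),
        box(10, 10, 11, 11),  # disjoint from everything else
        # self-intersecting "bowtie", invalid until repaired
        Polygon([(20, 20), (22, 22), (22, 20), (20, 22)]),
    ]
)

result = unary_multi_sketch(toy)
print(len(result), "polygons after dissolve + explode")
print(result.area.tolist())
```

The two overlapping squares should come back as a single polygon, and every row of the result should be a valid single-part geometry, which is the property the downstream code relied on from the old `unary_union` version.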