Open matthew-law opened 3 years ago
Thanks for the report!
momepy.CellAlignment expects the left and right GeoDataFrames (containing buildings and tessellation cells) to have the same number of elements, ie each building corresponds to one cell (which is the case in morphological tessellation).
Actually, it doesn't. In the example below, there are more enclosed tessellation cells than buildings and it works fine.
import geopandas
import momepy
blg = geopandas.read_file(momepy.datasets.get_path('bubenec'), layer="buildings")
streets = geopandas.read_file(momepy.datasets.get_path('bubenec'), layer="streets")
enclosures = momepy.enclosures(streets, limit=geopandas.GeoSeries([streets.unary_union.convex_hull]))
tessellation = momepy.Tessellation(blg, "uID", enclosures=enclosures).tessellation
blg['orient_blg'] = momepy.Orientation(blg).series
tessellation['orient_tess'] = momepy.Orientation(tessellation).series
momepy.CellAlignment(blg, tessellation, 'orient_blg', 'orient_tess', 'uID', 'uID').series
0 0.290739
1 0.547136
2 5.486106
3 0.263380
4 0.165063
...
139 0.185136
140 17.566047
141 0.386790
142 0.010713
143 0.438541
Length: 144, dtype: float64
print(blg.shape, tessellation.shape)
(144, 3) (153, 4)
So the issue is not in the non-matching number of elements.
I can reproduce the issue if I have duplicated unique IDs in the tessellation layer. That can happen when a building intersects two enclosures. It is generally not safe to use a building unique ID as a unique identifier of enclosures.
One way around this I was using is to combine both geometries (buildings and tessellation) into the same GeoDataFrame and then dynamically change the active geometry.
# renaming to keep track of which is which and merging together
combined = tessellation.rename_geometry("tessellation").merge(blg.rename_geometry("buildings"), on="uID", how="left")
combined["unique_id"] = range(len(combined)) # new truly unique ID
# switching active geometry from default tessellation
momepy.CellAlignment(combined.set_geometry("buildings"), combined, 'orient_blg', 'orient_tess', 'unique_id', 'unique_id').series
I am not sure if this is a bug or not though :D.
Thanks for having a look at this and hopefully getting to the bottom of the issue!
Would it be a bad idea to do a geometric overlay of streets (ie the edges of enclosures) and buildings as a preprocessing step? It should solve the issue of duplicate unique IDs, but it would mess up some of the metrics based on (eg) building area etc (by splitting buildings which overlap with streets). But I would assume that it is a relatively small number of buildings that cross a road (especially since my streets
data only contains roads for cars, and not pedestrian-only streets), so hopefully this wouldn't have a significant effect on the overall results. I did try to find which buildings crossed which streets in my data, but crashed QGIS in the attempt so have yet to do so.
Alternatively, could there be some way to, for each building, get one of either a) the eID
of the enclosure that the building is entirely within (this will be the case for most buildings), or b) the eID
of the enclosure that the largest portion of the building is in, if it is split between two buildings? This could then be used in place of the uID
s that were previously used to match buildings and tessellation cells when running CellAlignment
.
I don't think there's a universal solution for this situation. You can do overlay if you know that you're not going to use that building geometry for anything meaningful in the future. You can do what I've done above but that causes trouble in AreaRatio as you may have a larger building than its cell (in that case, an overlay is the best solution).
I'd say that you need to know your data and your use case and apply a solution that works the best for that.
a) the eID of the enclosure that the building is entirely within (this will be the case for most buildings)
that is geopandas.sjoin
with op="within"
.
b) the eID of the enclosure that the largest portion of the building is in, if it is split between two buildings?
that is tobler
's area_join
Is this issue the same as #252?
@jGaboardi it seems so.
momepy.CellAlignment
expects the left and right GeoDataFrames (containing buildings and tessellation cells) to have the same number of elements, ie each building corresponds to one cell (which is the case in morphological tessellation). This means that it sometimes doesn't work when using enclosed tessellation, when there are some tessellation cells with no corresponding buildings (ie areas enclosed by roads with no buildings on them).Code example:
Where
tess
is an enclosed tessellation generated from the buildingsblg
.