Possible reference code for deciding which buildings in OSM represent residences: https://github.com/dabreegster/abstreet/blob/9f72de94346077ca4a50ac7d10474ccfd50f28e8/map_model/src/make/buildings.rs#L154, by @matkoniecz. We should definitely check these rules against the areas of study; some tags are regional.
https://wiki.openstreetmap.org/wiki/Key:building:use may also be worth considering, if it is used in the given area. The building tag represents architecture rather than current use (see https://wiki.openstreetmap.org/wiki/Church for some extreme examples), but that rarely matters in practice.
List of values for the building tag in OSM: https://taginfo.openstreetmap.org/keys/building#values. Note that it is paginated, and many buildings are simply tagged building=yes.
If you want to improve your local area and can use Android, I recommend StreetComplete, which asks about missing info, including building types and lane counts.
https://wiki.openstreetmap.org/wiki/Key%3Abuilding has a list of documented values (some tags with wiki pages may be missing from it).
If some building values or wiki pages are unclear, let me know and there is a decent chance that I can improve them.
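As a rough illustration of the kind of rule set being discussed, the base R sketch below sorts building tag values into coarse classes. The residential value list is an assumption picked for illustration (these are documented OSM values, but this is not a copy of the A/B Street rules linked above), and `classify_building()` is a hypothetical helper, so the lists should be checked against the study area.

```r
# Illustrative only: these value lists are assumptions, not the A/B Street rules.
residential_values = c("house", "apartments", "residential", "detached",
                       "semidetached_house", "terrace", "bungalow")
workplace_values = c("office", "industrial", "commercial", "retail",
                     "warehouse", "civic", "public")

# Hypothetical helper: map building=* values to a coarse land-use class.
classify_building = function(building) {
  out = rep("unknown", length(building))  # building=yes and NA stay "unknown"
  out[building %in% residential_values] = "residential"
  out[building %in% workplace_values] = "workplace"
  out
}

classify_building(c("house", "retail", "yes", NA))
```

Keeping `building=yes` in a separate "unknown" class, rather than guessing, makes it easy to see how much of an area lacks a specific building type.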
You might also look at workplace zones (WPZs). In areas with lots of non-residential land use, WPZs are much smaller than LSOAs, and they are classified by industry of employment. See https://maps.cdrc.ac.uk/#/metrics/industrywp/default/BTTTFFT/10/-0.1582/51.4721/
Good news, I've got basic functionality working to do this now:
remotes::install_github("itsleeds/od")
#> Using github PAT from envvar GITHUB_PAT
#> Skipping install of 'od' from a github remote, the SHA1 (27aab841) has not changed since last install.
#> Use `force = TRUE` to force installation
library(od)
od = od::od_data_df
zones = od::od_data_zones_min
subzones = od_data_zones_small
od_disag = od_disaggregate(od, zones, subzones)
#> although coordinates are longitude/latitude, st_intersects assumes that they are planar
ncol(od_disag) -1 == ncol(od) # same number of columns (except disag data gained geometry)
#> [1] TRUE
sum(od_disag[[3]]) == sum(od[[3]])
#> [1] TRUE
sum(od_disag[[4]]) == sum(od[[4]])
#> [1] TRUE
od_sf = od_to_sf(od, zones)
#> 0 origins with no match in zone ids
#> 0 destinations with no match in zone ids
#> points not in od data removed.
plot(od_data_zones_small$geometry)
plot(od_data_zones_min$geometry, lwd = 3, col = NULL, add = TRUE)
#> Warning in rep(col, length.out = length(x)): 'x' is NULL so the result will be
#> NULL
plot(od_sf["all"], add = TRUE)
plot(od_disag["all"], add = TRUE)
Created on 2021-01-13 by the reprex package (v0.3.0)
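The `sum()` checks above show that disaggregation keeps the totals in each count column. A minimal base R sketch of that property for a single OD pair is below. `disaggregate_pair()` is a hypothetical helper written for illustration and is not how `od_disaggregate()` is implemented; it simply spreads trips as evenly as possible over every origin-subzone and destination-subzone combination.

```r
# Hypothetical helper: split one OD count across all origin-subzone x
# destination-subzone pairs, keeping the total unchanged.
disaggregate_pair = function(n_trips, o_subzones, d_subzones) {
  pairs = expand.grid(o = o_subzones, d = d_subzones, stringsAsFactors = FALSE)
  n_pairs = nrow(pairs)
  base = n_trips %/% n_pairs       # trips that every subzone pair receives
  remainder = n_trips %% n_pairs   # leftover trips, handed out one per pair
  pairs$trips = base + as.integer(seq_len(n_pairs) <= remainder)
  pairs
}

res = disaggregate_pair(10, o_subzones = c("a1", "a2"), d_subzones = c("b1", "b2", "b3"))
nrow(res)              # one row per subzone pair
sum(res$trips) == 10   # the total is preserved
```

Note that the number of rows grows with the product of the subzone counts, which is why an all-pairs approach like this becomes impractical with very many subzones.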
Something I'm still a bit confused about is why the resulting 'oneway' versions of the disaggregated lines are different: I thought a given OD pair to be disaggregated could only go from a subzone in the origin zone to a subzone in the destination zone. This isn't a blocker, but I'm still trying to understand the process and how best to implement it (note that the approach outlined above also would not work where there are many, many subzones, as is the case when using OSM polygons). Next step: check the results of the latter part of this reproducible example on a per OD pair basis:
remotes::install_github("itsleeds/od")
library(od)
od = od::od_data_df
zones = od::od_data_zones_min
subzones = od_data_zones_small
od_disag = od_disaggregate(od, zones, subzones)
ncol(od_disag) -1 == ncol(od) # same number of columns (except disag data gained geometry)
sum(od_disag[[3]]) == sum(od[[3]])
sum(od_disag[[4]]) == sum(od[[4]])
od_sf = od_to_sf(od, zones)
plot(od_data_zones_small$geometry)
plot(od_data_zones_min$geometry, lwd = 3, col = NULL, add = TRUE)
plot(od_sf["all"], add = TRUE)
plot(od_disag["all"], add = TRUE)
od_disag_oneway = od_oneway(od_disag)
od_sf_oneway = od_oneway(od_sf)
nrow(od_disag) / nrow(od_sf)
nrow(od_disag_oneway) / nrow(od_sf_oneway)
od_disag2 = od_disaggregate(sf::st_drop_geometry(od_sf_oneway), zones, subzones) # avoids %>%, which is not attached here
od_disag2_oneway = od_oneway(od_disag2)
nrow(od_disag2)
nrow(od_disag2_oneway)
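For readers unfamiliar with the 'oneway' step, the base R sketch below shows the idea on a toy table: A to B and B to A are treated as the same pair and their counts are summed. The toy data, its column names and the `to_oneway()` helper are assumptions made for illustration; this is not the od package's implementation of `od_oneway()`.

```r
# Toy OD table; ids and column names are illustrative assumptions.
od_toy = data.frame(
  o = c("A", "B", "A", "C"),
  d = c("B", "A", "C", "C"),
  all = c(10, 5, 3, 2)
)

# Hypothetical helper: treat A->B and B->A as one pair and sum the counts.
to_oneway = function(x) {
  o = as.character(x$o)
  d = as.character(x$d)
  ordered_pairs = data.frame(
    first = pmin(o, d),   # alphabetically smaller id of each pair
    second = pmax(o, d),  # alphabetically larger id of each pair
    all = x$all
  )
  aggregate(all ~ first + second, data = ordered_pairs, FUN = sum)
}

od_toy_oneway = to_oneway(od_toy)
nrow(od_toy_oneway)                        # number of unique undirected pairs
sum(od_toy_oneway$all) == sum(od_toy$all)  # totals are unchanged
```

Under this definition the total count is unchanged, and rows are only merged where both directions of the same pair are present.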
Update on this: it was a bug in the od package, which is now fixed. The test code above now returns the expected result.
Heads-up @matkoniecz: based partly on your suggestions, but mostly on the building types inferred from the link shared by @dabreegster, I have generated a test dataset of buildings in 3 zones in Leeds. I'm putting this tiny example dataset into the 'od' R package for testing, as a solid basis for function development that should eventually lead to this issue being fixed. Thanks for the input so far #workinprogress!
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(od)
# od_data_buildings
building_types = c(
  "office",
  "industrial",
  "commercial",
  "retail",
  "warehouse",
  "civic",
  "public"
)
leeds_osm = osmextract::oe_get(place = "leeds", layer = "multipolygons")
#> No exact match found for place = leeds and provider = geofabrik. Best match is Laos.
#> Checking the other providers.
#> An exact string match was found using provider = bbbike.
#> The chosen file was already detected in the download directory. Skip downloading.
#> The corresponding gpkg file was already detected. Skip vectortranslate operations.
#> Reading layer `multipolygons' from data source `/mnt/57982e2a-2874-4246-a6fe-115c199bc6bd/data/osm/geofabrik_Leeds.gpkg' using driver `GPKG'
#> Simple feature collection with 391926 features and 25 fields
#> geometry type: MULTIPOLYGON
#> dimension: XY
#> bbox: xmin: -1.89 ymin: 53.65 xmax: -1.280002 ymax: 53.88
#> geographic CRS: WGS 84
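leeds_osm_buildings = leeds_osm %>%
  filter(building %in% building_types)
# keep the zones that appear as origins or destinations in the first two OD pairs
zones_of_interest = od_data_zones_min[od_data_zones_min$geo_code %in% c(od_data_df$geo_code1[1:2], od_data_df$geo_code2[1:2]), ]
mapview::mapview(zones_of_interest)
# spatial subset: buildings that fall entirely within the zones of interest
buildings_in_zones = leeds_osm_buildings[zones_of_interest, , op = sf::st_within]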
#> although coordinates are longitude/latitude, st_within assumes that they are planar
mapview::mapview(buildings_in_zones)
Created on 2021-01-14 by the reprex package (v0.3.0)
Demonstration of this in reproducible code below.
Interested to hear feedback on this, especially from @joeytalbot and @dabreegster.
See the resulting geojsons (buildings and disaggregated desire lines) here: https://github.com/cyipt/actdev/tree/main/data-small/great-kneighton
See (and try to reproduce, if you get a chance) the code that generated them here: https://github.com/cyipt/actdev/blob/main/code/tests/disaggregate.R
Any comments on the actual function and the documentation are also very welcome: https://itsleeds.github.io/od/reference/od_disaggregate.html
The aim is that, instead of all desire lines going to a single centroid, they go to a range of places within a zone of interest (e.g. to building polygons from OSM).