luukvdmeer / sfnetworks

Tidy Geospatial Networks in R
https://luukvdmeer.github.io/sfnetworks/
Other
342 stars 20 forks source link

A 'play_osm' function to create osm street networks given a bbox or placename #79

Open luukvdmeer opened 3 years ago

luukvdmeer commented 3 years ago

Is your feature request related to a problem? Please describe. Tidygraph has a set of play_* functions to create graphs from scratch according to some defined rules/algorithms. For example play_node_degree will create a graph based on a given set of node degrees. See the reference.

Describe the solution you'd like I am thinking of a function play_osm that allows users to create a graph based on a street network from OpenStreetMap. Inputs to this function will be forwarded directly to the osmdata loading function, so it could be a bounding box or a placename. No cleaning needs to be done (in the end, the function is just meant to "play" with an example). I suggest to not use osmdata as a dependency of the package, but just include an if(!requireNamespace) line at the beginning of the function.

Robinlovelace commented 3 years ago

Great idea. It would be good to give the user some control over which elements are downloaded, e.g. to exclude footpaths and railway tracks if doing vehicle routing.

Links that may be of use here:

idshklein commented 3 years ago

Great idea. This is very similar to python OSMnx graph from_*. It is preferable that this would be more customizable than OSMnx network types, that are hardcoded in OSMnx code. In addition, be advised the the current OSMdata package doesn't support using the overpass API (poly:...) selection, rather only bbox, to the best of my knowledge,

psychemedia commented 3 years ago

Are there any example workflows or vignettes that document the current best practice way of using ropensci/osmdata with sfnetworks?

loreabad6 commented 3 years ago

Hi @psychemedia! Maybe @agila5 might have some good practice examples on how to use osmdata. In the meantime you can take a look at this example.

agila5 commented 3 years ago

Hi @psychemedia, I'm sorry, but I totally missed this github issue. Anyway, do you have any particular question regarding the integration of osmdata and sfnetworks?

In my opinion, the most important pre-processing step for analysing OSM data with sfnetworks is the to_spatial_subdivision spatial morpher, since it adds any node that may be missing when extracting the boundary points of road segments downloaded from OSM. This is the first morpher that should be applied when working with OSM data. See the introductory vignette and Chapter 5 of this preprint for more details, taking into account that the stplanr approach and the sfnetworks approach are very similar. Another relevant morpher is to_components, which is used to select only the edges that belong to the main component of a road network (see Chapters 3, 4, and 6 of this preprint for more details). It excludes the small clusters of road segments that are connected to the "main group" by other edges not included in the network. Other spatial morphers are described in the introductory vignette. To give an example:

# packages
library(sfnetworks)
library(tidygraph)
#> 
#> Attaching package: 'tidygraph'
#> The following object is masked from 'package:stats':
#> 
#>     filter
library(osmdata)
#> Data (c) OpenStreetMap contributors, ODbL 1.0. https://www.openstreetmap.org/copyright

# download OSM road network data in Luxemburg
osm_luxemburg <- opq("Luxemburg") %>% 
  add_osm_feature(key = "highway") %>% 
  osmdata_sf()

# Create the sfnetwork object
(sfn_luxemburg <- as_sfnetwork(osm_luxemburg$osm_lines, directed = FALSE))
#> # A sfnetwork with 20218 nodes and 15780 edges
#> #
#> # CRS:  EPSG:4326 
#> #
#> # An undirected multigraph with 5449 components with spatially explicit edges
#> #
#> # Node Data:     20,218 x 1 (active)
#> # Geometry type: POINT
#> # Dimension:     XY
#> # Bounding box:  xmin: 6.059071 ymin: 49.53528 xmax: 6.237185 ymax: 49.66538
#>              geometry
#>           <POINT [°]>
#> 1 (6.127754 49.57735)
#> 2 (6.126039 49.57823)
#> 3  (6.124353 49.5784)
#> 4 (6.121852 49.57781)
#> 5 (6.083609 49.62056)
#> 6 (6.082905 49.62049)
#> # ... with 20,212 more rows
#> #
#> # Edge Data:     15,780 x 286
#> # Geometry type: LINESTRING
#> # Dimension:     XY
#> # Bounding box:  xmin: 6.059071 ymin: 49.53528 xmax: 6.241737 ymax: 49.66538
#>    from    to osm_id name  abbreviation access access.conditio~ access.lanes
#>   <int> <int> <chr>  <chr> <chr>        <chr>  <chr>            <chr>       
#> 1     1     2 40159~ <NA>  <NA>         <NA>   <NA>             <NA>        
#> 2     3     4 40159~ <NA>  <NA>         <NA>   <NA>             <NA>        
#> 3     5     6 40159~ <NA>  <NA>         <NA>   <NA>             <NA>        
#> # ... with 15,777 more rows, and 278 more variables:
#> #   access.lanes.backward <chr>, addr.city <chr>, addr.country <chr>, ...

# Filter the primary, secondary, and tertiary roads (with the corresponding links)
sfn_luxemburg_small <- sfn_luxemburg %>% 
  activate("edges") %>% 
  filter(grepl("primary|secondary|tertiary", highway))

# Plot result
par(mar = rep(0, 4))
plot(sfn_luxemburg_small)

# It's quite clear that there are hundreds of isolated nodes. 

# Apply the morphers: 
sfn_luxemburg_tidy <- sfn_luxemburg_small %>% 
  convert(to_spatial_subdivision, .clean = TRUE) %>% 
  convert(to_components, .select = 1L)
#> Warning: to_spatial_subdivision assumes attributes are constant over geometries

par(mar = rep(0, 4))
plot(sfn_luxemburg_tidy)

Created on 2021-02-15 by the reprex package (v0.3.0)

If you need larger OSM (road or rivers) networks, I'm working on an R package named osmextract for downloading bulk OSM data from external providers. For example:

# packages
library(sfnetworks)
library(tidygraph)
#> 
#> Attaching package: 'tidygraph'
#> The following object is masked from 'package:stats':
#> 
#>     filter
library(osmextract)
#> Data (c) OpenStreetMap contributors, ODbL 1.0. https://www.openstreetmap.org/copyright.
#> Check the package website, https://docs.ropensci.org/osmextract/, for more details.

# download OSM road network data in Luxemburg
luxemburg <- oe_get(
  place = "Luxemburg", 
  query = "
  SELECT * 
  FROM 'lines' 
  WHERE highway IN (
  'primary', 'primary_link', 'secondary', 'secondary_link', 'tertiary', 'tertiary_link'
  )"
)
#> The input place was matched with: Luxembourg
#> The chosen file was already detected in the download directory. Skip downloading.
#> The corresponding gpkg file was already detected. Skip vectortranslate operations.
#> Reading layer `lines' from data source `C:\Users\Utente\Documents\osm_data\geofabrik_luxembourg-latest.gpkg' using driver `GPKG'
#> Simple feature collection with 13140 features and 9 fields
#> geometry type:  LINESTRING
#> dimension:      XY
#> bbox:           xmin: 5.737261 ymin: 49.44548 xmax: 6.531682 ymax: 50.18435
#> geographic CRS: WGS 84

# Create and plot the sfnetwork object
(sfn_luxemburg <- as_sfnetwork(luxemburg, directed = FALSE))
#> # A sfnetwork with 11988 nodes and 13140 edges
#> #
#> # CRS:  WGS 84 
#> #
#> # An undirected multigraph with 183 components with spatially explicit edges
#> #
#> # Node Data:     11,988 x 1 (active)
#> # Geometry type: POINT
#> # Dimension:     XY
#> # Bounding box:  xmin: 5.737261 ymin: 49.44548 xmax: 6.525528 ymax: 50.18435
#>              geometry
#>           <POINT [°]>
#> 1 (6.130018 49.65902)
#> 2  (6.12952 49.65888)
#> 3 (6.054492 49.62214)
#> 4 (6.052075 49.62211)
#> 5  (6.13038 49.61462)
#> 6 (6.129808 49.61448)
#> # ... with 11,982 more rows
#> #
#> # Edge Data:     13,140 x 12
#> # Geometry type: LINESTRING
#> # Dimension:     XY
#> # Bounding box:  xmin: 5.737261 ymin: 49.44548 xmax: 6.531682 ymax: 50.18435
#>    from    to osm_id name  highway waterway aerialway barrier man_made z_order
#>   <int> <int> <fct>  <fct> <fct>   <fct>    <fct>     <fct>   <fct>      <int>
#> 1     1     2 40655~ Rout~ primary <NA>     <NA>      <NA>    <NA>          27
#> 2     3     4 40657~ Rout~ primary <NA>     <NA>      <NA>    <NA>           7
#> 3     5     6 40657~ Boul~ primary <NA>     <NA>      <NA>    <NA>           7
#> # ... with 13,137 more rows, and 2 more variables: other_tags <fct>,
#> #   geometry <LINESTRING [°]>

par(mar = rep(0, 4))
plot(sfn_luxemburg, cex = 0.8)

Created on 2021-02-15 by the reprex package (v0.3.0)

Now you should apply the same morphers as before, but it really depends on the objective of your analysis. The differences between the two plots are simply because the two packages use different bounding boxes when downloading OSM data. If you are interested in osmextract, check the README and the introductory vignette. In any case, if you have any further question, feel free to ask (maybe creating a new github issue or a new discussion).

psychemedia commented 3 years ago

Hi @agila5, thanks for that... I've tinkered with osmnx before and was just wondering whether there was a best current practice workflow for getting data retrieved from Overpass using osmdata into the sfnetworks representation. I'll have a play with the recipes you've suggested. Thanks :-)

agila5 commented 3 years ago

As long as you create an sf object representing the spatial structure of the network, I think you can safely apply the approach described before. Maybe we could create an ad-hoc method in as_sfnetwork for osmdata_sf objects, but I think that's not a priority.

luukvdmeer commented 3 years ago

@psychemedia Thanks for raising this! I think either a vignette or blogpost presenting such a workflow (starting with querying data from OSM and ending with a clean network) would be very nice indeed. Currently we are not as mature as osmnx (yet..?) and also not solely focusing on OSM data. Sharing how your experiments go - what worked nice and what did not - would be of great help for moving us in the right direction!

psychemedia commented 3 years ago

@luukvdmeer Sure.. will post my novice attempts when I've done them (tomorrow hopefully...) My use case is perhaps a bit niche, and I'm not sure how far I'll get (network diagrams of rally stage routes as a first step to generating tulip diagrams), but happy to share whatever I end up with!

psychemedia commented 3 years ago

Sketch notes here: https://gist.github.com/psychemedia/ddd95de9a3fbc3c1afae85a8a7a431d8

I was new to R geo stuff a couple of weeks ago so some of the code is likely hacky workarounds because: a) my R is several years out of practice; b) I'm still trying to figure out how to do the most elementary geo things!

loreabad6 commented 3 years ago

Hi @psychemedia thank you for sharing your work, I enjoyed going through it a lot! I commented on your gist with a possible solution to your problem, hopefully it helps you go forward! :) The analysis so far looks very interesting and I am really excited to see where it ends up!

psychemedia commented 3 years ago

@loreabad6 Thanks for that. I commented over there. TLDR is I'd like s/thing like the Python osmnx truncate_by_edge parameter that can "retain nodes outside bounding box if at least one of node’s neighbors is within the bounding box". by the by, the gist recipe is now also part of the more general RallyDataJunkie/visualising-rally-stages.