cyipt / actdev

ActDev - Active travel provision and potential in planned and proposed development sites
https://actdev.cyipt.bike

Generate desire lines for which routes will be generated #10

Closed Robinlovelace closed 3 years ago

Robinlovelace commented 3 years ago

Add each of these raw datasets and the resulting routing dataset as new Github releases.

Robinlovelace commented 3 years ago

Search for: Workplace Zones (WPZ)

mem48 commented 3 years ago

FYI the documentation for journey time statistics outlines where to get at a lot of these

Robinlovelace commented 3 years ago

Do you have a link to this documentation @mem48 ?

joeytalbot commented 3 years ago

I've uploaded an OD file of homes and workplaces at the MSOA level for the 4 Leeds sites. Workplaces are obtained using the MSOA the site lies within. We could use more than one MSOA / OA, but this is a starting point. @si-the-pie

joeytalbot commented 3 years ago

> Do you have a link to this documentation @mem48 ?

https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/853603/notes-and-definitions.pdf

See pages 10-12 for data sources, e.g. for town centres

Robinlovelace commented 3 years ago

In terms of routing do you have the input data needed to generate routes for Leeds @mvl22 and @si-the-pie ?

Robinlovelace commented 3 years ago

As a starter for 10, here's reproducible code to get routes for the first 9 of the OD pairs provided by @joeytalbot:

# Aim: test routing for actdev project

remotes::install_github("ropensci/stplanr")
remotes::install_github("robinlovelace/cyclestreets")

library(sf)
library(cyclestreets)
u = "https://github.com/cyipt/actdev/releases/download/0.1.1/od-flows-leeds.csv"
od_data = readr::read_csv(u)
l = stplanr::od_coords2line(odc = od_data)
l$length = sf::st_length(l)
summary(l$length)

r1 = journey(from = c(od_data$ox[1], od_data$oy[1]), to = c(od_data$dx[1], od_data$dy[1]))
mapview::mapview(r1["gradient_smooth"])

mapview::mapview(l)
od_linestrings = stplanr::route(l = l[1:9, ], route_fun = journey)
od_linestrings

mapview::mapview(od_linestrings["provisionName"])

Results look good to me:

``` r
# Aim: test routing for actdev project

remotes::install_github("ropensci/stplanr")
#> Using github PAT from envvar GITHUB_PAT
#> Skipping install of 'stplanr' from a github remote, the SHA1 (55a470ab) has not changed since last install.
#>   Use `force = TRUE` to force installation
remotes::install_github("robinlovelace/cyclestreets")
#> Using github PAT from envvar GITHUB_PAT
#> Skipping install of 'cyclestreets' from a github remote, the SHA1 (1f121349) has not changed since last install.
#>   Use `force = TRUE` to force installation

library(sf)
#> Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 7.0.0
library(cyclestreets)
u = "https://github.com/cyipt/actdev/releases/download/0.1.1/od-flows-leeds.csv"
od_data = readr::read_csv(u)
#>
#> ── Column specification ────────────────────────────────────────────────────────
#> cols(
#>   ox = col_double(),
#>   oy = col_double(),
#>   dx = col_double(),
#>   dy = col_double()
#> )
l = stplanr::od_coords2line(odc = od_data)
l$length = sf::st_length(l)
summary(l$length)
#>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
#>   519.2  7693.8 11721.8 11407.2 15612.5 19922.9

r1 = journey(from = c(od_data$ox[1], od_data$oy[1]), to = c(od_data$dx[1], od_data$dy[1]))
mapview::mapview(r1["gradient_smooth"])
```

![](https://i.imgur.com/uYnybXd.png)

``` r
mapview::mapview(l)
```

![](https://i.imgur.com/IBMPbvJ.png)

``` r
od_linestrings = stplanr::route(l = l[1:9, ], route_fun = journey)
#> Most common output is sf
od_linestrings
#> Simple feature collection with 250 features and 39 fields
#> geometry type:  LINESTRING
#> dimension:      XY
#> bbox:           xmin: -1.6059 ymin: 53.79356 xmax: -1.32591 ymax: 53.92979
#> geographic CRS: WGS 84
#> First 10 features:
#>           ox       oy        dx       dy       length route_number
#> 1  -1.325985 53.79377 -1.376608 53.92958 15477.78 [m]            1
#> 2  -1.325985 53.79377 -1.376608 53.92958 15477.78 [m]            1
#> 3  -1.325985 53.79377 -1.376608 53.92958 15477.78 [m]            1
#> 4  -1.325985 53.79377 -1.376608 53.92958 15477.78 [m]            1
#> 5  -1.325985 53.79377 -1.376608 53.92958 15477.78 [m]            1
#> 6  -1.325985 53.79377 -1.376608 53.92958 15477.78 [m]            1
#> 7  -1.325985 53.79377 -1.376608 53.92958 15477.78 [m]            1
#> 8  -1.325985 53.79377 -1.376608 53.92958 15477.78 [m]            1
#> 9  -1.325985 53.79377 -1.376608 53.92958 15477.78 [m]            1
#> 10 -1.325985 53.79377 -1.376608 53.92958 15477.78 [m]            1
#>                     name distances time busynance elevations start_longitude
#> 1          Un-named link        19    5        48         NA        -1.32591
#> 2    Short un-named link        15    3        27         NA        -1.32591
#> 3       Great North Road      2662  863     12672         NA        -1.32591
#> 4           Bunkers Hill      1376  264      3831         NA        -1.32591
#> 5            Main Street      3228 1075      5891         NA        -1.32591
#> 6  NCN National Route 66       119   26       115         NA        -1.32591
#> 7  NCN National Route 66        53   11        45         NA        -1.32591
#> 8  NCN National Route 66       102   24        98         NA        -1.32591
#> 9                 Bridge        70   15        66         NA        -1.32591
#> 10 NCN National Route 66        65   27       110         NA        -1.32591
#>    start_latitude finish_longitude finish_latitude crow_fly_distance  event
#> 1        53.79373         -1.37668        53.92979             15491 depart
#> 2        53.79373         -1.37668        53.92979             15491 depart
#> 3        53.79373         -1.37668        53.92979             15491 depart
#> 4        53.79373         -1.37668        53.92979             15491 depart
#> 5        53.79373         -1.37668        53.92979             15491 depart
#> 6        53.79373         -1.37668        53.92979             15491 depart
#> 7        53.79373         -1.37668        53.92979             15491 depart
#> 8        53.79373         -1.37668        53.92979             15491 depart
#> 9        53.79373         -1.37668        53.92979             15491 depart
#> 10       53.79373         -1.37668        53.92979             15491 depart
#>        whence speed itinerary clientRouteId    plan note length.1 quietness
#> 1  1607961776    16  72502447             0 fastest         17406        34
#> 2  1607961776    16  72502447             0 fastest         17406        34
#> 3  1607961776    16  72502447             0 fastest         17406        34
#> 4  1607961776    16  72502447             0 fastest         17406        34
#> 5  1607961776    16  72502447             0 fastest         17406        34
#> 6  1607961776    16  72502447             0 fastest         17406        34
#> 7  1607961776    16  72502447             0 fastest         17406        34
#> 8  1607961776    16  72502447             0 fastest         17406        34
#> 9  1607961776    16  72502447             0 fastest         17406        34
#> 10 1607961776    16  72502447             0 fastest         17406        34
#>        west    south     east    north             leaving            arriving
#> 1  -1.38637 53.79356 -1.32591 53.92979 2020-12-14 16:02:56 2020-12-14 17:23:18
#> 2  -1.38637 53.79356 -1.32591 53.92979 2020-12-14 16:02:56 2020-12-14 17:23:18
#> 3  -1.38637 53.79356 -1.32591 53.92979 2020-12-14 16:02:56 2020-12-14 17:23:18
#> 4  -1.38637 53.79356 -1.32591 53.92979 2020-12-14 16:02:56 2020-12-14 17:23:18
#> 5  -1.38637 53.79356 -1.32591 53.92979 2020-12-14 16:02:56 2020-12-14 17:23:18
#> 6  -1.38637 53.79356 -1.32591 53.92979 2020-12-14 16:02:56 2020-12-14 17:23:18
#> 7  -1.38637 53.79356 -1.32591 53.92979 2020-12-14 16:02:56 2020-12-14 17:23:18
#> 8  -1.38637 53.79356 -1.32591 53.92979 2020-12-14 16:02:56 2020-12-14 17:23:18
#> 9  -1.38637 53.79356 -1.32591 53.92979 2020-12-14 16:02:56 2020-12-14 17:23:18
#> 10 -1.38637 53.79356 -1.32591 53.92979 2020-12-14 16:02:56 2020-12-14 17:23:18
#>    grammesCO2saved calories       edition gradient_segment elevation_change
#> 1             3245      269 routing201124      0.000000000                0
#> 2             3245      269 routing201124      0.000000000                0
#> 3             3245      269 routing201124      0.014650639               39
#> 4             3245      269 routing201124      0.023982558               33
#> 5             3245      269 routing201124      0.013011152               42
#> 6             3245      269 routing201124      0.016806723                2
#> 7             3245      269 routing201124      0.000000000                0
#> 8             3245      269 routing201124      0.009803922                1
#> 9             3245      269 routing201124      0.071428571                5
#> 10            3245      269 routing201124      0.030769231                2
#>    provisionName quietness_segment gradient_smooth
#> 1   Service Road         0.3958333     0.000000000
#> 2   Service Road         0.5555556     0.000000000
#> 3     Minor road         0.2100694     0.014650639
#> 4     Minor road         0.3591752     0.023982558
#> 5     Minor road         0.5479545     0.013011152
#> 6     Cycle path         1.0347826     0.016806723
#> 7     Cycle path         1.1777778     0.000000000
#> 8     Cycle path         1.0408163     0.009803922
#> 9     Cycle path         1.0606061     0.071428571
#> 10    Cycle path         0.5909091     0.030769231
#>                          geometry
#> 1  LINESTRING (-1.32591 53.793...
#> 2  LINESTRING (-1.32614 53.793...
#> 3  LINESTRING (-1.32634 53.793...
#> 4  LINESTRING (-1.3434 53.8138...
#> 5  LINESTRING (-1.3431 53.8261...
#> 6  LINESTRING (-1.34757 53.854...
#> 7  LINESTRING (-1.3482 53.8554...
#> 8  LINESTRING (-1.34823 53.855...
#> 9  LINESTRING (-1.34717 53.856...
#> 10 LINESTRING (-1.34613 53.856...

mapview::mapview(od_linestrings["provisionName"])
```

![](https://i.imgur.com/L24p2Wr.png)

Created on 2020-12-14 by the [reprex package](https://reprex.tidyverse.org) (v0.3.0)

mvl22 commented 3 years ago

> In terms of routing do you have the input data needed to generate routes for Leeds @mvl22 and @si-the-pie ?

I think this is pending the POIs stuff?

Robinlovelace commented 3 years ago

> I think this is pending the POIs stuff?

No, the data in https://github.com/cyipt/actdev/releases/download/0.1.1/od-flows-leeds.csv is ready to go. Happy to do the routing here in Leeds but thought it may be quicker/easier for CycleStreets to do - by fastest, quietest and balanced profiles if possible.

Robinlovelace commented 3 years ago

Renaming this as it's about deciding which desire lines to route to. Then we can do the routing which in some ways is the easy part.

si-the-pie commented 3 years ago

Origins and Destinations

I've been reading the pdf that Joey sent and looking at the marvellous R code that Robin splashed and decided to have a play with the data.

Loading the data

The csv that Joey kindly provided was processed as follows:

-- PhpMyAdmin was used to import the CSV, which generates this table structure:
CREATE TABLE `TBL_NAME` (
  `COL 1` varchar(19) DEFAULT NULL,
  `COL 2` varchar(18) DEFAULT NULL,
  `COL 3` varchar(19) DEFAULT NULL,
  `COL 4` varchar(18) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- This is copied to a local table as follows:
drop table if exists originDestination;
create table originDestination (
 id int unsigned not null auto_increment primary key,
 waypoints varchar(255) not null comment "In a format suitable for sending to CycleStreets Journey Planner API"
) engine=myisam;

-- Fill, setting the waypoints field
insert originDestination (waypoints)
select concat(`COL 1`, ',', `COL 2`, '|', `COL 3`, ',', `COL 4`)
  from CSV_DB.TBL_NAME;
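
For anyone following along without a MySQL instance, the same waypoint strings can be sketched in Python. This is only an illustration: it assumes the `ox`, `oy`, `dx`, `dy` column names shown in Robin's reprex output, and the single sample row is copied from that output rather than read from the release file.

```python
import csv
import io

# Sample in the same ox, oy, dx, dy layout as od-flows-leeds.csv
# (one row copied from the reprex output earlier in the thread)
SAMPLE_CSV = """ox,oy,dx,dy
-1.325985,53.79377,-1.376608,53.92958
"""

def waypoints_from_row(row):
    """Join one OD row as 'ox,oy|dx,dy', mirroring the SQL concat above."""
    return f"{row['ox']},{row['oy']}|{row['dx']},{row['dy']}"

def waypoints_from_csv(text):
    """Return one waypoints string per data row of the CSV text."""
    return [waypoints_from_row(row) for row in csv.DictReader(io.StringIO(text))]

waypoints = waypoints_from_csv(SAMPLE_CSV)
print(waypoints[0])
```

Reading the values as strings keeps the coordinates exactly as they appear in the CSV, which is also what the varchar columns in the SQL do.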

Rendering of Origins and Destinations

This rendering of straight lines between the coordinates of each row looks identical to the second image that Robin produced:

leeds_origins_destinations

This was produced by this unglamorous SQL, which collates all the geometries into a FeatureCollection:

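-- The outer query makes the FeatureCollection from the component geometries as Features
-- (group_concat output is truncated at group_concat_max_len, 1024 bytes by default, so that limit needs raising first)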
select concat('{"type":"FeatureCollection","features":[',
    group_concat('{"type":"Feature","properties":{"id":', id, '},"geometry":', geojson, '}'),
    ']}') FeatureCollection
  from
 -- The subquery collates the origin destinations from the waypoint strings as geometries and converts to geojson
 (select id, st_asgeojson(st_geomfromtext(concat('linestring(',replace(replace(waypoints,',',' '), '|', ','),')'),0))geojson
    from originDestination)x;
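
The same collation can be sketched outside the database. The Python below is a rough equivalent, not the code that produced the image: it parses the `lon,lat|lon,lat` waypoint strings and builds a GeoJSON FeatureCollection of straight LineStrings, with the function names invented for this illustration.

```python
import json

def waypoints_to_linestring(waypoints):
    """Turn 'lon,lat|lon,lat' into a GeoJSON LineString geometry."""
    coords = [[float(v) for v in point.split(",")] for point in waypoints.split("|")]
    return {"type": "LineString", "coordinates": coords}

def feature_collection(rows):
    """Build a FeatureCollection from an iterable of (id, waypoints) tuples."""
    features = [
        {
            "type": "Feature",
            "properties": {"id": row_id},
            "geometry": waypoints_to_linestring(waypoints),
        }
        for row_id, waypoints in rows
    ]
    return {"type": "FeatureCollection", "features": features}

fc = feature_collection([(1, "-1.325985,53.79377|-1.376608,53.92958")])
print(json.dumps(fc))
```

Building the structure as dictionaries and serialising once with `json.dumps` avoids the string concatenation that the SQL version relies on.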

Getting balanced routes

The existing batch routing system in CycleStreets expects a set of points. Routes are generated between each pair of points.

But the data provided here is a set of origin-destination pairs, one per row of the table. So the following function was written to process each of the routes in turn and place the result in the table. This is an edited highlight of the actual code:

private function originsDestinations ()
{
    // NB $od_data (the rows of the originDestination table) and $table are set up elsewhere in the full code

    // Routing parameters common to every request
    $parameters = array ();
    $parameters['key'] =        $this->settings['internalApikey'];
    $parameters['plans'] =      'balanced';

    // Iterate
    $errors     = 0;
    $updated    = 0;
    foreach ($od_data as $od_datum) {

        // Set the waypoints
        $parameters['waypoints'] = $od_datum['waypoints'];

        // Journey Planner api call url
        $journeyPlanUrl = $this->settings['apiV2Url'] . 'journey.plan?' . http_build_query ($parameters);

        // Get the route, skipping this row if the request fails
        $geojsonTxt = file_get_contents ($journeyPlanUrl);
        if (!$geojsonTxt) {$errors++; continue;}

        // Skip rows for which the API returned an error, rather than saving the error as a route
        if (substr ($geojsonTxt, 0, 9) == '{"error":') {$errors++; continue;}

        // Unpack the route
        if (!$routeData = json_decode ($geojsonTxt, $assoc = true)) {return $this->abandonHtml ("Error decoding the json.");}

        // Save to the database table
        $data =     array ('journey'    => $geojsonTxt);
        $conditions =   array ('id'     => $od_datum['id']);
        $result = $this->databaseConnection->update ($this->settings['database'], $table, $data, $conditions);
        if ($result) {$updated++;}

        // Extract the geometry from the route (the third feature in the response) and add that to another table column
        $routeGeojson   = json_encode ($routeData['features'][2]['geometry']);
        $query = "update {$table} set geometry = st_geomfromgeojson('{$routeGeojson}') where id={$od_datum['id']}";
        $result = $this->databaseConnection->execute ($query);
        if ($result) {$updated++;}
    }
}
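
Robin's earlier request was for fastest, quietest and balanced routes, whereas the function above asks only for `balanced`. As a hedged sketch of how the per-plan request URLs could be assembled, the Python below builds one `journey.plan` URL per plan without sending any request. The base URL and key are placeholders assumed for illustration, and the plan names are taken from the wording in this thread rather than checked against the API documentation.

```python
from urllib.parse import urlencode

# Assumed placeholders: neither value comes from this thread
API_V2_URL = "https://api.cyclestreets.net/v2/"
API_KEY = "YOUR_API_KEY"

# Plan names as mentioned in the thread (an assumption about the accepted values)
PLANS = ("fastest", "quietest", "balanced")

def journey_plan_urls(waypoints, plans=PLANS):
    """Return a dict of plan name -> journey.plan URL for one waypoints string."""
    urls = {}
    for plan in plans:
        query = urlencode({"key": API_KEY, "plans": plan, "waypoints": waypoints})
        urls[plan] = f"{API_V2_URL}journey.plan?{query}"
    return urls

urls = journey_plan_urls("-1.325985,53.79377|-1.376608,53.92958")
for plan, url in urls.items():
    print(plan, url)
```

`urlencode` percent-encodes the commas and the pipe in the waypoints string, much as `http_build_query` does in the PHP.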

Results

Similar unglamorous SQL was used to collate the route geometries into a single geojson file (7.3MB), with the result as follows:

leeds_balanced_routes

Questions arising

The main thing I have learned is that the batch routing system in CycleStreets treats origins and destinations differently from the ACTDEV project.

In ACTDEV (as I understand it), origins and destinations are distinct sets. No route will ever need to be planned between one origin and another origin, and no route will ever be needed between two destinations. So for instance there will never be the need to plan a route between a Primary School and a GP surgery.

In the CycleStreets batch routing system, there is essentially no distinction between origins and destinations. They are all provided as a single set of points. Routes are generated between each pair of points. So currently it would plan a route between a Primary School and a GP surgery.

This has not happened in the current example as a table of origins and destinations has been provided, and instead of the batch routing a simple iteration has been performed.

But it is desirable to use the batch routing system because that can scale up. It therefore needs augmenting with a way of distinguishing origins from destinations.
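
To make the difference concrete, here is a small Python sketch comparing the two approaches. The site and destination names are made up, and it assumes the batch system routes each unordered pair of points once, which the description above does not confirm.

```python
from itertools import combinations, product

# Hypothetical origins (development sites) and destinations
origins = ["site_A", "site_B"]
destinations = ["primary_school", "gp_surgery", "supermarket"]

# Batch-style routing: every unordered pair from a single pooled set of points
pooled = origins + destinations
all_pairs = list(combinations(pooled, 2))

# OD-style routing: only origin -> destination pairs
od_pairs = list(product(origins, destinations))

# Pairs the pooled approach would route that the OD approach never needs
unwanted = [pair for pair in all_pairs if pair not in od_pairs]

print(len(all_pairs), len(od_pairs), len(unwanted))
for pair in unwanted:
    print(pair)
```

With 2 origins and 3 destinations the pooled approach plans 10 routes where only 6 are wanted, and the surplus grows quickly as more destinations are added.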

Robinlovelace commented 3 years ago

Thanks for testing this @si-the-pie. Indeed, the OD data in which the set of destinations is different from the set of origins poses a challenge for batch routing, including Google's Distance Matrix API: https://developers.google.com/maps/documentation/distance-matrix/overview

I suggest that for the purposes of this project we forge ahead using the R interface to cyclestreets, which can calculate in the order of 10 routes per second. One question for @si-the-pie, could you reproduce the code listed here? https://github.com/cyipt/actdev/issues/10#issuecomment-744539260

You may need to install a recent version of R for that to work and set-up an environment variable with an API key, happy to help with that.

In terms of the results, can you share the resulting data file that represents all the generated routes?

In terms of next steps, my first impression is that it would be a good plan to update the batch routing API, but I think we can get sufficiently detailed route data for the 4 study areas, and even the yet-to-be-generated list of ~1000 sites, using the existing approach for the purposes of this project. This raises other questions about what the next steps are and the best use of the valuable combined ~150 days of full time work that we have on the project, but I suggest we save that until after the Christmas holidays. Have a great break Simon and catch up in the NY!

si-the-pie commented 3 years ago

@Robinlovelace: Errors appeared when cutting and pasting some of your code into the version of RStudio that we installed during the March 2020 hackathon. It probably needs updating as you suggest, and your help would be very welcome.

The testing has helped get familiar with the data, terminology and some of the issues involved so has been a useful exercise.

The data file is 7.3MB, but only 789K when compressed, and is just a geojson FeatureCollection. Presumably it should be added to https://github.com/cyipt/actdev/releases/ but how is that done?

Robinlovelace commented 3 years ago

> Presumably it should be added to https://github.com/cyipt/actdev/releases/ but how is that done?

You should be able to upload it by editing the release. Even easier is attaching the file to this issue, you can click on the "Attach files" text at the bottom of the issue text box. Will try adding a .zip file here to test...

utils::zip("/home/robin/other-repos/sfnetworks/data/roxel.rda", zipfile = "/tmp/test.zip")

test.zip

Robinlovelace commented 3 years ago

In terms of getting R working, could you try following the instructions and links here @si-the-pie ?

https://itsleeds.github.io/rrsrr/introduction.html#installing-r-and-rstudio

Sure we can get it working!

si-the-pie commented 3 years ago

Ah, yes, of course! Here it is: routes.json.gz

By the way all co-ordinates are rounded to five places of decimals, which is metre-level precision.

Thanks for the instructions for R.
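
As a rough check on the rounding mentioned above, the Python sketch below estimates how much ground one step in the fifth decimal place covers. It assumes a spherical Earth with a mean radius of 6,371 km and uses 53.8 degrees as an approximate latitude for Leeds, so the figures are indicative only.

```python
import math

EARTH_RADIUS_M = 6371000  # mean Earth radius in metres, a common approximation

def metres_per_step(decimal_places, latitude_deg):
    """Ground distance covered by the smallest coordinate step at a latitude.

    Returns (north_south, east_west) in metres on a spherical Earth.
    """
    step_rad = math.radians(10 ** -decimal_places)
    north_south = EARTH_RADIUS_M * step_rad
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west

# 53.8 is roughly the latitude of Leeds (assumed for illustration)
ns, ew = metres_per_step(5, 53.8)
print(round(ns, 2), round(ew, 2))
```

Both values come out at around a metre or less, which is consistent with describing five decimal places as metre-level precision.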

Robinlovelace commented 3 years ago

Heads-up everyone: after good discussion with @joeytalbot I'm working on getting the supermarkets nationally. We can simply use the closest to the site as a destination. Please provide an update on what you're working on Joey - I think we may be able to close this issue by the end of the day based on the updated checklist above.

mvl22 commented 3 years ago

Thanks - it’s been on my todo list for a while to get back to the POIs stuff, but I’ve no doubt you’d solve it more efficiently than I would! The new library seems great - use of extracts is far better than the Overpass API approach.

joeytalbot commented 3 years ago

I'm working on getting a list of town centre locations (using the same dataset as the journey time statistics) and routing from sites using these as destinations.

joeytalbot commented 3 years ago

> Heads-up everyone: after good discussion with @joeytalbot I'm working on getting the supermarkets nationally. We can simply use the closest to the site as a destination. Please provide an update on what you're working on Joey - I think we may be able to close this issue by the end of the day based on the updated checklist above.

Finding the number of supermarkets within a given distance would give us a better idea of whether a place has lots of shops nearby, or simply one giant supermarket complex.
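
Both options, the nearest supermarket and the number within a given distance, can be sketched in a few lines of Python. This is an illustration only: the coordinates are invented, the function names are hypothetical, and distances are straight-line (haversine) rather than along the road network.

```python
import math

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two lon/lat points."""
    radius = 6371000  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))

def nearest(site, pois):
    """Return the POI closest to the site; points are (lon, lat) tuples."""
    return min(pois, key=lambda p: haversine_m(site[0], site[1], p[0], p[1]))

def count_within(site, pois, radius_m):
    """Count the POIs within radius_m metres of the site."""
    return sum(haversine_m(site[0], site[1], p[0], p[1]) <= radius_m for p in pois)

# Invented coordinates for one site and three supermarkets
site = (-1.5500, 53.8000)
pois = [(-1.5510, 53.8005), (-1.5600, 53.8100), (-1.7000, 53.9000)]

print(nearest(site, pois))
print(count_within(site, pois, 2000))
```

The count is the more informative of the two for telling a well-served neighbourhood apart from a site next to a single large store, at the cost of having to choose a distance threshold.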

Robinlovelace commented 3 years ago

> Finding the number of supermarkets within a given distance would give us a better idea of whether a place has lots of shops nearby, or simply one giant supermarket complex.

True that - I think as a starter for 10 on the Phase 1 prototype the nearest is good. Worth opening an issue and labeling as a stretch issue like these?

https://github.com/cyipt/actdev/labels/stretch

mvl22 commented 3 years ago

Don't forget about the OSM landuse areas tag:

https://wiki.openstreetmap.org/wiki/Key:landuse

This will probably not have widespread coverage in areas where OSM data is less complete, though.

Robinlovelace commented 3 years ago

Heads-up @mvl22 and @joeytalbot I've done my part of the issue. The result can be found here: https://github.com/cyipt/actdev/releases/download/0.1.2/supermarket-points-england.geojson

Gratuitous reprex, because reproducibility is important + fun:

u = "https://github.com/cyipt/actdev/releases/download/0.1.2/supermarket-points-england.geojson"
pois = sf::read_sf(u)
mapview::mapview(pois)

Created on 2021-02-02 by the reprex package (v1.0.0)

Robinlovelace commented 3 years ago

Now running for all GB.

Robinlovelace commented 3 years ago

Update: job done on supermarkets. Keep us updated on the towns data @joeytalbot, and look forward to seeing routes to these important destinations!

u = "https://github.com/cyipt/actdev/releases/download/0.1.2/supermarket-points-gb.geojson"
pois = sf::read_sf(u)
mapview::mapview(pois)

Created on 2021-02-02 by the reprex package (v1.0.0)

joeytalbot commented 3 years ago

If that's all of the supermarkets it must be missing a lot. Look at Cumbria, big towns like Penrith have nothing at all.

joeytalbot commented 3 years ago

I've created routes and desire lines from Great Kneighton and Chapelford to their nearest town centre centroid.

eg https://github.com/cyipt/actdev/blob/main/data-small/great-kneighton/route_fast_town.geojson

The town centre dataset is also not perfect. It's from 2004, and in Leeds it includes places like Headingley, Yeadon, Guiseley, Crossgates, Harehills and Armley, but misses Horsforth and Chapeltown/Chapel Allerton, which should definitely be there.

Robinlovelace commented 3 years ago

> If that's all of the supermarkets it must be missing a lot. Look at Cumbria, big towns like Penrith have nothing at all.

Good point. Probably one for Phase II.