pct_area_desire_lines function in or out? #7

layik commented 5 years ago
#' Desire lines
#' @export
pct_area_desire_lines = function(area = "sheffield", top_n = 100) {
  # validate the area argument
  if(is.null(area))
    stop("area is required.")
  if(length(area) != 1L)
    stop("'area' must be of length 1")
  if(is.na(area) || (area == "") || !is.character(area))
    stop("invalid area name")
  census_file = file.path(tempdir(), "wu03ew_v2.csv")
  if(!file.exists(census_file)) {
    # download and unzip the census OD data (URL and zip name not shown here)
    download.file("",
                  file.path(tempdir(), ""))
    unzip(file.path(tempdir(), ""), exdir = tempdir())
  }
  od_all = readr::read_csv(census_file)
  # zones whose MSOA name matches the requested area
  zones = ukboundaries::msoa2011_vsimple[
    grepl(area, ukboundaries::msoa2011_vsimple$msoa11nm,
          ignore.case = TRUE), ]

  # keep flows that start and end inside the area, dropping intra-zone flows
  od_area = od_all[od_all$`Area of residence` %in% zones$msoa11cd &
                     od_all$`Area of workplace` %in% zones$msoa11cd, ]
  od_area = od_area[od_area$`Area of residence` !=
                      od_area$`Area of workplace`, ]
  # largest flows first, then take the top_n
  od_area = od_area[order(od_area$`All categories: Method of travel to work`,
                decreasing = TRUE),][1:top_n,]
  area_desire_lines = stplanr::od2line(
    flow = od_area[,c(2,1)], zones[,2])
  area_desire_lines
}



It is in base R but requires ukboundaries and stplanr.

Thoughts, @Robinlovelace? I can send a PR too.

Robinlovelace commented 5 years ago

I think we should write a function to 🚿 the names.


names_new[3] = "all"
names(d) = names_new
#  [1] "area_of_residence"                 "area_of_workplace"
#  [3] "all"                               "work_mainly_at_or_from_home"
#  [5] "underground_metro_light_rail_tram" "train"
#  [7] "bus_minibus_or_coach"              "taxi"
#  [9] "motorcycle_scooter_or_moped"       "driving_a_car_or_van"
# [11] "passenger_in_a_car_or_van"         "bicycle"
# [13] "on_foot"                           "other_method_of_travel_to_work"

layik commented 5 years ago

I know, I was going... please use Robin's code, it's already there, but... what can I say :)

Robinlovelace commented 5 years ago

I like the code. I just dislike the column names that were provided by DfT. As with stats19, I suggest we impose our own 'good' column names on them at the outset. The earlier we clean the names (e.g. with mode_names_clean()), the better.
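For illustration, a minimal base R sketch of what such a name-cleaning step could look like. `clean_mode_names()` is a hypothetical helper, not the package's `mode_names_clean()`, and the last two raw column names in the example are assumed rather than taken from this thread.

```r
# Hypothetical helper: lower-case the names and turn punctuation/spaces into "_".
clean_mode_names = function(x) {
  x = tolower(x)
  x = gsub("[^a-z0-9]+", "_", x)  # each run of non-alphanumerics becomes one "_"
  gsub("^_+|_+$", "", x)          # trim any leading or trailing "_"
}

raw = c("Area of residence",
        "Area of workplace",
        "All categories: Method of travel to work",
        "Underground, metro, light rail, tram",  # assumed raw name
        "Bus, minibus or coach")                 # assumed raw name
cleaned = clean_mode_names(raw)
print(cleaned)
```

A generic cleaner like this would leave the third column as a long name, which is presumably why it gets shortened to "all" by hand afterwards.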

Robinlovelace commented 5 years ago

Names in the pct:

Robinlovelace commented 5 years ago

We can just use that...

Robinlovelace commented 5 years ago

And, to be fair, just hard-coding them would be fine.
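A rough sketch of the hard-coded option, using the cleaned names printed earlier in this thread. `set_mode_names()` and the toy data frame are made up for the example; the only real content is the vector of names.

```r
# Cleaned column names, as listed earlier in this thread.
mode_names = c(
  "area_of_residence", "area_of_workplace", "all",
  "work_mainly_at_or_from_home", "underground_metro_light_rail_tram",
  "train", "bus_minibus_or_coach", "taxi",
  "motorcycle_scooter_or_moped", "driving_a_car_or_van",
  "passenger_in_a_car_or_van", "bicycle", "on_foot",
  "other_method_of_travel_to_work"
)

# Hypothetical helper: rename the columns, failing loudly if the count differs.
set_mode_names = function(d, new_names = mode_names) {
  stopifnot(ncol(d) == length(new_names))
  names(d) = new_names
  d
}

# Toy stand-in for the OD table: two rows, one column per name.
d = as.data.frame(matrix(0L, nrow = 2, ncol = length(mode_names)))
d = set_mode_names(d)
print(names(d)[1:3])
```

The column-count check is there because hard-coded names silently mislabel the data if the upstream file ever gains or loses a column.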

layik commented 5 years ago

Right. Done deal, can we do it without tidyverse? Let me send the PR as I am not sure why it cannot find the new awesome get_centroid function.

layik commented 5 years ago

"faithful to the data" :)

layik commented 5 years ago


get_centroids_ew = function() {
  u = ""
  pwc = readr::read_csv(u)
  sf::st_as_sf(x = pwc[c("X", "Y", "msoa11cd", "msoa11nm")], coords = c("X", "Y"), crs = 4326)
}
zones_all = get_centroids_ew()
#> Parsed with column specification:
#> cols(
#>   X = col_double(),
#>   Y = col_double(),
#>   objectid = col_double(),
#>   msoa11cd = col_character(),
#>   msoa11nm = col_character()
#> )
#> 2 MB

Robinlovelace commented 5 years ago

I think we just save a subset for Leeds. Keep the pkg data minimal - hence using 10 not 100 desire lines for Leeds.

Robinlovelace commented 5 years ago

Plus we can always create a supporting pctdata pkg.

Robinlovelace commented 5 years ago

Yes. readr may be the only tidyverse pkg we use.

Robinlovelace commented 5 years ago

Job. Done.