UrbanAnalyst / gtfsrouter

Routing and analysis engine for GTFS (General Transit Feed Specification) data
https://urbananalyst.github.io/gtfsrouter/

Unable to read gtfs nor reproduce an example #24

Closed stmarcin closed 4 years ago

stmarcin commented 4 years ago

Hello, unfortunately I am not able to reproduce an example from the vignette:

library(gtfsrouter)
berlin_gtfs_to_zip()
#> Warning in system2(zip, args, input = input, invisible = TRUE): '"zip"' not
#> found
tempfiles <- list.files (tempdir (), full.names = TRUE)
filename <- tempfiles [grep ("vbb.zip", tempfiles)]
filename
#> character(0)

Created on 2020-04-16 by the reprex package (v0.3.0)

I am also not able to read this GTFS feed:

library(gtfsrouter)
download.file("https://mkuran.pl/feed/ztm/ztm-latest.zip", "gtfs.zip")
gtfs <- extract_gtfs("gtfs.zip")
#> Warning: This feed contains no transfers.txt
#> Warning in (if (.Platform$OS.type == "unix") system else shell)
#> (paste0("(", : '(unzip -p gtfs.zip "calendar_dates.txt") > C:
#> \Users\stepniak\AppData\Local\Temp\RtmpcdC2YK\file95c64af4fb9' execution failed
#> with error code 1
#> Warning in data.table::fread(cmd = paste0("unzip -p ", filename, " \"", : File
#> 'C:\Users\stepniak\AppData\Local\Temp\RtmpcdC2YK\file95c64af4fb9' has size 0.
#> Returning a NULL data.table.
#> Warning in (if (.Platform$OS.type == "unix") system else shell)
#> (paste0("(", : '(unzip -p gtfs.zip "attributions.txt") > C:
#> \Users\stepniak\AppData\Local\Temp\RtmpcdC2YK\file95c4e9d1e4e' execution failed
#> with error code 1
#> Warning in data.table::fread(cmd = paste0("unzip -p ", filename, " \"", : File
#> 'C:\Users\stepniak\AppData\Local\Temp\RtmpcdC2YK\file95c4e9d1e4e' has size 0.
#> Returning a NULL data.table.
#> Warning in (if (.Platform$OS.type == "unix") system else shell)
#> (paste0("(", : '(unzip -p gtfs.zip "frequencies.txt") > C:
#> \Users\stepniak\AppData\Local\Temp\RtmpcdC2YK\file95c46701221' execution failed
#> with error code 1
#> Warning in data.table::fread(cmd = paste0("unzip -p ", filename, " \"", : File
#> 'C:\Users\stepniak\AppData\Local\Temp\RtmpcdC2YK\file95c46701221' has size 0.
#> Returning a NULL data.table.
#> Warning in (if (.Platform$OS.type == "unix") system else
#> shell)(paste0("(", : '(unzip -p gtfs.zip "shapes.txt") > C:
#> \Users\stepniak\AppData\Local\Temp\RtmpcdC2YK\file95c53ae785f' execution failed
#> with error code 1
#> Warning in data.table::fread(cmd = paste0("unzip -p ", filename, " \"", : File
#> 'C:\Users\stepniak\AppData\Local\Temp\RtmpcdC2YK\file95c53ae785f' has size 0.
#> Returning a NULL data.table.
#> Warning in (if (.Platform$OS.type == "unix") system else
#> shell)(paste0("(", : '(unzip -p gtfs.zip "agency.txt") > C:
#> \Users\stepniak\AppData\Local\Temp\RtmpcdC2YK\file95c20bc155a' execution failed
#> with error code 1
#> Warning in data.table::fread(cmd = paste0("unzip -p ", filename, " \"", : File
#> 'C:\Users\stepniak\AppData\Local\Temp\RtmpcdC2YK\file95c20bc155a' has size 0.
#> Returning a NULL data.table.
#> Warning in (if (.Platform$OS.type == "unix") system else
#> shell)(paste0("(", : '(unzip -p gtfs.zip "trips.txt") > C:
#> \Users\stepniak\AppData\Local\Temp\RtmpcdC2YK\file95c676e6bd8' execution failed
#> with error code 1
#> Warning in data.table::fread(cmd = paste0("unzip -p ", filename, " \"", : File
#> 'C:\Users\stepniak\AppData\Local\Temp\RtmpcdC2YK\file95c676e6bd8' has size 0.
#> Returning a NULL data.table.
#> Warning in (if (.Platform$OS.type == "unix") system else
#> shell)(paste0("(", : '(unzip -p gtfs.zip "stops.txt") > C:
#> \Users\stepniak\AppData\Local\Temp\RtmpcdC2YK\file95c6f452fd8' execution failed
#> with error code 1
#> Warning in data.table::fread(cmd = paste0("unzip -p ", filename, " \"", : File
#> 'C:\Users\stepniak\AppData\Local\Temp\RtmpcdC2YK\file95c6f452fd8' has size 0.
#> Returning a NULL data.table.
#> Warning in (if (.Platform$OS.type == "unix") system else shell)
#> (paste0("(", : '(unzip -p gtfs.zip "feed_info.txt") > C:
#> \Users\stepniak\AppData\Local\Temp\RtmpcdC2YK\file95c12434871' execution failed
#> with error code 1
#> Warning in data.table::fread(cmd = paste0("unzip -p ", filename, " \"", : File
#> 'C:\Users\stepniak\AppData\Local\Temp\RtmpcdC2YK\file95c12434871' has size 0.
#> Returning a NULL data.table.
#> Warning in (if (.Platform$OS.type == "unix") system else
#> shell)(paste0("(", : '(unzip -p gtfs.zip "routes.txt") > C:
#> \Users\stepniak\AppData\Local\Temp\RtmpcdC2YK\file95c25c321b9' execution failed
#> with error code 1
#> Warning in data.table::fread(cmd = paste0("unzip -p ", filename, " \"", : File
#> 'C:\Users\stepniak\AppData\Local\Temp\RtmpcdC2YK\file95c25c321b9' has size 0.
#> Returning a NULL data.table.
#> Warning in (if (.Platform$OS.type == "unix") system else shell)
#> (paste0("(", : '(unzip -p gtfs.zip "stop_times.txt") > C:
#> \Users\stepniak\AppData\Local\Temp\RtmpcdC2YK\file95c4eb956d6' execution failed
#> with error code 1
#> Warning in data.table::fread(cmd = paste0("unzip -p ", filename, " \"", : File
#> 'C:\Users\stepniak\AppData\Local\Temp\RtmpcdC2YK\file95c4eb956d6' has size 0.
#> Returning a NULL data.table.
#> Error in extract_gtfs("gtfs.zip"): gtfs.zip does not appear to be a GTFS file; it must minimally contain
#>   routes, stops, stop_times, trips

Created on 2020-04-16 by the reprex package (v0.3.0)

I attach my sessionInfo:

sessionInfo()
#> R version 3.6.3 (2020-02-29)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 18362)
#> 
#> Matrix products: default
#> 
#> locale:
#> [1] LC_COLLATE=English_United Kingdom.1252 
#> [2] LC_CTYPE=English_United Kingdom.1252   
#> [3] LC_MONETARY=English_United Kingdom.1252
#> [4] LC_NUMERIC=C                           
#> [5] LC_TIME=English_United Kingdom.1252    
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> loaded via a namespace (and not attached):
#>  [1] compiler_3.6.3   magrittr_1.5     tools_3.6.3      htmltools_0.4.0 
#>  [5] yaml_2.2.1       Rcpp_1.0.4.6     stringi_1.4.6    rmarkdown_2.1   
#>  [9] highr_0.8        knitr_1.28       stringr_1.4.0    xfun_0.13       
#> [13] digest_0.6.25    rlang_0.4.5.9000 evaluate_0.14

Created on 2020-04-16 by the reprex package (v0.3.0)

And this:

packageVersion("gtfsrouter")
#> [1] '0.0.1.3'

Created on 2020-04-16 by the reprex package (v0.3.0)

It looks like I am missing something in my R settings, but I have no idea what it could be. Do you have any hints about what I should look for? Thanks

mpadge commented 4 years ago

The first error:

Warning in system2(zip, args, input = input, invisible = TRUE): '"zip"' not found

indicates just what it says at the end: the external zip program cannot be found on your system. The unzip command is also failing for some OS-specific reason. Try running unzip directly to see whether you can unzip archives from within R or not.

The remainder are repeats of this kind of pattern:

#> (paste0("(", : '(unzip -p gtfs.zip "calendar_dates.txt") > C:
#> \Users\stepniak\AppData\Local\Temp\RtmpcdC2YK\file95c64af4fb9' execution failed
#> with error code 1
#> Warning in data.table::fread(cmd = paste0("unzip -p ", filename, " \"", : File
#> 'C:\Users\stepniak\AppData\Local\Temp\RtmpcdC2YK\file95c64af4fb9' has size 0.

The first line is the actual code from gtfsrouter that is called here, which is an attempt to extract that specific file from the zip archive. I don't know what "error code 1" means on Windows, but the next line is an attempt to read that file, which has obviously not been successfully extracted. So all of these problems appear to arise because unzip is not working in R. As said above, try running that command manually until you figure out how to get it working, and then you should at least be able to start.
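
A quick way to follow that advice is to check, from within R, whether the external programs are visible at all. The sketch below uses only base R (Sys.which()); the variable names are illustrative and not part of gtfsrouter.

```r
# Locate the external programs on the PATH.
# Sys.which() returns an empty string for anything it cannot find.
tools <- Sys.which (c ("zip", "unzip"))
print (tools)

missing_tools <- names (tools) [!nzchar (tools)]
if (length (missing_tools) > 0) {
    message ("Not found on the PATH: ", paste (missing_tools, collapse = ", "))
} else {
    message ("Both zip and unzip were found.")
}
```

If either entry is empty, R cannot call that program through system() or shell(). On Windows these programs are often provided by Rtools rather than by R itself, so a missing entry may simply mean that Rtools is not installed or not on the PATH.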

My output from your code above:

library (gtfsrouter)
if (!file.exists ("warsaw-gtfs.zip"))
    download.file("https://mkuran.pl/feed/ztm/ztm-latest.zip", "warsaw-gtfs.zip")
gtfs <- extract_gtfs("warsaw-gtfs.zip")
#> Warning: This feed contains no transfers.txt
#> Warning in data.table::fread(cmd = paste0("unzip -p ", filename, " \"", :
#> Detected 7 column names but the data has 8 columns (i.e. invalid file). Added 1
#> extra default column name for the first column which is guessed to be row names
#> or an index. Use setnames() afterwards if this guess is not correct, or fix the
#> file write command that created the file to create a valid file.
#> Warning in data.table::fread(cmd = paste0("unzip -p ", filename, " \"", :
#> Discarded single-line footer: <<"Bus shapes (under ODbL licnese): ©
#> OpenStreetMap contributors",pl,0,0,1,1,"https://www.openstreetmap.org/
#> copyright">>

Created on 2020-04-16 by the reprex package (v0.3.0)

So it will read, but you won't be able to do any routing with it until a "transfers.txt" table has been constructed and inserted. (You can either insert this in the directory itself, re-zip the whole thing, and extract_gtfs again, or you can just add a data.table version of the transfers data to the result in R, and go from there.)

mpadge commented 4 years ago

A rough go at constructing the lists of stops needed for a "transfers.txt" table. For each stop, this code gets all neighbouring stops that do not lie on the same services.

stops <- gtfs$stops
# Get matrix of dists between all stops
d <- geodist::geodist (stops [, c ("stop_lon", "stop_lat")], measure = "haversine")
# join service numbers on to stop table, so we can select only those stops that
# are part of different services
stop_service <- gtfs$stop_times [, c ("trip_id", "stop_id")]
stop_service <- stop_service [!duplicated (stop_service), ]
stop_service$services <- gtfs$trips$service_id [match (stop_service$trip_id, gtfs$trips$trip_id)]
stop_service$trip_id <- NULL
stop_service <- stop_service [which (!duplicated (stop_service)), ]

transfers <- pbapply::pblapply (seq (nrow (stops)), function (i) {
    stopi <- gtfs$stops$stop_id [i]
    # as first go, get all neighbouring stops within 200m
    nbs <- gtfs$stops$stop_id [which (d [i, ] < 200)]
    # all stops that share at least one service with stop i
    services <- unique (stop_service$services [stop_service$stop_id == stopi])
    service_stops <- unique (stop_service$stop_id [stop_service$services %in% services])
    # keep only those neighbours that are not on any of the same services
    nbs [which (!nbs %in% service_stops)]
})

That is pretty inefficient, and takes 15-20 minutes or so, but returns a list of neighbouring stops that are not part of the same service, and which can therefore be used as transfer destinations. There must be a better way to do it, but it works for the moment. The next step would be to estimate transfer times, which could be done by downloading the street network, using dodgr to weight it for pedestrian travel (weight_streetnet(wt_profile = "foot")), and then using dodgr_times() from all stops to all nbs returned by the above code.
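
The list returned above still has to be flattened into a table before it can serve as transfers.txt. A minimal sketch in base R, using toy data in place of the real feed: the column names follow the GTFS specification for transfers.txt, while the stop IDs, the choice of transfer_type = 2 and the placeholder min_transfer_time are assumptions for illustration only.

```r
# Toy stand-ins for gtfs$stops$stop_id and the 'transfers' list from above
stop_ids <- c ("A", "B", "C")
transfers <- list ("B", character (0), c ("A", "B"))

# One row per (from, to) pair; stops without neighbours contribute no rows
transfers_tab <- data.frame (
    from_stop_id = rep (stop_ids, times = lengths (transfers)),
    to_stop_id = unlist (transfers),
    transfer_type = 2L, # GTFS: transfer requires a minimum amount of time
    min_transfer_time = NA_integer_, # seconds; to be filled in later
    stringsAsFactors = FALSE
)
print (transfers_tab)

# With a real feed, the table could then be attached to the feed,
# e.g. gtfs$transfers <- data.table::as.data.table (transfers_tab)
```

The min_transfer_time column is left empty here because the walking times still have to be estimated, for example with dodgr as described above.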


UPDATE: Slight change to code makes it much more efficient, reducing time to a few seconds

stmarcin commented 4 years ago

Thanks for the response.

unzip seems to work properly in R.

if (!file.exists ("warsaw-gtfs.zip"))
      download.file("https://mkuran.pl/feed/ztm/ztm-latest.zip", "warsaw-gtfs.zip")
unzip("warsaw-gtfs.zip")

Created on 2020-04-16 by the reprex package (v0.3.0)

I went through the code and the following is working as well

library (gtfsrouter)
if (!file.exists ("warsaw-gtfs.zip"))
      download.file("https://mkuran.pl/feed/ztm/ztm-latest.zip", "warsaw-gtfs.zip")
flist <- utils::unzip ("warsaw-gtfs.zip", list = TRUE)

Created on 2020-04-16 by the reprex package (v0.3.0)

The problem is here (I guess, with this: "unzip -p "):

if (!file.exists ("warsaw-gtfs.zip"))
      download.file("https://mkuran.pl/feed/ztm/ztm-latest.zip", "warsaw-gtfs.zip")
flist <- utils::unzip ("warsaw-gtfs.zip", list = TRUE)
f <- flist$Name[1]
fout <- data.table::fread (cmd = paste0 ("unzip -p ", "warsaw-gtfs.zip",
                                         " \"", f, "\""),
                           integer64 = "character",
                           showProgress = FALSE)
#> Warning in (if (.Platform$OS.type == "unix") system else shell)
#> (paste0("(", : '(unzip -p warsaw-gtfs.zip "calendar_dates.txt") > C:
#> \Users\stepniak\AppData\Local\Temp\RtmpYziY2B\file1c50bb6157e' execution failed
#> with error code 1
#> Warning in data.table::fread(cmd = paste0("unzip -p ", "warsaw-gtfs.zip", : File
#> 'C:\Users\stepniak\AppData\Local\Temp\RtmpYziY2B\file1c50bb6157e' has size 0.
#> Returning a NULL data.table.

Created on 2020-04-16 by the reprex package (v0.3.0)

It seems that the unzip command on Windows is a bit problematic. I've spent a couple of hours trying to solve it but unfortunately haven't found a solution. To what extent does the result of extract_gtfs() differ from that of read_gtfs() from gtfs2gps, which returns a list of data.tables, or read_gtfs() from tidytransit?

mpadge commented 4 years ago

Ah, I see. That error has come up before, but I couldn't work it out (having no access to any Windows machines). But now I get it. The above commit should fix things for you, so could you please install the current version from here:

remotes::install_github("atfutures/gtfs-router")

and let me know whether that works? That would be really helpful!
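
The commit itself is not reproduced in this thread. As an illustration of how the external "unzip -p" call can be avoided, the sketch below extracts a single file with R's built-in utils::unzip(), which was shown to work above, and then reads it from disk. This is not necessarily how gtfsrouter handles it; read_zip_member is a hypothetical helper name, and read.csv could be swapped for data.table::fread.

```r
# Hypothetical helper: read one file from a zip archive without calling
# an external unzip program (utils::unzip uses R's internal method by default).
read_zip_member <- function (zipfile, member) {
    stopifnot (file.exists (zipfile))
    exdir <- tempfile ("gtfs_")
    dir.create (exdir)
    # remove the temporary directory once the data have been read
    on.exit (unlink (exdir, recursive = TRUE))
    utils::unzip (zipfile, files = member, exdir = exdir)
    utils::read.csv (file.path (exdir, member), stringsAsFactors = FALSE)
}

# Usage, assuming the archive has been downloaded as in the examples above:
# stops <- read_zip_member ("warsaw-gtfs.zip", "stops.txt")
```

Because nothing is passed to the shell, this approach does not depend on an unzip program being installed or on how Windows handles the output redirection.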


And in regard to your other questions: This package does different things from those others, primarily through pre-processing the data ready to be used for routing. That happens somewhat behind the scenes, so the results of gtfsrouter::extract_gtfs() will look mostly similar to tidytransit::read_gtfs(), but the feeds in this package are able to be fed to the gtfs_timetable() function to then use for routing or isochrone calculations, whereas feeds from the other packages are just the raw data in unprocessed form.

stmarcin commented 4 years ago

It works! Many thanks. And thanks for the code for transfers.txt. Tomorrow I am going to work with that and will let you know how it goes. One more question: I browsed through the code but couldn't find it: does gtfsrouter work with frequencies.txt?

mpadge commented 4 years ago

One more question: I browsed through the code but couldn't find it: does gtfsrouter work with frequencies.txt?

Not currently, because there are actually very few systems that use that. Madrid is one of the only ones I've encountered. It's pretty straightforward to incorporate, but there seems to be little demand to date - #13 is the issue, and has received no comments in the past year. That said, let me know if you need it, and I should hopefully be able to find time to incorporate it.

mpadge commented 4 years ago

@stmarcin you happy to close this issue now, and continue any necessary discussions in other issues (#14 , #13)?

stmarcin commented 4 years ago

Solved. Thanks @mpadge!