UrbanAnalyst / gtfsrouter

Routing and analysis engine for GTFS (General Transit Feed Specification) data
https://urbananalyst.github.io/gtfsrouter/

Reduce ratios of user / elapsed times in examples #109

Closed mpadge closed 11 months ago

mpadge commented 1 year ago

CRAN rejects current submission attempts because most examples generate "CPU time > 2.5 times elapsed times." These are the "user" and "elapsed" times reported by proc.time() / system.time(). Because "user" time sums CPU time over all threads, it can exceed "elapsed" time whenever code runs in parallel. The ratios here seem to be anomalously high because of the data.table filtering calls in timetable construction.
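For anyone wanting to check a single example locally, the ratio can be measured with base R alone. The following is only a minimal sketch: `user_elapsed_ratio` is a hypothetical helper name, not part of gtfsrouter or of the CRAN checks, and it compares "user" to "elapsed" time in the same way as the rest of this issue.

```r
# Hypothetical helper (not part of gtfsrouter): ratio of "user" CPU time to
# "elapsed" wall-clock time for a single expression.
user_elapsed_ratio <- function (expr) {
    tm <- system.time (expr) # 'expr' is evaluated lazily inside system.time
    user <- tm [["user.self"]]
    elapsed <- tm [["elapsed"]]
    if (elapsed == 0) {
        return (NA_real_) # too fast to time reliably
    }
    user / elapsed
}

# Sleeping uses almost no CPU, so the ratio should be well below 1.
# Multi-threaded code is what pushes it above 1.
user_elapsed_ratio (Sys.sleep (0.2))
```

Values well above 1 indicate that the work was spread over several threads.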

mpadge commented 1 year ago

The above commits implement data.table's "fast sub-selection" syntax, and only make things much worse:

library (gtfsrouter)
packageVersion ("gtfsrouter")
#> [1] '0.1.0.8'
day <- route_pattern <- NULL
quiet <- FALSE
berlin_gtfs_to_zip ()
#> [1] "/tmp/Rtmp0oVroZ/vbb.zip"
f <- file.path (tempdir (), "vbb.zip")
gtfs <- extract_gtfs (f)
#> ✔ Unzipped GTFS archive
#> ✔ Extracted GTFS feed
#> ✔ Converted stop times to seconds
#> ✔ Converted transfer times to seconds
test <- function (gtfs) {
    from <- "Innsbrucker Platz"
    to <- "Alexanderplatz"
    start_time <- 12 * 3600 + 120

    route <- gtfs_route (gtfs, from = from, to = to, start_time = start_time)
}
pt0 <- proc.time ()
test (gtfs)
#> Day not specified; extracting timetable for friday
pt1 <- proc.time ()
timing <- pt1 - pt0
print (timing)
#>    user  system elapsed 
#>   0.323   0.005   0.065
cli::cli_h2 (paste0 ("ratio: ", round (timing [1] / timing [3], digits = 1)))
#> 
#> ── ratio: 5 ──

Created on 2023-06-30 with reprex v2.0.2

So data.table is definitely the issue here, and all examples are simply going to have to be switched off to get this update on CRAN.
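For reference, data.table's "fast sub-selection" syntax mentioned above replaces an ordinary logical filter with a keyed-style subset via `on =`. The sketch below uses a small made-up table. The column names and values are purely illustrative and are not taken from the gtfsrouter timetable code.

```r
library (data.table)

# Made-up, stop_times-like table; names and values are illustrative only
stop_times <- data.table (
    trip_id = rep (1:3, each = 2),
    stop_id = c ("a", "b", "a", "c", "b", "c"),
    departure_time = c (100L, 200L, 150L, 300L, 250L, 400L)
)

# Ordinary logical filter
scan_res <- stop_times [stop_id == "b"]

# "Fast" subset: look up the value "b" in column 'stop_id' via 'on ='
fast_res <- stop_times [.("b"), on = "stop_id"]

# Both forms select the same rows
identical (scan_res$trip_id, fast_res$trip_id)
```

Both forms still run inside data.table, so neither changes how many threads are used.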

mpadge commented 11 months ago

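Solution: call data.table::setDTthreads(1L) at the start of all examples, so that data.table runs single-threaded and "user" time no longer exceeds "elapsed" time by a large factor.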
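A minimal sketch of what this might look like inside an example follows. The small table and filter are made up and stand in for the actual gtfsrouter calls. Restoring the previous thread count at the end is an optional extra, not something this issue requires.

```r
library (data.table)

# Restrict data.table to a single thread; setDTthreads() returns the
# previous setting, which is kept here so it can be restored afterwards.
old_threads <- setDTthreads (1L)
getDTthreads () # now 1

# Stand-in for the real example code (made-up data, illustrative only)
dt <- data.table (x = 1:10, y = rnorm (10))
res <- dt [x > 5]

# Optional: restore the previous thread setting
setDTthreads (old_threads)
```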