ipeaGIT / gtfs2gps

Convert GTFS data into a data.table with GPS-like records in R
https://ipeagit.github.io/gtfs2gps/

`departure_time` and `cum_time` is <NA> for all rows #206

Closed abrac closed 3 years ago

abrac commented 3 years ago

Hi there. I tried using the `gtfs2gps()` function on my dataset, but I got NA values for all the departure times. When I ran the vignette, I got the same issue with the sao dataset, but not with the poa dataset. Here is the output of the vignette when I run it on my machine:

# Loading data

After loading the package, GTFS data can be read into R by using `read_gtfs()`.

```R
library("gtfs2gps")
sao <- read_gtfs(system.file("extdata/saopaulo.zip", package ="gtfs2gps"))
```

```
## Unzipped the following files to /tmp/RtmpRi9sI1/gtfsio:
##   * agency.txt
##   * calendar.txt
##   * frequencies.txt
##   * routes.txt
##   * shapes.txt
##   * stop_times.txt
##   * stops.txt
##   * trips.txt
## Reading agency.txt
## Reading calendar.txt
## Reading frequencies.txt
## Reading routes.txt
## Reading shapes.txt
## Reading stop_times.txt
## Reading stops.txt
## Reading trips.txt
```

```R
names(sao)
```

```
## [1] "agency"      "calendar"    "frequencies" "routes"      "shapes"
## [6] "stop_times"  "stops"       "trips"
```

```R
head(sao$trips)
```

```
##    route_id service_id   trip_id shape_id
## 1:  121G-10        USD 121G-10-0    52421
## 2:  148L-10        USD 148L-10-0    52857
## 3:  148L-10        USD 148L-10-1    52858
## 4:  1720-21        USD 1720-21-0    52936
## 5:  1721-10        USD 1721-10-0    52941
## 6:  1726-10        USD 1726-10-0    52429
```

# Subsetting GTFS Data

In the code below we filter only shape ids between 53000 and 53020.

```R
library(magrittr)
object.size(sao) %>% format(units = "Kb")
```

```
## [1] "2148.6 Kb"
```

```R
sao_small <- gtfs2gps::filter_by_shape_id(sao, c(51338, 51956, 51657))
object.size(sao_small) %>% format(units = "Kb")
```

```
## [1] "99.8 Kb"
```

After subsetting the data, it is also possible to save it as a new GTFS file using `write_gtfs()`, as shown below.

```R
write_gtfs(sao_small, "sao_small.zip")
```

# Converting to GPS-like format

To convert GTFS to GPS-like format, use `gtfs2gps()`. See the example below.

```R
sao_gps <- gtfs2gps("sao_small.zip", spatial_resolution = 50)
head(sao_gps)
```

```
##    id shape_id   trip_id trip_number route_type shape_pt_lon shape_pt_lat
## 1:  1    51338 5010-10-0           1          3    -46.63120    -23.66268
## 2:  2    51338 5010-10-0           1          3    -46.63117    -23.66273
## 3:  3    51338 5010-10-0           1          3    -46.63108    -23.66288
## 4:  4    51338 5010-10-0           1          3    -46.63095    -23.66316
## 5:  5    51338 5010-10-0           1          3    -46.63082    -23.66345
## 6:  6    51338 5010-10-0           1          3    -46.63111    -23.66364
##    departure_time stop_id stop_sequence          dist        cumdist cumtime
## 1:           <NA>                    NA  0.000000 [m]   0.000000 [m]  NA [s]
## 2:           <NA> 3703053             1  7.230445 [m]   7.230445 [m]  NA [s]
## 3:           <NA>                    NA 18.369274 [m]  25.599720 [m]  NA [s]
## 4:           <NA>                    NA 34.505965 [m]  60.105685 [m]  NA [s]
## 5:           <NA>                    NA 34.505965 [m]  94.611650 [m]  NA [s]
## 6:           <NA>                    NA 36.478776 [m] 131.090426 [m]  NA [s]
##             speed
## 1:      NA [km/h]
## 2: 26.5931 [km/h]
## 3: 26.5931 [km/h]
## 4: 26.5931 [km/h]
## 5: 26.5931 [km/h]
## 6: 26.5931 [km/h]
```

The function `gtfs2gps()` automatically recognizes whether the GTFS data brings detailed `stop_times.txt` information or whether it is a `frequency.txt` GTFS file.

A sample data of a GTFS with detailed `stop_times.txt` can be found below:

```R
poa <- system.file("extdata/poa.zip", package ="gtfs2gps")
poa_gps <- gtfs2gps(poa, spatial_resolution = 50)
head(poa_gps)
```

```
##    id shape_id     trip_id trip_number route_type shape_pt_lon shape_pt_lat
## 1:  1    176-1 176-1@1#602           1          3    -51.22170    -30.14870
## 2:  2    176-1 176-1@1#602           1          3    -51.22143    -30.14875
## 3:  3    176-1 176-1@1#602           1          3    -51.22114    -30.14878
## 4:  4    176-1 176-1@1#602           1          3    -51.22085    -30.14881
## 5:  5    176-1 176-1@1#602           1          3    -51.22038    -30.14887
## 6:  6    176-1 176-1@1#602           1          3    -51.21991    -30.14894
##    departure_time stop_id stop_sequence         dist       cumdist
## 1:       06:02:00      59             1  0.00000 [m]   0.00000 [m]
## 2:       06:02:04                    NA 26.67895 [m]  26.67895 [m]
## 3:       06:02:07                    NA 28.15588 [m]  54.83483 [m]
## 4:       06:02:11                    NA 28.15588 [m]  82.99072 [m]
## 5:       06:02:17                    NA 45.81472 [m] 128.80544 [m]
## 6:       06:02:23                    NA 45.81472 [m] 174.62016 [m]
##          cumtime           speed
## 1:  0.000000 [s] 27.06675 [km/h]
## 2:  3.548421 [s] 27.06675 [km/h]
## 3:  7.293280 [s] 27.06675 [km/h]
## 4: 11.038140 [s] 27.06675 [km/h]
## 5: 17.131705 [s] 27.06675 [km/h]
## 6: 23.225270 [s] 27.06675 [km/h]
```

rafapereirabr commented 3 years ago

Hi @abrac. Thank you for opening this issue. What version of the gtfs2gps package are you using? Also, could you please share your sessionInfo() in a comment below?

abrac commented 3 years ago

Hi @rafapereirabr! I'm so sorry for the delay! The email notification went to my junk folder.

I am using the latest release, v1.5.0. I installed it using the install.packages("gtfs2gps") command.

Here is the sessionInfo():

```R
sessionInfo()
```

```
## R version 4.0.4 (2021-02-15)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 21.04
##
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/atlas/libblas.so.3.10.3
## LAPACK: /usr/lib/x86_64-linux-gnu/atlas/liblapack.so.3.10.3
##
## locale:
##  [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
##  [3] LC_TIME=en_ZA.UTF-8        LC_COLLATE=en_GB.UTF-8
##  [5] LC_MONETARY=en_ZA.UTF-8    LC_MESSAGES=en_GB.UTF-8
##  [7] LC_PAPER=en_ZA.UTF-8       LC_NAME=C
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_ZA.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base
##
## other attached packages:
## [1] units_0.7-2       sfheaders_0.4.0   Rcpp_1.0.7        sf_1.0-2
## [5] data.table_1.14.0 purrr_0.3.4       magrittr_2.0.1    gtfs2gps_1.5-0
##
## loaded via a namespace (and not attached):
##  [1] xfun_0.25          listenv_0.8.0      lattice_0.20-41    vctrs_0.3.8
##  [5] htmltools_0.5.2    s2_1.0.6           yaml_2.2.1         utf8_1.2.2
##  [9] rlang_0.4.11       e1071_1.7-8        pillar_1.6.2       DBI_1.1.1
## [13] sp_1.4-5           wk_0.5.0           lifecycle_1.0.0    stringr_1.4.0
## [17] gtfsio_0.1.2       progressr_0.8.0    zip_2.2.0          future_1.22.1
## [21] codetools_0.2-18   evaluate_0.14      knitr_1.33         tzdb_0.1.2
## [25] fastmap_1.1.0      parallel_4.0.4     class_7.3-18       fansi_0.5.0
## [29] furrr_0.2.3        KernSmooth_2.23-18 readr_2.0.1        classInt_0.4-3
## [33] lwgeom_0.2-7       parallelly_1.27.0  hms_1.1.0          digest_0.6.27
## [37] stringi_1.7.4      grid_4.0.4         rgdal_1.5-23       tools_4.0.4
## [41] proxy_0.4-26       tibble_3.1.4       crayon_1.4.1       pkgconfig_2.0.3
## [45] ellipsis_0.3.2     rmarkdown_2.10     R6_2.5.1           globals_0.14.0
## [49] compiler_4.0.4
```

abrac commented 3 years ago

Hi again. I tried downgrading to version v1.4.0. You won't believe it, but now the issue is fixed in the sao example, but the issue pops up in the poa example. So, it's the opposite of what happened in my original post. 🤔

Here are the results of the vignette when I run it with gtfs2gps v1.4.0:

# Loading data

After loading the package, GTFS data can be read into R by using `read_gtfs()`.

``` r
library("gtfs2gps")
sao <- read_gtfs(system.file("extdata/saopaulo.zip", package ="gtfs2gps"))
```

```
## Reading 'agency.txt'
## Reading 'routes.txt'
## Reading 'stops.txt'
## Reading 'stop_times.txt'
## Reading 'shapes.txt'
## Reading 'trips.txt'
## Reading 'calendar.txt'
## Reading 'frequencies.txt'
```

``` r
names(sao)
```

```
## [1] "agency"      "routes"      "stops"       "stop_times"  "shapes"
## [6] "trips"       "calendar"    "frequencies"
```

``` r
head(sao$trips)
```

```
##    route_id service_id   trip_id   trip_headsign direction_id shape_id
## 1:  121G-10        USD 121G-10-0  Metrô Tucuruvi            0    52421
## 2:  148L-10        USD 148L-10-0            Lapa            0    52857
## 3:  148L-10        USD 148L-10-1 Cohab Antártica            1    52858
## 4:  1720-21        USD 1720-21-0  Metrô Tucuruvi            0    52936
## 5:  1721-10        USD 1721-10-0 Metrô Carandiru            0    52941
## 6:  1726-10        USD 1726-10-0   Metrô Santana            0    52429
```

# Subsetting GTFS Data

In the code below we filter only shape ids between 53000 and 53020.

``` r
library(magrittr)
object.size(sao) %>% format(units = "Kb")
```

```
## [1] "2448.6 Kb"
```

``` r
sao_small <- gtfs2gps::filter_by_shape_id(sao, c(51338, 51956, 51657))
object.size(sao_small) %>% format(units = "Kb")
```

```
## [1] "110.7 Kb"
```

After subsetting the data, it is also possible to save it as a new GTFS file using `write_gtfs()`, as shown below.

``` r
write_gtfs(sao_small, "sao_small.zip")
```

# Converting to GPS-like format

To convert GTFS to GPS-like format, use `gtfs2gps()`. See the example below.

``` r
sao_gps <- gtfs2gps("sao_small.zip", spatial_resolution = 50)
head(sao_gps)
```

```
##    id shape_id   trip_id trip_number route_type shape_pt_lon shape_pt_lat
## 1:  1    51338 5010-10-0           1          3    -46.63120    -23.66268
## 2:  2    51338 5010-10-0           1          3    -46.63117    -23.66273
## 3:  3    51338 5010-10-0           1          3    -46.63108    -23.66288
## 4:  4    51338 5010-10-0           1          3    -46.63095    -23.66316
## 5:  5    51338 5010-10-0           1          3    -46.63082    -23.66345
## 6:  6    51338 5010-10-0           1          3    -46.63111    -23.66364
##    departure_time stop_id stop_sequence          dist        cumdist
## 1:       04:00:00 3703053             1  0.000000 [m]   0.000000 [m]
## 2:       04:00:01                    NA  7.230445 [m]   7.230445 [m]
## 3:       04:00:04                    NA 18.369274 [m]  25.599720 [m]
## 4:       04:00:09                    NA 34.505965 [m]  60.105685 [m]
## 5:       04:00:13                    NA 34.505965 [m]  94.611650 [m]
## 6:       04:00:19                    NA 36.478776 [m] 131.090426 [m]
##          cumtime           speed
## 1:  0.000000 [s] 25.44852 [km/h]
## 2:  1.022834 [s] 25.44852 [km/h]
## 3:  3.621389 [s] 25.44852 [km/h]
## 4:  8.502673 [s] 25.44852 [km/h]
## 5: 13.383957 [s] 25.44852 [km/h]
## 6: 18.544319 [s] 25.44852 [km/h]
```

The function `gtfs2gps()` automatically recognizes whether the GTFS data brings detailed `stop_times.txt` information or whether it is a `frequency.txt` GTFS file.

A sample data of a GTFS with detailed `stop_times.txt` can be found below:

``` r
poa <- system.file("extdata/poa.zip", package ="gtfs2gps")
poa_gps <- gtfs2gps(poa, spatial_resolution = 50)
head(poa_gps)
```

```
##    id shape_id       trip_id trip_number route_type shape_pt_lon shape_pt_lat
## 1:  1   A141-1 A141-1@5#2340           1          3    -51.14692    -30.14979
## 2:  2   A141-1 A141-1@5#2340           1          3    -51.14651    -30.14997
## 3:  3   A141-1 A141-1@5#2340           1          3    -51.14610    -30.15014
## 4:  4   A141-1 A141-1@5#2340           1          3    -51.14570    -30.15031
## 5:  5   A141-1 A141-1@5#2340           1          3    -51.14532    -30.15048
## 6:  6   A141-1 A141-1@5#2340           1          3    -51.14493    -30.15064
##    departure_time stop_id stop_sequence         dist       cumdist cumtime
## 1:           <NA>                    NA  0.00000 [m]   0.00000 [m]  NA [s]
## 2:           <NA>                    NA 43.61804 [m]  43.61804 [m]  NA [s]
## 3:           <NA>                    NA 43.61804 [m]  87.23608 [m]  NA [s]
## 4:           <NA>     434             1 43.32548 [m] 130.56155 [m]  NA [s]
## 5:           <NA>                    NA 41.05718 [m] 171.61874 [m]  NA [s]
## 6:           <NA>                    NA 41.05718 [m] 212.67592 [m]  NA [s]
##              speed
## 1:       NA [km/h]
## 2:       NA [km/h]
## 3:       NA [km/h]
## 4: 10.07926 [km/h]
## 5: 10.07926 [km/h]
## 6: 10.07926 [km/h]
```

``` r
sessionInfo()
```

```
## R version 4.0.4 (2021-02-15)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 21.04
##
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/atlas/libblas.so.3.10.3
## LAPACK: /usr/lib/x86_64-linux-gnu/atlas/liblapack.so.3.10.3
##
## locale:
##  [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
##  [3] LC_TIME=en_ZA.UTF-8        LC_COLLATE=en_GB.UTF-8
##  [5] LC_MONETARY=en_ZA.UTF-8    LC_MESSAGES=en_GB.UTF-8
##  [7] LC_PAPER=en_ZA.UTF-8       LC_NAME=C
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_ZA.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base
##
## other attached packages:
## [1] units_0.7-2       sfheaders_0.4.0   Rcpp_1.0.7        sf_1.0-2
## [5] data.table_1.14.0 purrr_0.3.4       magrittr_2.0.1    gtfs2gps_1.4-0
##
## loaded via a namespace (and not attached):
##  [1] zip_2.2.0          compiler_4.0.4     progressr_0.8.0    class_7.3-18
##  [5] tools_4.0.4        digest_0.6.27      evaluate_0.14      lattice_0.20-41
##  [9] rlang_0.4.11       DBI_1.1.1          parallel_4.0.4     yaml_2.2.1
## [13] rgdal_1.5-23       xfun_0.25          fastmap_1.1.0      e1071_1.7-8
## [17] furrr_0.2.3        stringr_1.4.0      s2_1.0.6           knitr_1.33
## [21] vctrs_0.3.8        globals_0.14.0     classInt_0.4-3     grid_4.0.4
## [25] listenv_0.8.0      parallelly_1.27.0  rmarkdown_2.10     sp_1.4-5
## [29] ellipsis_0.3.2     codetools_0.2-18   htmltools_0.5.2    future_1.22.1
## [33] KernSmooth_2.23-18 stringi_1.7.4      proxy_0.4-26       wk_0.5.0
## [37] lwgeom_0.2-7
```

However, for my GTFS dataset, I still get NA values for the departure times when I run gtfs2gps(). So downgrading to v1.4.0 didn't fix my issue.

abrac commented 3 years ago

Just an update: I tried a few more things. I tried downgrading to version 1.3.2. That didn't work. I then saw @pedro-andrade-inpe's latest merge request, #208. I compiled it from source and tried using that version, but it also didn't work. I am not sure what the problem is, but perhaps it is a problem with my data...? I will try to figure it out, but I'm not sure where to start. Just to clarify my problem: no matter which version of gtfs2gps I use, the `departure_time` and `cumtime` columns are NA for all rows. Although I didn't mention it before, the `cumtime` column is the one that I need the most.
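
One possible starting point for checking whether the data is the cause is to count, per trip, how many `stop_times` rows actually carry a `departure_time`. The sketch below is only an illustration: it uses a small invented table in place of a real feed, and treating empty strings as missing is an assumption. With a real feed, the same check could be applied to the `stop_times` table returned by `read_gtfs()`.

```r
# Toy stand-in for a feed's stop_times table (all values are invented).
stop_times <- data.frame(
  trip_id        = c("A", "A", "A", "B", "B"),
  stop_sequence  = c(1, 2, 3, 1, 2),
  departure_time = c("06:00:00", "", "06:10:00", NA, ""),
  stringsAsFactors = FALSE
)

# Treat both NA and empty strings as missing (assumption).
missing_dep <- is.na(stop_times$departure_time) | stop_times$departure_time == ""

# Number of timed stops per trip. A trip needs at least two of them
# before a speed between stops can be derived from the timetable.
timed_per_trip <- tapply(!missing_dep, stop_times$trip_id, sum)
print(timed_per_trip)

# Trips that cannot yield any timetable-based speed.
trips_without_times <- names(timed_per_trip)[timed_per_trip < 2]
print(trips_without_times)
```

Trips listed in `trips_without_times` would be the first candidates to inspect in the raw feed.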

pedro-andrade-inpe commented 3 years ago

@abrac My last merge was mostly related to an issue opened by CRAN, but I also fixed a small issue related to departure_time.
After updating the CRAN version this issue will be my priority.

Joaobazzo commented 3 years ago

In my experience, several `gtfs2gps::gtfs2gps()` outputs have NA `cumtime` values because the `departure_time` is not available for some route segments (that is, between valid stop_times). This means that we can't estimate the difference in time (and consequently the average speed) between valid stop_times.
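
To make that reasoning concrete, here is a minimal base R sketch with invented numbers. It is not the package's implementation, only an illustration of why a segment without a closing timed stop ends up with NA speed; the column names `cumdist_m`, `departure_s` and `speed_ms` are made up for this example.

```r
# Toy GPS-like points along one trip (all values invented).
# Only points that coincide with a timed stop carry a departure time (seconds).
pts <- data.frame(
  cumdist_m   = c(0, 100, 200, 300, 400),
  departure_s = c(0, NA, 40, NA, NA)
)

# Indices of points with a known time.
timed <- which(!is.na(pts$departure_s))

# Average speed (m/s) on each segment between consecutive timed points.
seg_speed <- diff(pts$cumdist_m[timed]) / diff(pts$departure_s[timed])

# Each point gets the speed of the segment that starts at the latest timed
# point at or before it. From the last timed stop onward there is no closing
# time, so the speed stays NA.
seg_id <- findInterval(seq_len(nrow(pts)), timed)
pts$speed_ms <- ifelse(seg_id >= 1 & seg_id <= length(seg_speed),
                       seg_speed[pmax(seg_id, 1)], NA_real_)
print(pts)
```

Here only the first two points receive a speed; everything from the last timed stop onward stays NA, which is the pattern seen in the problematic outputs.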

One way I get around these `cumtime` problems is to replace invalid speeds with valid ones. In this example, I replace the problematic speeds with the mean value of the valid ones.

    library(data.table)
    library(magrittr)
    spo_gps <- gtfs2gps::read_gtfs(system.file("extdata/saopaulo.zip", package = "gtfs2gps")) %>%
      gtfs2gps::filter_by_shape_id(c("52421", "52857")) %>%
      gtfs2gps::filter_single_trip() %>%
      gtfs2gps::gtfs2gps() %>%
      base::suppressMessages()
    head(spo_gps, 2)
    #>    id shape_id   trip_id trip_number route_type shape_pt_lon shape_pt_lat
    #> 1:  1    52421 121G-10-0           1          3    -46.56379    -23.52212
    #> 2:  2    52421 121G-10-0           1          3    -46.56353    -23.52198
    #>    departure_time stop_id stop_sequence         dist      cumdist cumtime
    #> 1:           <NA>    <NA>            NA  0.00000 [m]  0.00000 [m]  NA [s]
    #> 2:           <NA>    <NA>            NA 30.46157 [m] 30.46157 [m]  NA [s]
    #>        speed
    #> 1: NA [km/h]
    #> 2: NA [km/h]

replace speeds with problems by ‘NA’

    spo_gps[, speed := as.numeric(speed)]
    spo_gps[!is.finite(speed), speed := NA_real_] # Inf, NaN and NA all become NA
    spo_gps[speed > 80 | speed < 2, speed := NA] # too slow or too fast

fill ‘NA’ speed values by mean speed

    spo_gps[is.na(speed), speed := mean(spo_gps$speed, na.rm = TRUE), by = .(shape_id)]
    spo_gps[, speed := units::set_units(speed, "km/h")]

travelled time

    spo_gps[, time := (dist / speed)]

defining a new cumtime

    spo_gps[, cumtime_new := cumsum(time), by = .(shape_id, trip_id, trip_number)]
    spo_gps[, cumtime_new := units::set_units(cumtime_new, "s")]
    head(spo_gps)
    #>    id shape_id   trip_id trip_number route_type shape_pt_lon shape_pt_lat
    #> 1:  1    52421 121G-10-0           1          3    -46.56379    -23.52212
    #> 2:  2    52421 121G-10-0           1          3    -46.56353    -23.52198
    #> 3:  3    52421 121G-10-0           1          3    -46.56327    -23.52184
    #> 4:  4    52421 121G-10-0           1          3    -46.56302    -23.52171
    #> 5:  5    52421 121G-10-0           1          3    -46.56276    -23.52157
    #> 6:  6    52421 121G-10-0           1          3    -46.56249    -23.52143
    #>    departure_time   stop_id stop_sequence         dist       cumdist cumtime
    #> 1:           <NA>      <NA>            NA  0.00000 [m]   0.00000 [m]  NA [s]
    #> 2:           <NA>      <NA>            NA 30.46157 [m]  30.46157 [m]  NA [s]
    #> 3:           <NA> 910000819             1 30.46157 [m]  60.92315 [m]  NA [s]
    #> 4:           <NA>      <NA>            NA 30.46157 [m]  91.38472 [m]  NA [s]
    #> 5:           <NA>      <NA>            NA 30.46157 [m] 121.84630 [m]  NA [s]
    #> 6:           <NA> 910000820             2 31.45530 [m] 153.30159 [m]  NA [s]
    #>               speed            time    cumtime_new
    #> 1: 15.510057 [km/h] 0.000000000 [h]   0.000000 [s]
    #> 2: 15.510057 [km/h] 0.001963989 [h]   7.070359 [s]
    #> 3:  3.695138 [km/h] 0.008243691 [h]  36.747646 [s]
    #> 4:  3.695138 [km/h] 0.008243691 [h]  66.424934 [s]
    #> 5:  3.695138 [km/h] 0.008243691 [h]  96.102222 [s]
    #> 6:  6.630198 [km/h] 0.004744247 [h] 113.181511 [s]

with some more work you can reestimate the departure_time values based on the new cum_times.
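
As a rough sketch of what that extra work could look like, the base R snippet below adds a cumulative time to a known first departure and formats the result as `HH:MM:SS`. The values and the helper names `to_seconds()` and `to_hms()` are invented for illustration. It also assumes the trip's start time is known, which is not the case in the output above, where the first `departure_time` is itself NA; in a frequency-based feed the start time might be taken from `frequencies.txt`.

```r
# Toy values standing in for one trip of the GPS-like table (invented).
cumtime_new_s   <- c(0, 7.07, 36.75, 66.42)  # seconds since the trip started
first_departure <- "04:00:00"                # assumed known start time

# Convert "HH:MM:SS" to seconds after midnight.
to_seconds <- function(hms) {
  parts <- as.numeric(strsplit(hms, ":", fixed = TRUE)[[1]])
  parts[1] * 3600 + parts[2] * 60 + parts[3]
}

# Convert seconds after midnight back to "HH:MM:SS".
# Hours are not wrapped at 24 because GTFS allows times past midnight.
to_hms <- function(s) {
  s <- round(s)
  sprintf("%02d:%02d:%02d",
          as.integer(s %/% 3600),
          as.integer((s %% 3600) %/% 60),
          as.integer(s %% 60))
}

departure_new <- to_hms(to_seconds(first_departure) + cumtime_new_s)
print(departure_new)
# "04:00:00" "04:00:07" "04:00:37" "04:01:06"
```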

I hope this helps a little bit.

Created on 2021-09-03 by the reprex package (v2.0.0)

abrac commented 3 years ago

Thank you everyone for your help!

@Joaobazzo: Your code snippets helped a ton! I wanted to do something like that, but wasn't sure how, since I am not familiar with R. I have used your code on my dataset. 👍🏽

rafapereirabr commented 3 years ago

Thanks for sharing the solution, @Joaobazzo. Given @abrac's question and the fact that GTFS feeds often have data quality problems, I'm wondering whether we could add a parameter to the gtfs2gps() function to address this issue.

This could be, for example, a speed_correction parameter where the options would be:

It would also be great if we could allow the user to set minimum and maximum speeds that would trigger those corrections, but we need to think of a simple way to expose these parameters to users.

From a code development point of view, implementing this would be relatively simple. We can create a new support function that wraps @Joaobazzo's code. We would only need to apply this support function to the output object inside the gtfs2gps() function, right before returning the output.
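
A support function along those lines might look roughly like the base R sketch below. Everything here is hypothetical: the name `fix_speeds()`, its arguments and the plain numeric columns (no `units`) are assumptions made for illustration, and the defaults of 2 and 80 km/h are simply the thresholds used in @Joaobazzo's snippet. It is not the function that was eventually added to the package.

```r
# Hypothetical helper (not part of gtfs2gps): replaces implausible speeds with
# the mean of the plausible ones and rebuilds the cumulative time per trip.
# Expects plain numeric columns: dist in metres, speed in km/h.
fix_speeds <- function(gps, min_speed = 2, max_speed = 80) {
  bad <- !is.finite(gps$speed) | gps$speed < min_speed | gps$speed > max_speed
  if (all(bad)) stop("no valid speeds to average")
  gps$speed[bad] <- mean(gps$speed[!bad])
  # seconds = metres / (km/h converted to m/s)
  step_s <- gps$dist / (gps$speed / 3.6)
  gps$cumtime_new <- ave(step_s, gps$trip_id, FUN = cumsum)
  gps
}

# Toy input with NA, infinite and too-fast speeds (all values invented).
gps <- data.frame(
  trip_id = c("t1", "t1", "t1", "t2", "t2"),
  dist    = c(0, 50, 50, 0, 100),
  speed   = c(NA, 36, Inf, 120, 36)
)
fixed <- fix_speeds(gps)
print(fixed)
```

Exposing `min_speed` and `max_speed` as arguments with defaults keeps the common case to a single call while still letting users tighten or relax the thresholds.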

abrac commented 3 years ago

Hi all. I just wanted to ask a follow up question. @Joaobazzo previously mentioned:

with some more work you can reestimate the departure_time values based on the new cum_times.

I wanted to ask: will the correction of departure_time values also be done through the changes proposed by @rafapereirabr?

rafapereirabr commented 3 years ago

Hi @abrac. Yes, the function will update the departure_time column accordingly.

rafapereirabr commented 3 years ago

Hi @abrac, we have added to the package a new function adjust_speed() to address this issue. Once you have the GPS-like output of the gtfs2gps() function, you can use adjust_speed() to 'fix' those problematic trips. Please have a look at the new function and let us know what you think.

abrac commented 3 years ago

Wow, I see a lot of commits were made over the last few days! Thanks so much everyone! I will try it out today and let you know 😊.

abrac commented 3 years ago

I'm so sorry for the delay! I've tested it with two datasets that I am currently working with, and it worked perfectly! 🙌🏽

I will continue using this new version of gtfs2gps, as I process the remaining 6 or 7 datasets that I am planning to work with. Will let you know if I come across any problems.

abrac commented 3 years ago

I realized that the two datasets which I tested with worked fine in the previous version of gtfs2gps too. So in other words, they did not have the quality issues which caused me to open this thread.

So, now I've tested with the original dataset I was using when I opened this thread. Although the adjust_speed function is re-calculating the cumtime and speed columns correctly, it is not updating the departure_time column.

I'm leaving this issue closed, as the departure_time is not too important for me anymore. However, just thought I would report the issue just in case.

A snippet of the results:

    id, shape_id, trip_id, trip_number, route_type, shape_pt_lon, shape_pt_lat, departure_time, stop_id, stop_sequence, dist, cumdist, cumtime, speed
    1, taxi_IC001_O_shape, taxi_IC001_O_19_00_00#1, 1, 3, 32.5628, 0.350499999999997, NA, NA, NA, 0, 0, 0, 17.1546742140325
    2, taxi_IC001_O_shape, taxi_IC001_O_19_00_00#1, 1, 3, 32.5627850000005, 0.350800000000007, NA, NA, NA, 33.4375644070299, 33.4375644070299, 7.01705146733946, 17.1546742140325
    3, taxi_IC001_O_shape, taxi_IC001_O_19_00_00#1, 1, 3, 32.56277, 0.351100000000002, NA, guard.exposing.vintages, 1, 33.43756440798, 66.8751288150099, 32.2160099162745, 4.77699235516678

As you can see, row 3 has `NA` departure_time.