tidyverse / vroom

Fast reading of delimited files
https://vroom.r-lib.org

24 hour date time issue #246

Open j-sirgo opened 4 years ago

j-sirgo commented 4 years ago

Date-times with 24 in the hour field don't read properly in vroom: they are parsed as NA.

However, the conversion can be made using lubridate::ymd_hm or lubridate::ymd_hms.

library(vroom)
vroom("date\n2015-06-14 24:00\n2015-06-14 08:01:01", delim=",")
#> Rows: 2
#> Columns: 1
#> Delimiter: ","
#> dttm [1]: date
#> 
#> Use `spec()` to retrieve the guessed column specification
#> Pass a specification to the `col_types` argument to quiet this message
#> # A tibble: 2 x 1
#>   date               
#>   <dttm>             
#> 1 NA                 
#> 2 2015-06-14 08:01:01

# Workaround: read the column as a string and convert it with lubridate::ymd_hm()
library(lubridate)
#> Warning: package 'lubridate' was built under R version 3.6.3
#> 
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#> 
#>     date, intersect, setdiff, union
ymd_hm("2015-06-14 24:00")
#> [1] "2015-06-15 UTC"

# NEED TO USE ymd_hms() IF THERE ARE SECONDS
ymd_hms("2015-06-14 24:01:22")
#> [1] "2015-06-15 00:01:22 UTC"

Created on 2020-07-02 by the reprex package (v0.3.0)
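
For reference, the workaround described above could look roughly like the following. This is a minimal sketch, not vroom's recommended approach: it assumes every value carries seconds (the sample rows are adapted accordingly), and the temporary file and the `raw` variable name are illustrative only.

```r
library(vroom)
library(lubridate)

# Write sample rows (adapted from the report so that both include seconds)
# to a temporary file.
path <- tempfile(fileext = ".csv")
writeLines(c("date", "2015-06-14 24:00:00", "2015-06-14 08:01:01"), path)

# Read the column as character so vroom does not attempt datetime parsing.
raw <- vroom(path, delim = ",", col_types = cols(date = col_character()))

# Convert afterwards with lubridate, which rolls hour 24 over to the next day
# (as the ymd_hm()/ymd_hms() calls in the reprex show).
raw$date <- ymd_hms(raw$date)
raw
```

If some values lack seconds, as in the original sample, the conversion step would need a parser that accepts both shapes.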

Can a similar adaptation be made in vroom?

jimhester commented 4 years ago

Hours greater than 24 don't seem like valid datetimes to me, so I don't think this will change.

j-sirgo commented 4 years ago

Yes, times greater than 24:00:00 seem wrong, but what about exactly 24:00:00? Isn’t that valid ISO 8601 format?
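
To make the distinction concrete, here is a rough base R sketch that accepts hour 24 only when minutes and seconds are zero and returns NA for anything later. `parse_end_of_day()` is a hypothetical helper, not part of vroom or lubridate, and it assumes `YYYY-MM-DD HH:MM[:SS]` strings with a space separator, interpreted as UTC.

```r
# Hypothetical helper: accept exactly 24:00 or 24:00:00 as the end of the day
# and treat any other time with hour 24 as invalid (NA).
parse_end_of_day <- function(x) {
  date_re <- "^[0-9]{4}-[0-9]{2}-[0-9]{2}"

  # TRUE where the hour field is 24 at all.
  has_h24 <- grepl(paste0(date_re, " 24:"), x)
  # TRUE only for exactly 24:00 or 24:00:00.
  is_eod <- grepl(paste0(date_re, " 24:00(:00)?$"), x)

  # Rewrite end-of-day values to midnight of the same date for now.
  fixed <- ifelse(is_eod, sub(" 24:00(:00)?$", " 00:00:00", x), x)
  # Anything else with hour 24 (e.g. 24:01:22) is rejected.
  fixed[has_h24 & !is_eod] <- NA_character_

  # Pad missing seconds so a single format covers both input shapes.
  needs_sec <- grepl(paste0(date_re, " [0-9]{2}:[0-9]{2}$"), fixed)
  fixed <- ifelse(needs_sec, paste0(fixed, ":00"), fixed)

  out <- as.POSIXct(strptime(fixed, "%Y-%m-%d %H:%M:%S", tz = "UTC"))
  # Shift end-of-day values forward by one day (86400 seconds).
  out[is_eod] <- out[is_eod] + 86400
  out
}

res <- parse_end_of_day(c(
  "2015-06-14 24:00",
  "2015-06-14 24:00:00",
  "2015-06-14 24:01:22",
  "2015-06-14 08:01:01"
))
res
```

The first two inputs resolve to midnight at the start of 2015-06-15, the third becomes NA, and the last is parsed unchanged.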