tidyverse / lubridate

Make working with dates in R just that little bit easier
https://lubridate.tidyverse.org
GNU General Public License v3.0

ymd_hm() bug #1105

martinschlund closed this issue 1 year ago

martinschlund commented 1 year ago

With lubridate 1.9.0 on R version 4.2.2, ymd_hm() fails to parse date-time strings.

dates <- c("2012-02-12 18:03", "2012-02-15 18:03")

ymd_hm(dates)

[1] NA NA
Warning message:
All formats failed to parse. No formats found. 

Gives NA and a warning.


parse_date_time(dates, orders = "ymd HM")

works fine though.
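
Until a fixed release is installed, the call that still parses these strings can be wrapped so that existing code only has to change in one place. This is a minimal sketch, assuming lubridate is installed; `parse_ymd_hm` is a hypothetical helper name, not part of lubridate.

```r
library(lubridate)

# Hypothetical helper (not a lubridate function): parse "ymd HM" strings via
# parse_date_time(), which is reported in this thread to work where ymd_hm()
# returns NA.
parse_ymd_hm <- function(x, tz = "UTC") {
  parse_date_time(x, orders = "ymd HM", tz = tz)
}

dates <- c("2012-02-12 18:03", "2012-02-15 18:03")
parsed <- parse_ymd_hm(dates)
print(parsed)
```

Once the fix is released, the body of the helper can go back to a plain ymd_hm() call.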

sessionInfo():

R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)

Matrix products: default

locale:
[1] LC_COLLATE=Danish_Denmark.utf8  LC_CTYPE=Danish_Denmark.utf8    LC_MONETARY=Danish_Denmark.utf8
[4] LC_NUMERIC=C                    LC_TIME=Danish_Denmark.utf8    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] finalfit_1.0.5   openxlsx_4.2.5.1 janitor_2.1.0    readxl_1.4.1     lubridate_1.9.0  timechange_0.1.1 forcats_0.5.2   
 [8] stringr_1.5.0    dplyr_1.0.10     purrr_1.0.0      readr_2.1.3      tidyr_1.2.1      tibble_3.1.8     ggplot2_3.4.0   
[15] tidyverse_1.3.2 

loaded via a namespace (and not attached):
 [1] tidyselect_1.2.0    splines_4.2.2       lattice_0.20-45     haven_2.5.1         gargle_1.2.1        snakecase_0.11.0   
 [7] colorspace_2.0-3    vctrs_0.5.1         generics_0.1.3      survival_3.4-0      utf8_1.2.2          rlang_1.0.6        
[13] pillar_1.8.1        glue_1.6.2          withr_2.5.0         DBI_1.1.3           dbplyr_2.2.1        modelr_0.1.10      
[19] lifecycle_1.0.3     munsell_0.5.0       gtable_0.3.1        cellranger_1.1.0    rvest_1.0.3         zip_2.2.2          
[25] tzdb_0.3.0          fansi_1.0.3         broom_1.0.2         Rcpp_1.0.9          backports_1.4.1     scales_1.2.1       
[31] googlesheets4_1.0.1 jsonlite_1.8.4      fs_1.5.2            hms_1.1.2           stringi_1.7.8       cowplot_1.1.1      
[37] grid_4.2.2          cli_3.5.0           tools_4.2.2         magrittr_2.0.3      mice_3.15.0         crayon_1.5.2       
[43] pkgconfig_2.0.3     Matrix_1.5-1        ellipsis_0.3.2      xml2_1.3.3          reprex_2.0.2        googledrive_2.0.0  
[49] assertthat_0.2.1    httr_1.4.4          rstudioapi_0.14     boot_1.3-28         R6_2.5.1            compiler_4.2.2   

webbp commented 1 year ago

mdy_hm() and dmy_hm() fail the same way. For example,

> dmy_hm('02012023 12:08')   # doesn't work anymore
[1] NA
Warning message:
All formats failed to parse. No formats found.
> dmy_hm('02-01-2023 12:08') # doesn't work anymore
[1] NA
Warning message:
All formats failed to parse. No formats found.
> dmy_hm('020120231208')     # still works
[1] "2023-01-02 12:08:00 UTC"
> dmy_hm('02012023 1208')    # also still works
[1] "2023-01-02 12:08:00 UTC"
> parse_date_time('02-01-2023 12:08', orders='dmy HM') # still works
[1] "2023-01-02 12:08:00 UTC"
> sessionInfo()
R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=nl_NL.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] lubridate_1.9.0   timechange_0.1.1  arrow_10.0.1      data.table_1.14.6

loaded via a namespace (and not attached):
 [1] ps_1.7.2         assertthat_0.2.1 R6_2.5.1         magrittr_2.0.3   rlang_1.0.6      cli_3.4.1        vctrs_0.5.1      generics_0.1.3   bit64_4.0.5
[10] glue_1.6.2       purrr_0.3.5      bit_4.0.5        compiler_4.2.0   tidyselect_1.2.0

quesadagranja commented 1 year ago

I have been seeing this bug since the beginning of December. It is serious in my case because I had to modify existing code in my project.

What I have found is that ymd_hm() only parses dates correctly when the hour and minutes are not separated:

> lubridate::ymd_hm("2023-01-19 2338")
[1] "2023-01-19 23:38:00 UTC"
> lubridate::ymd_hm("2023/01/19 2338")
[1] "2023-01-19 23:38:00 UTC"
> lubridate::ymd_hm("2023/01/192338")
[1] "2023-01-19 23:38:00 UTC"
> sessionInfo()
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=Spanish_Spain.utf8  LC_CTYPE=Spanish_Spain.utf8    LC_MONETARY=Spanish_Spain.utf8 LC_NUMERIC=C                  
[5] LC_TIME=Spanish_Spain.utf8    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] compiler_4.2.2    generics_0.1.3    tools_4.2.2       lubridate_1.9.0   data.table_1.14.6 timechange_0.1.1 
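
The observation above suggests a stopgap: remove the colon from the time before calling ymd_hm(). This is only a sketch, assuming every input ends in a two-digit hour and two-digit minute separated by a colon; `strip_time_colon` is a hypothetical helper, not part of lubridate.

```r
library(lubridate)

# Hypothetical helper: drop the colon from a trailing "HH:MM" so the string
# matches the unseparated form that this comment reports as still parsing.
strip_time_colon <- function(x) {
  sub("([0-9]{2}):([0-9]{2})$", "\\1\\2", x)
}

x <- "2023-01-19 23:38"
compact <- strip_time_colon(x)  # "2023-01-19 2338"
ymd_hm(compact)
```

Strings that carry seconds or a trailing time zone would not match this pattern and would be passed through unchanged.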

vspinu commented 1 year ago

The bug is locale-specific. It is the same as https://github.com/tidyverse/lubridate/issues/1097 and was fixed in devel at the end of November. I am starting the release process.
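
All three reports above were made with a non-English LC_TIME (Danish, Dutch, Spanish). To check whether a session is affected, the same call can be run with LC_TIME temporarily set to "C". This is a diagnostic sketch only: `with_c_time_locale` is a hypothetical helper, and whether switching the locale avoids the failure in lubridate 1.9.0 is an assumption that this thread does not confirm.

```r
library(lubridate)

# Hypothetical helper: call f(...) with LC_TIME set to "C", then restore the
# previous setting even if f() raises an error.
with_c_time_locale <- function(f, ...) {
  old <- Sys.getlocale("LC_TIME")
  on.exit(Sys.setlocale("LC_TIME", old), add = TRUE)
  Sys.setlocale("LC_TIME", "C")
  f(...)
}

dates <- c("2012-02-12 18:03", "2012-02-15 18:03")

Sys.getlocale("LC_TIME")                   # locale the session normally uses
res <- with_c_time_locale(ymd_hm, dates)   # same call under the "C" locale
print(res)
```

If the result differs between the two locales, the session is hitting the locale-specific path; upgrading lubridate once the fix is released is the real remedy.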