r-lib / rlang

Low-level API for programming with R
https://rlang.r-lib.org
Other
508 stars 138 forks source link

`Date` and `POSIXct` are integerish #1506

Closed bart1 closed 1 year ago

bart1 commented 2 years ago

I just noticed dates and times are considered integerish:

require(rlang)
#> Loading required package: rlang
d<-as.Date('2020-1-1')
is.integer(d)
#> [1] FALSE
is_integer(d)
#> [1] FALSE
is_integerish(d)
#> [1] TRUE
t<-as.POSIXct('2020-1-1 00:01:00')
is.integer(t)
#> [1] FALSE
is_integer(t)
#> [1] FALSE
is_integerish(t)
#> [1] TRUE
t<-as.POSIXct('2020-1-1 00:01:00.010')
is.integer(t)
#> [1] FALSE
is_integer(t)
#> [1] FALSE
is_integerish(t)
#> [1] FALSE

This makes some sense thinking about how these are stored, as they are integer like numbers internally. On the other hand boolean values fail on is_integerish even though they can also be presented as 0 or 1. So I'm not sure what is correct/desired. However, in the first instance, in code I was writing I expected is_integerish to fail on times and dates (is it maybe is_bare_integerish i was looking for?). I think it would be good to atleast document this behavior in ?is_integerish so users are aware of this.

sessionInfo(
)
#> R version 4.2.1 (2022-06-23)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 20.04.5 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/atlas/libblas.so.3.10.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/atlas/liblapack.so.3.10.3
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=nl_NL.UTF-8        LC_COLLATE=en_US.UTF-8    
#>  [5] LC_MONETARY=nl_NL.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=nl_NL.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=nl_NL.UTF-8 LC_IDENTIFICATION=C       
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> loaded via a namespace (and not attached):
#>  [1] rstudioapi_0.14   knitr_1.40        magrittr_2.0.3    R.cache_0.16.0   
#>  [5] rlang_1.0.6       fastmap_1.1.0     fansi_1.0.3       stringr_1.4.1    
#>  [9] styler_1.7.0      highr_0.9         tools_4.2.1       xfun_0.33        
#> [13] R.oo_1.25.0       utf8_1.2.2        cli_3.4.1         withr_2.5.0      
#> [17] htmltools_0.5.3   yaml_2.3.5        digest_0.6.29     tibble_3.1.8     
#> [21] lifecycle_1.0.2   purrr_0.3.4       R.utils_2.12.0    vctrs_0.4.2      
#> [25] fs_1.5.2          glue_1.6.2        evaluate_0.16     rmarkdown_2.16   
#> [29] reprex_2.0.2      stringi_1.7.8     compiler_4.2.1    pillar_1.8.1     
#> [33] R.methodsS3_1.8.2 pkgconfig_2.0.3
lionel- commented 2 years ago

rlang predicates are about storage type. So you probably want to combine is_integerish() with is.numeric().

bart1 commented 2 years ago

Dear @lionel- thank you for the comment, I do recognize that and do understand the logic. Therefore I thought it would be good to document it (like is_integerish(TRUE) in the example). As is_integerish is not only about the storage mode (e.g. 10.0 as a double giving resulting in true), I suspect people want to use it mainly to avoid the result of is.integer(10) resulting in FALSE. (a quick random example i found: https://github.com/cran/hipread/blob/7c39e75486e06775040048988bca770af7c9341d/R/col_specs.R#L44 , I suspect the intention here is not to allow column selection with dates but rather to be more flexible how the numbers are written (disclaimer I don't know this package))

Would is.numeric be preferred over is_bare_integerish, if so why?

lionel- commented 2 years ago

is_bare_integerish() will return FALSE for factors and dates, so you don't need to check for is.numeric() with this predicate.

However you can use is.numeric(x) && is_integerish(x) to check for integerish numeric vectors.

I agree it'd be nice to have a user-facing predicate that deals with all this, but it's still unclear at the moment if this should live in rlang or in vctrs.

lionel- commented 1 year ago

There's some work on this front in standalone-check-types.R. Still missing predicate and vector variants, but there's now check_number_whole() and check_number_decimal().

Closing this in the meantime. Development will continue piecemeal.