r-lib / vctrs

Generic programming with typed R vectors
https://vctrs.r-lib.org
Other
289 stars 66 forks source link

`vec_math()` does not work with integer types #1949

Open inpowell opened 3 months ago

inpowell commented 3 months ago

I was surprised to find out that vec_math() does not have an implementation for integer types. This means, if implementing an S3 rcrd class which wraps around integers, we cannot implement a pass-down math generic as (for example)

vec_math.myrcrd <- function(.fn, .x, ...)
  new_myrcrd(vec_math(.fn, field(.x, 'int_field'), ...))

Reproducible example

library(vctrs)

Actual behaviour:

vec_math('sum', 0:10)
#> Error in `vec_math()`:
#> ! `vec_math.integer()` not implemented.

What I would expect: Identical to sum() on integers, which converts to double output.

sum(0:10)
#> [1] 55

Workaround if required:

vec_math('sum', as.double(0:10))
#> [1] 55

sessionInfo() output (checked on 0.6.5, and this reprex from latest master):

sessionInfo('vctrs')
#> R version 4.4.1 (2024-06-14)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Fedora Linux 39 (Workstation Edition)
#> 
#> Matrix products: default
#> BLAS/LAPACK: FlexiBLAS OPENBLAS-OPENMP;  LAPACK version 3.11.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_AU.UTF-8        LC_COLLATE=en_AU.UTF-8    
#>  [5] LC_MONETARY=en_AU.UTF-8    LC_MESSAGES=en_AU.UTF-8   
#>  [7] LC_PAPER=en_AU.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: Australia/Sydney
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> character(0)
#> 
#> other attached packages:
#> [1] vctrs_0.6.5.9000
#> 
#> loaded via a namespace (and not attached):
#>  [1] digest_0.6.36     methods_4.4.1     utf8_1.2.4        fastmap_1.2.0    
#>  [5] xfun_0.46         glue_1.7.0        knitr_1.48        htmltools_0.5.8.1
#>  [9] rmarkdown_2.27    lifecycle_1.0.4   utils_4.4.1       cli_3.6.3        
#> [13] fansi_1.0.6       reprex_2.1.1      graphics_4.4.1    grDevices_4.4.1  
#> [17] withr_3.0.1       stats_4.4.1       compiler_4.4.1    base_4.4.1       
#> [21] rstudioapi_0.16.0 tools_4.4.1       pillar_1.9.0      evaluate_0.24.0  
#> [25] yaml_2.3.10       rlang_1.1.4       fs_1.6.4          datasets_4.4.1

Created on 2024-08-10 with reprex v2.1.1

Suggested solution

In the default implementation, data gets passed down to vec_math_base() which calls the function from base, and then the output is restored to the input type. I cannot see an issue from vec_math_base() if we allow integer (or even complex) arguments through, but the Math operations on integers do not necessarily return integers (e.g. log(2L) and mean(1:10) both return something that cannot be restored to integer type).

Would there be concerns with removing the vec_restore() from vec_math.default() and allowing all numeric types through?

Finally, I've started using vctrs hands-on to build a class, and it's a great package and I appreciate all the work on it!