tidyverts / fabletools

General fable features useful for extension packages
http://fabletools.tidyverts.org/

features method failing with .var being a character string #385

leonfernandes closed this issue 4 months ago

leonfernandes commented 11 months ago

As the reprex below shows, features.tbl_ts() returns NA for every feature (with coercion warnings) when .var is a character string. This makes it awkward to wrap features() in a function that receives the column name as a variable.

library(fabletools)
library(tsibble)
#> 
#> Attaching package: 'tsibble'
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, union
head(tourism, 160) |>
    features(Trips, features = list(mean = mean, sd = sd))
#> # A tibble: 2 × 5
#>   Region   State           Purpose   mean    sd
#>   <chr>    <chr>           <chr>    <dbl> <dbl>
#> 1 Adelaide South Australia Business  156.  35.6
#> 2 Adelaide South Australia Holiday   157.  27.1
head(tourism, 160) |>
    features("Trips", features = list(mean = mean, sd = sd))
#> Warning in mean.default(...): argument is not numeric or logical: returning NA

#> Warning in mean.default(...): argument is not numeric or logical: returning NA
#> Warning in var(if (is.vector(x) || is.factor(x)) x else as.double(x), na.rm =
#> na.rm): NAs introduced by coercion

#> Warning in var(if (is.vector(x) || is.factor(x)) x else as.double(x), na.rm =
#> na.rm): NAs introduced by coercion
#> # A tibble: 2 × 5
#>   Region   State           Purpose   mean    sd
#>   <chr>    <chr>           <chr>    <dbl> <dbl>
#> 1 Adelaide South Australia Business    NA    NA
#> 2 Adelaide South Australia Holiday     NA    NA
foo <- function(df, col_name) {
    df |>
        features(!!rlang::enquo(col_name), features = list(mean = mean, sd = sd))
}
head(tourism, 160) |>
    foo(Trips)
#> # A tibble: 2 × 5
#>   Region   State           Purpose   mean    sd
#>   <chr>    <chr>           <chr>    <dbl> <dbl>
#> 1 Adelaide South Australia Business  156.  35.6
#> 2 Adelaide South Australia Holiday   157.  27.1
head(tourism, 160) |>
    foo("Trips")
#> Warning in mean.default(...): argument is not numeric or logical: returning NA
#> Warning in mean.default(...): argument is not numeric or logical: returning NA
#> Warning in var(if (is.vector(x) || is.factor(x)) x else as.double(x), na.rm =
#> na.rm): NAs introduced by coercion

#> Warning in var(if (is.vector(x) || is.factor(x)) x else as.double(x), na.rm =
#> na.rm): NAs introduced by coercion
#> # A tibble: 2 × 5
#>   Region   State           Purpose   mean    sd
#>   <chr>    <chr>           <chr>    <dbl> <dbl>
#> 1 Adelaide South Australia Business    NA    NA
#> 2 Adelaide South Australia Holiday     NA    NA

Created on 2023-07-21 with reprex v2.0.2

mitchelloharawild commented 4 months ago

features() computes features from arbitrary expressions, so "Trips" is evaluated as a length-1 character vector rather than as a column name. To select columns with tidyselect, use features_at().

For example:

library(fabletools)
library(tsibble)
#> 
#> Attaching package: 'tsibble'
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, union
head(tourism, 160) |>
  features(Trips, features = list(mean = mean, sd = sd))
#> # A tibble: 2 x 5
#>   Region   State           Purpose   mean    sd
#>   <chr>    <chr>           <chr>    <dbl> <dbl>
#> 1 Adelaide South Australia Business  156.  35.6
#> 2 Adelaide South Australia Holiday   157.  27.1
head(tourism, 160) |>
  features_at("Trips", features = list(mean = mean, sd = sd))
#> # A tibble: 2 x 5
#>   Region   State           Purpose  Trips_mean Trips_sd
#>   <chr>    <chr>           <chr>         <dbl>    <dbl>
#> 1 Adelaide South Australia Business       156.     35.6
#> 2 Adelaide South Australia Holiday        157.     27.1

Created on 2024-03-02 with reprex v2.0.2
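
For the wrapper use case in the original report, one possible approach (a sketch, not something stated in this thread) is to convert the string to a symbol before handing it to features(). The helper name foo_chr is hypothetical; rlang::sym() and the `!!` injection operator are existing rlang features, and the sketch assumes features() captures .var with tidy evaluation, as the working `!!rlang::enquo(col_name)` call in the reprex suggests.

```r
library(fabletools)
library(tsibble)

# Hypothetical wrapper (not part of fabletools): `col_name` is a string.
foo_chr <- function(df, col_name) {
  stopifnot(is.character(col_name), length(col_name) == 1)
  # rlang::sym() turns "Trips" into the symbol `Trips`; `!!` injects it,
  # so features() receives a column reference instead of a character vector.
  features(df, !!rlang::sym(col_name), features = list(mean = mean, sd = sd))
}

res <- foo_chr(head(tourism, 160), "Trips")
res
```

Under these assumptions the result should match the features(Trips, ...) call above, with columns named mean and sd rather than the Trips_ prefixed names that features_at() produces.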