Closed njtierney closed 5 years ago
`features()` requires names for the functions, which probably isn't necessary? Another issue is that the function doesn't get computed.
library(feasts)
#> Loading required package: fablelite
#>
#> Attaching package: 'feasts'
#> The following object is masked from 'package:grDevices':
#>
#> X11
tsibbledata::aus_retail %>%
features(Turnover, average = ~ mean)
#> # A tibble: 152 x 2
#> State Industry
#> <chr> <chr>
#> 1 Australian Capital Te… Cafes, restaurants and catering services
#> 2 Australian Capital Te… Cafes, restaurants and takeaway food services
#> 3 Australian Capital Te… Clothing retailing
#> 4 Australian Capital Te… Clothing, footwear and personal accessory retail…
#> 5 Australian Capital Te… Department stores
#> 6 Australian Capital Te… Electrical and electronic goods retailing
#> 7 Australian Capital Te… Food retailing
#> 8 Australian Capital Te… Footwear and other personal accessory retailing
#> 9 Australian Capital Te… Furniture, floor coverings, houseware and textil…
#> 10 Australian Capital Te… Hardware, building and garden supplies retailing
#> # … with 142 more rows
Created on 2019-07-08 by the reprex package (v0.3.0)
It doesn't seem clear to me how names are needed or used, since the following:
names(feasts::feat_acf)
#> NULL
`feat_acf` doesn't have names?
But I could create a list of funs with names like so, which doesn't work as I might expect it to.
library(feasts)
#> Loading required package: fablelite
#>
#> Attaching package: 'feasts'
#> The following object is masked from 'package:grDevices':
#>
#> X11
funs_list <- list(avg = mean,
                  sd = sd)
tsibbledata::aus_retail %>%
  features(Turnover, funs_list)
#> Error: Argument 1 must have names
Created on 2019-07-08 by the reprex package (v0.2.1)
The latter would be a nice feature to add. The names refer to the object returned by `fn()`.
names(feasts::feat_acf(rnorm(10)))
#> [1] "acf1" "acf10" "diff1_acf1" "diff1_acf10" "diff2_acf1"
#> [6] "diff2_acf10"
Created on 2019-07-08 by the reprex package (v0.3.0)
I agree that names may not be necessary.
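To make the point about names concrete, here is a minimal sketch of a user-written feature function that returns a named vector, in the same way `feat_acf()` does. `feat_spread` is a hypothetical name used only for illustration; it is not part of feasts or fablelite.

```r
# Hypothetical feature function (not part of feasts/fablelite).
# It returns a named numeric vector; those names identify each computed feature.
feat_spread <- function(x) {
  c(range = diff(range(x)), iqr = IQR(x))
}

# The function object itself has no names ...
names(feat_spread)
#> NULL
# ... but the value it returns does.
names(feat_spread(c(1, 2, 3, 4, 10)))
#> [1] "range" "iqr"
```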
@earowang, note that features are passed as a list to the `features` argument:
library(fablelite)
tsibbledata::aus_retail %>%
features(Turnover, features = list(~ set_names(mean(.), "mean")))
#> # A tibble: 152 x 2
#> State Industry
#> <chr> <chr>
#> 1 Australian Capital Te… Cafes, restaurants and catering services
#> 2 Australian Capital Te… Cafes, restaurants and takeaway food services
#> 3 Australian Capital Te… Clothing retailing
#> 4 Australian Capital Te… Clothing, footwear and personal accessory retail…
#> 5 Australian Capital Te… Department stores
#> 6 Australian Capital Te… Electrical and electronic goods retailing
#> 7 Australian Capital Te… Food retailing
#> 8 Australian Capital Te… Footwear and other personal accessory retailing
#> 9 Australian Capital Te… Furniture, floor coverings, houseware and textil…
#> 10 Australian Capital Te… Hardware, building and garden supplies retailing
#> # … with 142 more rows
Created on 2019-07-08 by the reprex package (v0.3.0)
So where's the "mean"?
Does this look better?
tsibbledata::aus_retail %>%
features(Turnover, features = list(average = ~ mean(.)))
Good question, works interactively but not in reprex! Hmmm.
edit: something else loaded by `load_all` is required, perhaps a namespace issue.
Yes, `list(average = ~ mean(.))` is not supported currently, but I think it should be. Working on this now.
The best way (once implemented) would be `list(average = mean)`.
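For anyone comparing the two spellings, here is a small sketch of how a lambda formula and a bare function relate. It assumes formulas are converted roughly the way `rlang::as_function()` converts them; that is an assumption about the internals, not something confirmed in this thread. It may also explain why `~ mean` (without a call) computes nothing.

```r
# Assumption: lambda formulas are converted roughly as rlang::as_function() does.
avg_formula <- rlang::as_function(~ mean(.))  # `.` stands for the first argument
avg_plain <- mean                             # a bare function needs no conversion

x <- c(1, 2, 3, 6)
avg_formula(x)
#> [1] 3
avg_plain(x)
#> [1] 3

# `~ mean` (no call) simply returns the function `mean` itself,
# so nothing is computed on x.
not_computed <- rlang::as_function(~ mean)
is.function(not_computed(x))
#> [1] TRUE
```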
Fixed, user error.
library(fablelite)
tsibbledata::aus_retail %>%
features(Turnover, features = list(~ rlang::set_names(mean(.), "mean")))
#> # A tibble: 152 x 3
#> State Industry mean
#> <chr> <chr> <dbl>
#> 1 Australian Capital … Cafes, restaurants and catering services 20.0
#> 2 Australian Capital … Cafes, restaurants and takeaway food services 32.0
#> 3 Australian Capital … Clothing retailing 12.4
#> 4 Australian Capital … Clothing, footwear and personal accessory re… 19.8
#> 5 Australian Capital … Department stores 24.9
#> 6 Australian Capital … Electrical and electronic goods retailing 20.0
#> 7 Australian Capital … Food retailing 97.7
#> 8 Australian Capital … Footwear and other personal accessory retail… 7.38
#> 9 Australian Capital … Furniture, floor coverings, houseware and te… 15.1
#> 10 Australian Capital … Hardware, building and garden supplies retai… 13.0
#> # … with 142 more rows
Created on 2019-07-08 by the reprex package (v0.3.0)
OK, so do you think you will provide a way to construct a feature list? Or do you think it will go to something like this:
tsibbledata::aus_retail %>%
  features(Turnover,
           features = list(avg = mean))
I think this is easy enough.
library(fablelite)
tsibbledata::aus_retail %>%
features(Turnover, features = list(a = mean, b = feasts::feat_acf))
#> # A tibble: 152 x 10
#> State Industry a b_acf1 b_acf10 b_diff1_acf1 b_diff1_acf10
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 Aust… Cafes, … 20.0 0.973 8.59 -0.348 0.239
#> 2 Aust… Cafes, … 32.0 0.977 8.65 -0.327 0.259
#> 3 Aust… Clothin… 12.4 0.885 7.01 -0.276 0.251
#> 4 Aust… Clothin… 19.8 0.846 6.33 -0.303 0.201
#> 5 Aust… Departm… 24.9 0.500 1.60 -0.310 0.202
#> 6 Aust… Electri… 20.0 0.902 7.29 -0.247 0.324
#> 7 Aust… Food re… 97.7 0.984 9.13 -0.394 0.585
#> 8 Aust… Footwea… 7.38 0.760 4.64 -0.325 0.155
#> 9 Aust… Furnitu… 15.1 0.952 7.67 -0.190 0.163
#> 10 Aust… Hardwar… 13.0 0.957 7.67 -0.104 0.101
#> # … with 142 more rows, and 3 more variables: b_diff2_acf1 <dbl>,
#> # b_diff2_acf10 <dbl>, b_season_acf1 <dbl>
Created on 2019-07-08 by the reprex package (v0.3.0)
We also have `fablelite::feature_set()` to create a list of features based on tags.
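The column names in the output above (`a`, `b_acf1`, `b_acf10`, ...) suggest that the list name is used on its own for an unnamed result and as a prefix for a named one. Here is a base R sketch that mimics this naming on a single vector. `apply_features()` and `feat_ends()` are hypothetical helpers written for illustration, not fablelite's actual implementation.

```r
# Hypothetical helper (not fablelite code): applies a named list of functions
# to one vector and builds names the way the output above appears to.
apply_features <- function(x, features) {
  results <- lapply(names(features), function(nm) {
    res <- features[[nm]](x)
    if (is.null(names(res))) {
      # Unnamed result: use the list name (plus an index if length > 1).
      names(res) <- if (length(res) == 1) nm else paste(nm, seq_along(res), sep = "_")
    } else {
      # Named result: prefix each name with the list name.
      names(res) <- paste(nm, names(res), sep = "_")
    }
    res
  })
  unlist(results)
}

# Hypothetical feature returning a named vector, standing in for feat_acf().
feat_ends <- function(x) c(lo = min(x), hi = max(x))

names(apply_features(c(1, 2, 3, 4), list(a = mean, b = feat_ends)))
#> [1] "a"    "b_lo" "b_hi"
```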
Is this with a new version of feasts/fablelite? I get:
library(fablelite)
tsibbledata::aus_retail %>%
features(Turnover, features = list(a = mean, b = feasts::feat_acf))
#> Error: Argument 1 must have names
Created on 2019-07-08 by the reprex package (v0.2.1)
Yes, new version of fablelite pushed ~5 minutes ago.
Looks great to me:
library(feasts)
#> Loading required package: fablelite
#>
#> Attaching package: 'feasts'
#> The following object is masked from 'package:grDevices':
#>
#> X11
library(brolgar) # using use-feasts branch
wages_ts %>%
features(exper, list(diff_range = ~diff(range(.))))
#> # A tibble: 888 x 2
#> id diff_range
#> <int> <dbl>
#> 1 31 6.97
#> 2 36 9.28
#> 3 53 0.996
#> 4 122 9.04
#> 5 134 10.6
#> 6 145 6.86
#> 7 155 9.45
#> 8 173 6.21
#> 9 206 2.44
#> 10 207 9.78
#> # … with 878 more rows
wages_ts %>%
features(id, list(n_obs = length))
#> # A tibble: 888 x 2
#> id n_obs
#> <int> <int>
#> 1 31 8
#> 2 36 10
#> 3 53 8
#> 4 122 10
#> 5 134 12
#> 6 145 9
#> 7 155 11
#> 8 173 6
#> 9 206 3
#> 10 207 11
#> # … with 878 more rows
# nice naming too
wages_ts %>%
features_at(vars(uerate, exper),
list(avg = mean,
sd = sd))
#> # A tibble: 888 x 5
#> id uerate_avg uerate_sd exper_avg exper_sd
#> <int> <dbl> <dbl> <dbl> <dbl>
#> 1 31 3.21 0.710 3.38 2.51
#> 2 36 5.10 1.98 4.90 3.32
#> 3 53 4.43 1.34 1.11 0.297
#> 4 122 5.30 1.96 6.42 3.20
#> 5 134 5.72 1.63 5.43 3.59
#> 6 145 5.20 1.79 3.70 2.51
#> 7 155 6.87 3.40 5.84 3.22
#> 8 173 6.08 1.69 3.23 2.54
#> 9 206 8.83 2.37 3.00 1.23
#> 10 207 7.42 2.17 5.55 3.27
#> # … with 878 more rows
# nice naming too
wages_ts %>%
features_at(tsibble::measured_vars(.),
list(avg = mean,
sd = sd))
#> # A tibble: 888 x 15
#> id lnw_avg lnw_sd ged_avg ged_sd postexp_avg postexp_sd black_avg
#> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 31 1.75 0.277 1 0 3.38 2.51 0
#> 2 36 2.33 0.387 1 0 4.90 3.32 0
#> 3 53 1.89 0.562 0.75 0.463 0.172 0.274 0
#> 4 122 2.17 0.574 0 0 0 0 0
#> 5 134 2.48 0.321 0.667 0.492 2.31 2.65 0
#> 6 145 1.76 0.185 0.889 0.333 3.36 2.49 0
#> 7 155 2.17 0.362 0 0 0 0 0
#> 8 173 1.93 0.274 0 0 0 0 0
#> 9 206 2.27 0.228 0 0 0 0 0
#> 10 207 2.11 0.327 0 0 0 0 0
#> # … with 878 more rows, and 7 more variables: black_sd <dbl>,
#> # hispanic_avg <dbl>, hispanic_sd <dbl>, hgc_avg <dbl>, hgc_sd <dbl>,
#> # uerate_avg <dbl>, uerate_sd <dbl>
Created on 2019-07-08 by the reprex package (v0.2.1)
This is so great, it will involve re-writing many functions in `brolgar`, but this flexibility is wonderful.
Here's my crack at documenting how to add your own features:
To create your own features or summaries to pass to `feasts`, you can provide a named list of functions. For example:
```{r create-three}
library(feasts)
feat_three <- list(min = min,
                   med = median,
                   max = max)
feat_three
```
These are then passed to `features` like so:
```{r}
library(tsibbledata)
aus_retail %>%
  features(Turnover, feat_three)
```
Somewhat related, I've added a question about `feature_set` here https://github.com/tidyverts/fablelite/issues/89
Sounds good. This is the recommended interface for users.
Hello!
I'm interested in writing my own functions to pass to `feasts`, but I feel like I am missing something on how to create the function I want to pass to `features`. Here's a reprex:
Created on 2019-07-05 by the reprex package (v0.2.1)
I am probably missing something, but perhaps it might be useful to have some helpers around creating/validating feature functions? Perhaps something like:
- `validate_feature` to check if a function can be passed to `feature`
- `new_feature` to create a new feature
Once I understand this I'd be happy to contribute a vignette or something to explain how to create new features, if you like?
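As a rough idea of what such a helper could look like, here is a base R sketch. `validate_feature()` is only the name proposed in this issue, not an existing feasts or fablelite function, and the checks chosen here (callable, returns a non-empty atomic vector on some test data) are assumptions about what "valid" should mean.

```r
# Hypothetical sketch: validate_feature() is a name proposed in this issue,
# not an existing feasts/fablelite function.
validate_feature <- function(f, test_data = c(1, 2, 3, 4, 5)) {
  if (!is.function(f)) {
    stop("A feature must be a function.", call. = FALSE)
  }
  res <- f(test_data)
  # Assumed rule: a feature returns a non-empty atomic vector.
  if (!is.atomic(res) || length(res) == 0) {
    stop("A feature must return a non-empty atomic vector.", call. = FALSE)
  }
  invisible(f)
}

validate_feature(mean)                    # passes silently
validate_feature(function(x) range(x))    # passes silently
# validate_feature("mean")                # would error: not a function
# validate_feature(function(x) list(x))   # would error: result is not atomic
```

Returning the function invisibly would let the check sit inline when building a feature list, e.g. `list(avg = validate_feature(mean))`.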