earowang closed this issue 5 years ago
Moved to fablelite, where features and tagging are implemented.
Features are registered using register_feature(), which allows users to bind features from their global environment.
https://github.com/tidyverts/feasts/blob/98d906d0d679d2f18d92d3548ea4e32160a362c7/R/zzz.R#L7-L9
This usage is supported, although I doubt it will be commonly used.
How is features_set() used?
library(feasts)
#> Loading required package: fablelite
#>
#> Attaching package: 'feasts'
#> The following object is masked from 'package:grDevices':
#>
#> X11
as_tsibble(USAccDeaths) %>%
features(log(value), feature_set(tags = "autocorrelation"))
#> # A tibble: 1 x 11
#> x_acf1 x_acf10 diff1_acf1 diff1_acf10 diff2_acf1 diff2_acf10 seas_acf1
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 0.698 1.22 0.00613 0.280 -0.494 0.781 0.642
#> # … with 4 more variables: x_pacf5 <dbl>, diff1x_pacf5 <dbl>,
#> # diff2x_pacf5 <dbl>, seas_pacf <dbl>
as_tsibble(USAccDeaths) %>%
features(log(value), feature_set(package = "feasts"))
#> # A tibble: 1 x 18
#> trend_strength seasonal_streng… spike linearity curvature
#> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 0.794 0.944 1.61e-10 -0.228 0.304
#> # … with 13 more variables: seasonal_peak_year <dbl>,
#> # seasonal_trough_year <dbl>, x_acf1 <dbl>, x_acf10 <dbl>,
#> # diff1_acf1 <dbl>, diff1_acf10 <dbl>, diff2_acf1 <dbl>,
#> # diff2_acf10 <dbl>, seas_acf1 <dbl>, x_pacf5 <dbl>, diff1x_pacf5 <dbl>,
#> # diff2x_pacf5 <dbl>, seas_pacf <dbl>
Created on 2019-06-09 by the reprex package (v0.2.1)
Did you say it can be multiple packages? Can we rename package to pkgs instead?
What does the x_ prefix indicate?
Yes, multiple packages are supported - I can rename this arg.
The x_ prefix is defined in acf_features and pacf_features from the tsfeatures package. It is used to differentiate ACF values on the data (x), the first differences (diff1) and the seasonal differences (seas).
Not my choice - happy to change.
Without x_, these names are still unique, aren't they?
Correct.
Can we remove the prefix then?
Done. There are many unusual choices made in individual feature functions.
Can I use log() in scoped variants?
as_tsibble(USAccDeaths) %>%
features_at(log(value), feature_set(tags = "autocorrelation"))
Nope. Scoped variants use tidyselect semantics, similar to summarise_at.
But you'd like to keep log() in features()?
Yes - it's very useful for quickly exploring your data.
For example, the workflow for identifying the differencing needed to make a time series stationary involves computing features on various differences.
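A minimal base-R sketch of that kind of exploration, deliberately not using feasts: it computes the lag-1 autocorrelation of log(USAccDeaths) at increasing orders of differencing. The helper acf1() is a hypothetical name introduced only for this illustration; the labels x, diff1 and diff2 mirror the ones used in the output above.

```r
# Lag-1 autocorrelation of a series (hypothetical helper, not part of feasts).
acf1 <- function(x) {
  # acf() returns lags 0, 1, ...; the second element is lag 1.
  as.numeric(stats::acf(x, lag.max = 1, plot = FALSE)$acf[2])
}

y <- log(USAccDeaths)

acf_by_diff <- c(
  x     = acf1(y),                        # log series as-is
  diff1 = acf1(diff(y)),                  # first differences
  diff2 = acf1(diff(y, differences = 2))  # second differences
)

print(round(acf_by_diff, 3))
```

Comparing the values side by side is one informal way to judge how much differencing a series might need; it is not a substitute for a formal test.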
I'm okay with it, just pointing out the inconsistency. Can you print out a list of output names using feat_available(), maybe in a separate issue, to check if the names are okay?
Have a look at the docs for feature_set when feasts is loaded.
I think it's better to keep this functionality in the docs, as it will allow better linking to the feature's documentation.
I'm planning on adding docs for features_by_package and features_by_tag, which will be linked to via features() and feature_set(). Currently only features_by_package is implemented, and is shown in the ?feature_set docs under the "Features" section.
For your easy viewing:
Use feat_* or features_*?
feat_*, haven't changed yet.
I'm talking about the feature names not function names.
I'm confused by what you mean here.
I mean output names
#> # A tibble: 1 x 18
#> trend_strength seasonal_streng… spike linearity curvature
#> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 0.794 0.944 1.61e-10 -0.228 0.304
#> # … with 13 more variables: seasonal_peak_year <dbl>,
#> # seasonal_trough_year <dbl>, x_acf1 <dbl>, x_acf10 <dbl>,
#> # diff1_acf1 <dbl>, diff1_acf10 <dbl>, diff2_acf1 <dbl>,
#> # diff2_acf10 <dbl>, seas_acf1 <dbl>, x_pacf5 <dbl>, diff1x_pacf5 <dbl>,
#> # diff2x_pacf5 <dbl>, seas_pacf <dbl>
Prefix everything by feat_? Why? Seems far too verbose.
No, that was about the functions.
If people want to add prefixes, they can do so using the list names.
library(feasts)
#> Loading required package: fablelite
#>
#> Attaching package: 'feasts'
#> The following object is masked from 'package:grDevices':
#>
#> X11
as_tsibble(USAccDeaths) %>%
features(log(value), list(feat = features_stl))
#> # A tibble: 1 x 7
#> feat_trend_stre… feat_seasonal_s… feat_spike feat_linearity
#> <dbl> <dbl> <dbl> <dbl>
#> 1 0.794 0.944 1.61e-10 -0.228
#> # … with 3 more variables: feat_curvature <dbl>,
#> # feat_seasonal_peak_year <dbl>, feat_seasonal_trough_year <dbl>
Created on 2019-06-09 by the reprex package (v0.2.1)
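The prefixing shown above can be sketched in base R. This is only an illustration of the naming idea, not how feasts or fablelite implement it: apply_features() and toy_summary() are hypothetical names, and the sketch works on a plain numeric vector rather than a tsibble.

```r
# Apply a list of feature functions to x, prefixing each output name with the
# corresponding list name. apply_features() is hypothetical, not feasts code.
apply_features <- function(x, fns) {
  nms <- names(fns)
  if (is.null(nms)) nms <- rep("", length(fns))
  out <- Map(function(fn, nm) {
    res <- fn(x)
    # Entries without a list name keep their original output names.
    if (nzchar(nm)) names(res) <- paste(nm, names(res), sep = "_")
    res
  }, fns, nms)
  # Drop the outer list names so unlist() does not add a second prefix.
  unlist(unname(out))
}

# A toy feature function returning a named numeric vector.
toy_summary <- function(x) c(mean = mean(x), sd = stats::sd(x))

res <- apply_features(log(as.numeric(USAccDeaths)), list(feat = toy_summary))
print(names(res))
```

Passing an unnamed list, such as list(toy_summary), leaves the output names as the feature function produced them.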
I'd like to see the default column names for all available features.
I'm closing this.
Do you mean as a user, or for thinking about the problem now as a developer?
If needed for the user, I think it should be detailed in ?features_stl. If you need it, I'd look through the final line of each function in https://github.com/tidyverts/feasts/blob/master/R/features.R and https://github.com/tidyverts/feasts/blob/master/R/hctsa_features.R
I mean in general: whether the output names are informative or not.
For example, arch_lm -> rsquared_arch.
We don't have feat_all() to obtain all the features?
feat_all == feature_set, as feature_set(pkgs = NULL, tags = NULL) is the default.
Regarding arch_lm, I've renamed it to stat_arch_lm. I don't think rsquared_arch is appropriate as the rsquared is more about implementation. It is a statistic for the LM test for ARCH.
As discussed on Friday, we'll have a features tagging system. But now I'm thinking the interface should be features_set(tags, envs). Features live not only in packages but also in the global environment, where users define their own feature functions.
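To make the proposal concrete, here is a toy base-R sketch of a tag-based registry that can be filtered by tag and by the environment a feature function was defined in. Everything in it is an assumption for illustration: register_toy_feature() and toy_feature_set() are hypothetical names, and this is not the register_feature() or feature_set() implementation that was eventually adopted.

```r
# A toy feature registry. All names below are hypothetical.
registry <- new.env()

# Store a feature function with its tags and the name of its defining environment.
register_toy_feature <- function(name, fn, tags) {
  entry <- list(fn = fn, tags = tags, env = environmentName(environment(fn)))
  assign(name, entry, envir = registry)
  invisible(name)
}

# Return the registered functions matching any of `tags` and any of `envs`.
# NULL means "no filter", so the default returns every registered feature.
toy_feature_set <- function(tags = NULL, envs = NULL) {
  entries <- mget(ls(registry), envir = registry)
  keep <- vapply(entries, function(e) {
    (is.null(tags) || any(e$tags %in% tags)) &&
      (is.null(envs) || e$env %in% envs)
  }, logical(1))
  lapply(entries[keep], `[[`, "fn")
}

# One feature from a package namespace, one user-defined.
register_toy_feature("median", stats::median, tags = "location")
register_toy_feature("range_width", function(x) diff(range(x)),
                     tags = c("spread", "custom"))

print(names(toy_feature_set(tags = "custom")))
```

Keeping NULL as "no filter" for both arguments matches the idea that calling the function with its defaults should return every available feature.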