Closed wkdavis closed 3 years ago
The response variable should match the name of the response variable for your model (not the point forecasts). You are correct that it seems a bit redundant at the moment, and may be removed in the near future. The response variable name will likely be stored as part of the distribution object, or we may require that the column name for the distributions matches the response variable name.
The new behaviour is that specifying a response variable here will update the response variable in the distributions. This shouldn't be an issue anymore:
library(tsibbledata)
library(tsibble)
#>
#> Attaching package: 'tsibble'
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, union
library(fable)
#> Loading required package: fabletools
library(fabletools)
aus <- tsibbledata::hh_budget
fit <- fabletools::model(aus, ARIMA = ARIMA(Debt))
fc_tsibble <- fit %>%
  fabletools::forecast(., h = 2) %>%
  as_tibble(.) %>%
  tsibble::as_tsibble(., key = c(Country, .model), index = Year)
fc_tsibble
#> # A tsibble: 8 x 5 [1Y]
#> # Key: Country, .model [4]
#> Country .model Year Debt .mean
#> <chr> <chr> <dbl> <dist> <dbl>
#> 1 Australia ARIMA 2017 N(215, 21) 215.
#> 2 Australia ARIMA 2018 N(221, 63) 221.
#> 3 Canada ARIMA 2017 N(188, 7) 188.
#> 4 Canada ARIMA 2018 N(192, 21) 192.
#> 5 Japan ARIMA 2017 N(106, 3.8) 106.
#> 6 Japan ARIMA 2018 N(106, 7.6) 106.
#> 7 USA ARIMA 2017 N(109, 11) 109.
#> 8 USA ARIMA 2018 N(110, 29) 110.
as_fable(fc_tsibble, response = ".mean", distribution = Debt)
#> # A fable: 8 x 5 [1Y]
#> # Key: Country, .model [4]
#> Country .model Year Debt .mean
#> <chr> <chr> <dbl> <dist> <dbl>
#> 1 Australia ARIMA 2017 N(215, 21) 215.
#> 2 Australia ARIMA 2018 N(221, 63) 221.
#> 3 Canada ARIMA 2017 N(188, 7) 188.
#> 4 Canada ARIMA 2018 N(192, 21) 192.
#> 5 Japan ARIMA 2017 N(106, 3.8) 106.
#> 6 Japan ARIMA 2018 N(106, 7.6) 106.
#> 7 USA ARIMA 2017 N(109, 11) 109.
#> 8 USA ARIMA 2018 N(110, 29) 110.
Created on 2021-01-08 by the reprex package (v0.3.0)
@mitchelloharawild Thanks!
No worries.
To clarify for future readers, there is a right and a wrong response variable to specify. The value for as_fable(response = <chr>) should match the name(s) of the response variable(s) from your data. So if you are using the above dataset and predicting Debt (with ARIMA(Debt) in this case), then you should use as_fable(..., response = "Debt").
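A minimal sketch of that call, reusing the data, model, and fc_tsibble name from the reprex above. It assumes a fabletools version with the as_fable() arguments shown in this thread; response_vars() and distribution_var() are fabletools accessors used here only to inspect the result, and the fc name is just for illustration.

```r
library(tsibble)
library(fable)  # also attaches fabletools

# Fit the same model as in the reprex above: Debt is the response variable.
fit <- model(tsibbledata::hh_budget, ARIMA = ARIMA(Debt))

# Drop the fable class by round-tripping through a tibble and a tsibble.
fc_tsibble <- fit %>%
  forecast(h = 2) %>%
  as_tibble() %>%
  as_tsibble(key = c(Country, .model), index = Year)

# response names the modelled variable (a character string);
# distribution names the <dist> column that holds the forecasts.
fc <- as_fable(fc_tsibble, response = "Debt", distribution = Debt)

# Inspect what the rebuilt fable records for each argument.
response_vars(fc)
distribution_var(fc)
```

Here response is passed as a string because it refers to the variable that was modelled, while distribution is passed as a bare column name because it selects a column of the tsibble.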
Based on an SO question. I believe the error messages in as_fable() could be a bit more informative for some cases. In this example you get an error saying that the column must be type<distribution> but instead it has type<distribution>.
Created on 2020-09-28 by the reprex package (v0.3.0)
In reality, or at least in my experience, the issue is that the response variable is set to the .mean column when it should be set to the distribution column (Debt in this case).
I think between the error message and the documentation it's not clear why the response and distribution arguments should be set to the same column. More broadly, if the response variable must be a distribution (per the error message), then what is the difference between the two arguments, and in what case would they ever be different columns?