robjhyndman / fpp3package

All data sets required for the examples and exercises in the book "Forecasting: principles and practice" (3rd ed, 2020) by Rob J Hyndman and George Athanasopoulos <http://OTexts.org/fpp3/>. All packages required to run the examples are also loaded.
http://pkg.robjhyndman.com/fpp3package/

Unexpected behaviour with using new_transformation #11

Closed AgentRichi closed 1 year ago

AgentRichi commented 1 year ago

When using a custom transformation as per Section 13.3, "Ensuring forecasts stay within limits", and defining the upper and lower limits as variables, the forecast method (and likely other methods too) looks these variables up in the global environment at the time it is called.

This means that if the variables have changed since the model was fit, for example because another time series was fit using a different set of limits, the forecast method will scale the forecast values incorrectly without raising an error or warning.

See example below.

library(fpp3)

scaled_logit <- function(x, lower = 0, upper = 1) {
  log((x - lower) / (upper - x))
}
inv_scaled_logit <- function(x, lower = 0, upper = 1) {
  (upper - lower) * exp(x) / (1 + exp(x)) + lower
}
my_scaled_logit <- new_transformation(
                    scaled_logit, inv_scaled_logit)

egg_prices <- prices |> filter(!is.na(eggs))

lower <- 0
upper <- max(egg_prices$eggs) * 1.1

fit_eggs <- egg_prices |>
  model(
    ETS(my_scaled_logit(eggs, lower = lower, upper = upper)
          ~ trend("A"))
  )

# works as intended
fit_eggs |>
  forecast(h = 50) |>
  autoplot(egg_prices) +
  labs(title = "Annual egg prices",
       y = "$US (in cents adjusted for inflation) ")

# another time series, setting a new upper limit
copper_prices <- prices |> filter(!is.na(copper))
upper <- max(copper_prices$copper) * 1.1

fit_copper <- copper_prices |>
  model(
    ETS(my_scaled_logit(copper, lower = lower, upper = upper)
          ~ trend("A"))
  )

# works as intended
fit_copper |>
  forecast(h = 50) |>
  autoplot(copper_prices) +
  labs(title = "Annual copper prices",
       y = "$US (in cents adjusted for inflation) ")

# INCORRECT FORECAST for first fit, but no error/warning
fit_eggs |>
  forecast(h = 50) |>
  autoplot(egg_prices) +
  labs(title = "Annual egg prices",
       y = "$US (in cents adjusted for inflation) ")

mitchelloharawild commented 1 year ago

Thanks for raising this. This is currently the intended behaviour, although I can see how it can be surprising. It is intended because transformation parameters are allowed to vary over time. Storing the transformation parameter(s) when the model is estimated would not allow them to be updated, so we look them up when producing a forecast(). The recommended (but rarely used) practice is to keep any transformation parameters inside the dataset you are using, which is useful if you need a different parameter for different series in the same dataset.
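The difference between looking a parameter up at call time and storing it when an object is created can be reproduced in base R, without fabletools. The sketch below only illustrates R's scoping rules and is not how fabletools is implemented; `lazy_scale`, `make_fixed_scale` and `fixed_scale` are hypothetical names chosen for this example.

```r
# Base R only; these helpers are hypothetical and not part of fabletools.

upper <- 10

# Looks up `upper` in the enclosing environment each time it is called.
lazy_scale <- function(x) x / upper

# Captures the value of `upper` at creation time inside a closure.
make_fixed_scale <- function(upper) {
  force(upper)  # evaluate the argument now rather than lazily
  function(x) x / upper
}
fixed_scale <- make_fixed_scale(upper)

# The variable changes after both functions were created.
upper <- 100

lazy_result <- lazy_scale(5)    # divides by the new value, 100
fixed_result <- fixed_scale(5)  # divides by the stored value, 10
print(c(lazy = lazy_result, fixed = fixed_result))
```

`lazy_scale` mirrors the lookup-at-forecast behaviour reported in this issue: its result changes silently when `upper` is reassigned. `fixed_scale` mirrors storing the parameter at estimation time, which is stable but cannot be updated afterwards.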

This issue is more appropriate for the fabletools package, and I've created an issue referencing this one for you: https://github.com/tidyverts/fabletools/issues/378