tidyverts / fabletools

General fable features useful for extension packages
http://fabletools.tidyverts.org/

Error: Problem with `mutate()` column `ols`. ℹ `ols = (function (object, ...) ...`. x variable names are limited to 10000 bytes #336

Open PSQR opened 2 years ago

PSQR commented 2 years ago

Hello,

I am receiving an error when using the reconcile() function. I am performing hierarchical forecasting on a data set with many hierarchical levels.

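library(fable)   # model(), ARIMA(), reconcile(), min_trace(), forecast()
library(tictoc)  # tic() / toc() timing helpers

tic()

# Fit one ARIMA model per series in the hierarchy
fit <- EMEA_shipments_08 %>%
  model(arima = ARIMA(ZCOPASMG))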

toc()

tic()

fc <- fit %>%
  reconcile(
    ols = min_trace(arima, method="ols")
    #wlsv = min_trace(arima, method="wls_var"),
    #wlss = min_trace(arima, method="wls_struct"),
    #mint_c = min_trace(ets, method="mint_cov"),
    #mint_s = min_trace(arima, method="mint_shrink")
  ) %>%
  forecast(h = "2 years")

Error: Problem with `mutate()` column `ols`.
ℹ `ols = (function (object, ...) ...`.
x variable names are limited to 10000 bytes
Run `rlang::last_error()` to see where the error occurred.

I suspect this error occurs because my hierarchy has 7 columns, each with thousands of levels. I know this is not reproducible code, but I think the description of my problem is clear.

Is there a maximum size to a hierarchy that the reconcile function is intended to work with? Is it possible that the source code will be changed to accommodate larger hierarchies?

I am curious to learn more about this topic, and I love the fable package and its functionality.

mitchelloharawild commented 2 years ago

This is odd, and without your data it is tricky to determine why this is happening. The issue appears to be that the column name being created contains a very large number of characters (likely the contents of an object, rather than its name?)
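
For context, the limit in the error message comes from R itself, not from fabletools or dplyr: a symbol (variable name) cannot exceed 10000 bytes. A minimal base R sketch of that limit is below. The names `ok_name`, `too_long`, `sym_ok` and `err` are made up for illustration, and this does not reproduce the reconcile() failure, only the underlying message.

```r
# Base R only: symbols (variable names) may be at most 10000 bytes long.
ok_name  <- strrep("a", 10000)  # exactly at the limit
too_long <- strrep("a", 10001)  # one byte over the limit

# Creating a symbol at the limit succeeds.
sym_ok <- as.name(ok_name)

# Creating a symbol over the limit raises an error; capture it
# instead of stopping the script.
err <- tryCatch(as.name(too_long), error = function(e) e)

inherits(err, "error")
conditionMessage(err)
```

Anything that turns a very long string into a column or variable name would hit the same limit, which is consistent with the guess that an object's contents are being used where a name is expected.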

If you're able to reproduce this with some dummy data you can provide, that would be very helpful.

PSQR commented 2 years ago

Here is the full context of the error. I am going to try to re-create with dummy data today.
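
As a starting point for that, here is a rough sketch of what a dummy hierarchy could look like. Everything in it is invented for illustration (the `region`, `product` and `sales` columns, the two-level `region / product` structure, and the simulated values), and it is far smaller than the real data, so it is not expected to trigger the error as written. It assumes the fable, tsibble and dplyr packages are installed; ARIMA() may also need feasts and urca.

```r
library(fable)    # model(), ARIMA(), reconcile(), min_trace(), forecast()
library(tsibble)  # tsibble(), yearmonth(), n_keys()
library(dplyr)    # %>%

set.seed(1)

# Hypothetical dummy data: 2 regions x 3 products, 36 months each.
months <- yearmonth(seq(as.Date("2018-01-01"), by = "month", length.out = 36))

dummy <- tsibble(
  month   = rep(months, times = 6),
  region  = rep(c("north", "south"), each = 36 * 3),
  product = rep(rep(c("a", "b", "c"), each = 36), times = 2),
  sales   = rpois(36 * 6, lambda = 100),
  key     = c(region, product),
  index   = month
)

# Build the hierarchy: total, per region, and per region/product.
agg <- dummy %>%
  aggregate_key(region / product, sales = sum(sales))

# Same pipeline as in the report, applied to the dummy hierarchy.
fc <- agg %>%
  model(arima = ARIMA(sales)) %>%
  reconcile(ols = min_trace(arima, method = "ols")) %>%
  forecast(h = "2 years")
```

To get closer to the reported setup, the number of key columns and the number of levels per column could be increased step by step until the error appears.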


> rlang::last_error()
<error/dplyr:::mutate_error>
Problem with `mutate()` column `ols`.
ℹ `ols = (function (object, ...) ...`.
x variable names are limited to 10000 bytes
Backtrace:
  1. fabletools::reconcile(., ols = min_trace(arima, method = "ols"))
  2. fabletools::forecast(., h = "2 years")
 39. dplyr:::h(simpleError(msg, call))
Run `rlang::last_trace()` to see the full context.
> rlang::last_trace()
<error/dplyr:::mutate_error>
Problem with `mutate()` column `ols`.
ℹ `ols = (function (object, ...) ...`.
x variable names are limited to 10000 bytes
Backtrace:
     █
  1. ├─`%>%`(...)
  2. │ ├─base::withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
  3. │ └─base::eval(quote(`_fseq`(`_lhs`)), env, env)
  4. │   └─base::eval(quote(`_fseq`(`_lhs`)), env, env)
  5. │     └─`_fseq`(`_lhs`)
  6. │       └─magrittr::freduce(value, `_function_list`)
  7. │         ├─base::withVisible(function_list[[k]](value))
  8. │         └─function_list[[k]](value)
  9. │           ├─fabletools::forecast(., h = "2 years")
 10. │           └─fabletools:::forecast.mdl_df(., h = "2 years")
 11. │             └─dplyr::mutate_at(...)
 12. │               ├─dplyr::mutate(.tbl, !!!funs)
 13. │               └─dplyr:::mutate.data.frame(.tbl, !!!funs)
 14. │                 └─dplyr:::mutate_cols(.data, ..., caller_env = caller_env())
 15. │                   ├─base::withCallingHandlers(...)
 16. │                   └─mask$eval_all_mutate(quo)
 17. ├─(function (object, ...) ...
 18. ├─fabletools:::forecast.lst_mint_mdl(...)
 19. │ ├─base::unname(...)
 20. │ ├─base::as.matrix(...)
 21. │ └─fabletools:::reduce(res, full_join, by = index_var(res[[1]]))
 22. │   └─base::Reduce(f, .x, init = .init)
 23. │     └─fabletools:::f(init, x[[i]])
 24. │       ├─dplyr:::.f(x, y, ...)
 25. │       └─dplyr:::full_join.data.frame(x, y, ...)
 26. │         └─dplyr:::join_mutate(...)
 27. │           └─dplyr::dplyr_reconstruct(out, x)
 28. │             ├─dplyr:::dplyr_reconstruct_dispatch(data, template)
 29. │             └─tsibble:::dplyr_reconstruct.tbl_ts(data, template)
 30. │               └─tsibble:::update_meta(...)
 31. │                 └─tsibble:::retain_tsibble(new, key = key_vars(old), index = index(old))
 32. │                   └─tsibble:::duplicated_key_index(data, key, index)
 33. │                     ├─dplyr::summarise(keyed_data, `:=`(!!"zzz", vec_duplicate_any(!!sym(index))))
 34. │                     └─dplyr:::summarise.data.frame(keyed_data, `:=`(!!"zzz", vec_duplicate_any(!!sym(index))))
 35. │                       └─dplyr:::summarise_cols(.data, ..., caller_env = caller_env())
 36. │                         └─DataMask$new(.data, caller_env)
 37. │                           └─.subset2(public_bind_env, "initialize")(...)
 38. └─base::.handleSimpleError(...)
 39.   └─dplyr:::h(simpleError(msg, call))
<error/simpleError>
variable names are limited to 10000 bytes