srlanalytics / bdfm

Bayesian dynamic factor model estimation and predictive statistics including nowcasting and forecasting
MIT License

Exploding Forecast of T10Y3M #56

Closed by christophsax 5 years ago

christophsax commented 5 years ago
library(bdfm)
m <- dfm(data = econ_us, factors = 3, pre_differenced = "A191RL1Q225SBEA", store_idx = "A191RL1Q225SBEA")
tsbox::ts_plot(predict(m)[, 'T10Y3M'])

Created on 2019-03-24 by the reprex package (v0.2.1)

christophsax commented 5 years ago

I guess this comes back to the usual problem of using non-stationary series in the BDFM. It seems that your automatic diff algorithm is not aggressive enough about differencing.

Original Manual Log Diff Choice

From your original example:

logs0 <- c(
  "W068RCQ027SBEA",
  "PCEDG",
  "PCEND",
  "JTSJOL",
  "INDPRO",
  "CSUSHPINSA",
  "HSN1F",
  "TSIFRGHT",
  "IPG2211S",
  "DGORDER",
  "AMTMNO",
  "CPILFESL",
  "ICSA"
)

diffs0 <- c(
  "W068RCQ027SBEA",
  "PCEDG",
  "PCEND",
  "UMCSENT",
  "UNRATE",
  "JTSJOL",
  "INDPRO",
  "CSUSHPINSA",
  "HSN1F",
  "TSIFRGHT",
  "FRGSHPUSM649NCIS",
  "CAPUTLG2211S",
  "IPG2211S",
  "DGORDER",
  "AMTMNO",
  "MNFCTRIRSA",
  "RETAILIRSA",
  "WHLSLRIRSA",
  "CPILFESL",
  "ICSA",
  "TWEXB",
  "T10Y3M"
)

library(bdfm)
m <- dfm(data = econ_us, logs = logs0, diffs = diffs0, factors = 3, pre_differenced = "A191RL1Q225SBEA", store_idx = "A191RL1Q225SBEA")

Which leads to a nice forecast:

tsbox::ts_plot(predict(m)[, 'T10Y3M'])

Automatic Log Diff Choice

However, with the automatic choice, the forecast goes nuts:

m <- dfm(data = econ_us, factors = 3, pre_differenced = "A191RL1Q225SBEA", store_idx = "A191RL1Q225SBEA")
tsbox::ts_plot(predict(m)[, 'T10Y3M'])

The same happens if we pass the automatically chosen logs and diffs manually (so technically, this works as advertised):

logs = c(
  "W068RCQ027SBEA",
  "PCEND",
  "CSUSHPINSA"
)
diffs = c(
  "W068RCQ027SBEA",
  "PCEDG",
  "PCEND",
  "UNRATE",
  "JTSJOL",
  "INDPRO",
  "CSUSHPINSA",
  "TSIFRGHT",
  "IPG2211S",
  "AMTMNO",
  "RETAILIRSA",
  "TWEXB"
)

m <- dfm(data = econ_us, logs = logs, diffs = diffs, factors = 3, pre_differenced = "A191RL1Q225SBEA", store_idx = "A191RL1Q225SBEA")
tsbox::ts_plot(predict(m)[, 'T10Y3M'])

Every automatic log and diff choice is also in the manual set, but the manual set has many more. It seems the automatic rule simply doesn't log and diff enough:

setdiff(logs, logs0)
#> character(0)
setdiff(logs0, logs)
#>  [1] "PCEDG"    "JTSJOL"   "INDPRO"   "HSN1F"    "TSIFRGHT" "IPG2211S"
#>  [7] "DGORDER"  "AMTMNO"   "CPILFESL" "ICSA"

setdiff(diffs, diffs0)
#> character(0)
setdiff(diffs0, diffs)
#>  [1] "UMCSENT"          "HSN1F"            "FRGSHPUSM649NCIS"
#>  [4] "CAPUTLG2211S"     "DGORDER"          "MNFCTRIRSA"      
#>  [7] "WHLSLRIRSA"       "CPILFESL"         "ICSA"            
#> [10] "T10Y3M"

Created on 2019-03-24 by the reprex package (v0.2.1)

srlanalytics commented 5 years ago

I tightened the rule for automatic differencing: the estimated AR(1) coefficient must now be three standard deviations below one (one being a random walk and therefore not stationary) for a series to be left undifferenced. That seems very aggressive but is actually still less strict than the differencing we specified manually, so it's probably fine. It should certainly err on the side of stationary results.
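
For readers who want to see what such a rule might look like, here is a minimal sketch in base R. It is not bdfm's actual implementation: the helper name `needs_diff`, the OLS regression of the series on its own lag, and the use of the coefficient's standard error as the "standard deviation" are all assumptions made for illustration.

```r
# Hypothetical helper, NOT part of bdfm: flag a series for differencing
# unless its estimated AR(1) coefficient is at least `n_sd` standard
# errors below one.
needs_diff <- function(x, n_sd = 3) {
  x <- as.numeric(x)
  x <- x[!is.na(x)]          # simplification: drops NAs, ignoring gaps
  y    <- x[-1]              # x_t
  ylag <- x[-length(x)]      # x_{t-1}
  fit  <- lm(y ~ ylag)       # AR(1) with intercept, estimated by OLS
  est  <- coef(summary(fit))["ylag", ]
  rho  <- est[["Estimate"]]
  se   <- est[["Std. Error"]]
  # TRUE means "difference this series"
  rho + n_sd * se >= 1
}

set.seed(1)
random_walk <- cumsum(rnorm(500))  # unit root: usually flagged
white_noise <- rnorm(500)          # clearly stationary: not flagged

needs_diff(random_walk)
needs_diff(white_noise)
```

Raising `n_sd` makes the rule differencing-happy: more borderline series (such as interest-rate spreads like T10Y3M) get differenced, which is the direction the change above moves in.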

christophsax commented 5 years ago

This is resolved:

library(bdfm)
m <- dfm(data = econ_us, factors = 3, pre_differenced = "A191RL1Q225SBEA", keep_posterior = "A191RL1Q225SBEA")

tsbox::ts_plot(predict(m)[, 'T10Y3M'])

Created on 2019-06-15 by the reprex package (v0.2.1)