config-i1 / smooth

The set of functions used for time series analysis and in forecasting.

data-dependent effect on dimnames (tidymodels+ADAM)? #197

Closed Steviey closed 2 years ago

Steviey commented 2 years ago

tidymodels, not native

I know I have seen this error a couple of times. But I can't remember how I fixed it.... It seems to have something to do with line 780 in...

https://github.com/config-i1/smooth/blob/master/R/adam.R

How can I check the length of the components?

Error in matrix(NA, componentsNumberETS + componentsNumberARIMA + xregNumber +  : 
  length of 'dimnames' [1] not equal to array extent
In addition: There were 14 warnings (use warnings() to see them)
     ▆
  1. ├─base::do.call(doModelPlots, list()) at R/adthocStrat/ensShell.R:297:0
  2. ├─global `<fn>`()
  3. │ └─wflw %>% fit(allData) at R/adthocStrat/ensShell.R:284:8
  4. ├─generics::fit(., allData)
  5. ├─workflows:::fit.workflow(., allData)
  6. │ └─workflows::.fit_model(workflow, control)
  7. │   ├─generics::fit(action_model, workflow = workflow, control = control)
  8. │   └─workflows:::fit.action_model(...)
  9. │     └─workflows:::fit_from_xy(spec, mold, control_parsnip)
 10. │       ├─generics::fit_xy(...)
 11. │       └─parsnip::fit_xy.model_spec(...)
 12. │         └─parsnip:::xy_xy(...)
 13. │           ├─base::system.time(...)
 14. │           └─parsnip:::eval_mod(...)
 15. │             └─rlang::eval_tidy(e, ...)
 16. ├─modeltime::adam_fit_impl(...)
 17. │ └─rlang::eval_tidy(fit_call)
 18. ├─smooth::adam(...)
 19. │ └─smooth estimator(...)
 20. │   └─smooth creator(...)
 21. │     └─base::matrix(...)
 22. └─global `<fn>`()
 23.   └─lobstr::cst() 

Does anyone remember?

https://stackoverflow.com/questions/12985653/what-does-length-of-dimnames-1-not-equal-to-array-extent-mean
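For reference, a minimal base-R sketch of what this message means. It does not use smooth at all; the row and column names are made up for illustration. The `[1]` in the message refers to the first dimension (rows), so in the `matrix()` call from the traceback it would suggest that the number of row names supplied differs from `componentsNumberETS + componentsNumberARIMA + xregNumber + ...`.

```r
# A 2 x 3 matrix needs exactly 2 row names and 3 column names.
ok <- matrix(NA, nrow = 2, ncol = 3,
             dimnames = list(c("level", "x1"), paste0("t", 1:3)))

# Supplying three row names for two rows raises the same kind of error
# as in the traceback; tryCatch() captures the message instead of stopping.
msg <- tryCatch(
  matrix(NA, nrow = 2, ncol = 3,
         dimnames = list(c("level", "x1", "x2"), paste0("t", 1:3))),
  error = function(e) conditionMessage(e)
)
print(msg)
```

So one way to "check the length of the components" is to compare the number of names against the intended number of rows before the matrix is built.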

Noticed 1: The error disappears if I modify the threshold for correlated predictors, e.g. step_corr(all_numeric(), -all_outcomes(), threshold = 0.10). This reduces the number of predictors by filtering on their correlations.

This leads me to assume that the problem is related to the input data, so it is probably not a real bug. I can't provide the live data, and simulated data will not guarantee a reproducible example in this case. :-)
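To illustrate why the threshold from "Noticed 1" changes which predictors reach the model, here is a simplified base-R stand-in for a correlation filter. It is not meant to mirror the exact procedure of step_corr(); the helper `drop_correlated()` and the simulated columns are hypothetical.

```r
set.seed(1)
n  <- 100
x1 <- rnorm(n)
preds <- data.frame(x1 = x1,
                    x2 = x1 + rnorm(n, sd = 0.1),  # almost a copy of x1
                    x3 = rnorm(n))                 # unrelated to x1 and x2

# Hypothetical helper: repeatedly drop the column with the largest mean
# absolute correlation until no pair exceeds the threshold.
drop_correlated <- function(df, threshold) {
  keep <- names(df)
  repeat {
    if (length(keep) < 2) break
    cm <- abs(cor(df[keep]))
    diag(cm) <- 0
    if (max(cm) <= threshold) break
    worst <- names(which.max(colMeans(cm)))
    keep  <- setdiff(keep, worst)
  }
  keep
}

kept_loose  <- drop_correlated(preds, threshold = 0.9)
kept_strict <- drop_correlated(preds, threshold = 0.1)
print(kept_loose)
print(kept_strict)
```

A lower threshold can only remove more columns, so a predictor that triggers the error may simply be filtered out before the model is fitted.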

Just for interest...

Noticed 2: The parameters that reproduce 'the effect' with my own data were:

[1] "ANN"
[1] "likelihood"
[1] "ds"
corr-threshold = 0.5
The test consisted of 21 predictors; all data was numeric and median-imputed.
... I will try to narrow it down to a specific predictor....

Result: The effect is reproducible with a normalized data column (predictor) called "n2Norm".
Renaming it to 'myNormalizedVar' lets the code run through (strange but true :-)).
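The drill-down described above can be sketched as a leave-one-out loop over the predictors. This is only an outline: `find_culprits()` and `toy_fit()` are hypothetical names, and `toy_fit()` is a stand-in that fails on purpose so the example runs without smooth or the live data. In practice it would be replaced by the real model call.

```r
# Hypothetical helper: refit while leaving out one predictor at a time and
# return the predictors whose removal makes the error go away.
find_culprits <- function(data, response, fit_fun) {
  predictors <- setdiff(names(data), response)
  failed <- vapply(predictors, function(p) {
    reduced <- data[, setdiff(names(data), p), drop = FALSE]
    inherits(try(fit_fun(reduced), silent = TRUE), "try-error")
  }, logical(1))
  names(failed)[!failed]
}

# Stand-in for the real fit, e.g.
# function(d) smooth::adam(d, "ANN", silent = TRUE, h = 1, holdout = FALSE)
# It fails whenever both columns are present (purely illustrative).
toy_fit <- function(d) {
  if (all(c("n2", "n2Norm") %in% names(d))) stop("toy failure")
  TRUE
}

toy <- data.frame(value = 1:5, n2 = 5:1, week = 1:5, n2Norm = (1:5) / 5)
culprits <- find_culprits(toy, "value", toy_fit)
print(culprits)
```

With 21 predictors this needs 21 refits, which is usually cheaper than inspecting the data by hand.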

config-i1 commented 2 years ago

Please provide a simple reproducible example. I cannot do anything unless I can reproduce it. There can be many reasons why that error appeared, so pointing to a specific line doesn't help.

Steviey commented 2 years ago

Update: With that result in mind, I will provide a native test as soon as possible. It's not tidymodels. I can only reproduce it with live data. Very strange effect. Moreover, what does this have to do with the column names mentioned above? It could be an indicator of multiple causes.

            library(smooth)   # adam()
            library(dplyr)    # %>%, relocate(), select(), any_of()

            # Simulated frame with the same column names as the live data
            idx  <- 120
            y    <- rnorm(idx, mean=15, sd=5)
            x1   <- rnorm(idx, mean=15, sd=5)
            x2   <- rnorm(idx, mean=15, sd=5)
            x3   <- rep(seq(0,1,0.1), length.out=idx)   # exactly idx values (rep(x3, idx) gave 11*idx)
            data <- data.frame(n2=x1, week=x2, n2Norm=x3, value=y, stringsAsFactors=FALSE)
            data <- data %>% dplyr::relocate(value)
            dataLength <- nrow(data)

            #-------------orig. live-data (same type, probably other distribution)
            data1 <- as.data.frame(allData)   # allData: live data, not included here
            data1 <- tail(data1, n=120)
            #-------------orig. live-data (same type, probably other distribution)

            colVect <- c('value','n2','week','n2Norm')
            data    <- data %>% dplyr::select(any_of(colVect))
            data    <- tail(data, n=120)

            # Swap the simulated columns for the live ones
            data[['n2Norm']] <- data1[['n2Norm']]
            data[['week']]   <- data1[['week']]
            #data[['n2']]    <- data1[['n2']]   # model will not be fitted if uncommented
            data[['value']]  <- data1[['value']]

            myModel <- adam(data, "ANN", silent=TRUE, h=1, holdout=FALSE)
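Since renaming "n2Norm" changes the outcome and the frame also contains a column called "n2", one cheap check is whether any column name is a prefix of another. Whether overlapping names are actually the cause is not established in this thread; the helper `overlapping_names()` below is a hypothetical diagnostic in base R, nothing more.

```r
# Hypothetical diagnostic: list pairs where one name is a prefix of another.
overlapping_names <- function(nms) {
  pairs <- expand.grid(short = nms, long = nms, stringsAsFactors = FALSE)
  pairs <- pairs[pairs$short != pairs$long, ]
  pairs[startsWith(pairs$long, pairs$short), ]
}

hits <- overlapping_names(c("value", "n2", "week", "n2Norm"))
print(hits)
```

If a pair shows up, renaming one of the two columns and refitting is a quick way to see whether the names matter for a given data set.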


Ideas: