Closed Steviey closed 2 years ago
Hi,
I'm not sure. Can you give me a reproducible example?
Hi Ivan,
it's highly abstracted tidymodels/modeltime-ensemble code. I'm currently focusing on state space models, integrated in avg/median/weighted and stacked ensembles, together with ML algos and meta-learners.
Basically, I try to grid-search the parameters of the es model of smooth. I want to define very wide search spaces for the key params.
The challenge is to make sure modeltime translates the unified params to smooth the right way. So I don't know what is meant by initparams. Once I know, I can check whether there is a chance to tweak it along the tidyverse.
As you know, I already have an extensive standalone version of smooth running, incl. grid search and xRegs. The ensemble approach is another experiment: trying to combine the best performing ML algos with ets and others programmatically. :-)
` algoTrainCase <- 'gridLatin'
ret$modelObj <- exp_smoothing(
  error = tune()
  ,trend = tune()
  ,season = tune()
  ,damping = tune()
  ,smooth_level = tune()
  ,smooth_trend = tune()
) %>% set_engine("smooth_es",holdout=FALSE)
ret$finalParams <- ret$modelObj %>% parameters() #%>% finalize(x=trainPart)
ret$finalParams <- updateMyParams(ret$finalParams,subModel,taskSwitch,trainSetupList) `
initparams() is an internal function written in C++ to process the input parameters in ETS and create the matrices for the state space model. The user typically doesn't have access to these. If an error occurs with this, then there is either something wrong with the function or with the call to es().
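The error quoted in this thread complains about `type=character; target=double`, so one plausible (unconfirmed) reading is that a character value reached an argument that expects a number. A minimal base R sketch of a pre-flight check follows. The helper `find_non_numeric` and the example parameter list are hypothetical and not part of smooth or modeltime.

```r
# Hypothetical helper (not part of smooth or modeltime): returns the names of
# arguments that are expected to be numeric but are not.
find_non_numeric <- function(params, numeric_names) {
  present <- intersect(numeric_names, names(params))
  bad <- vapply(params[present], function(v) !is.numeric(v), logical(1))
  names(bad)[bad]
}

# Example parameter set; smooth_trend is a string by mistake.
params <- list(
  error = "additive",
  trend = "additive",
  smooth_level = 0.2,
  smooth_trend = "0.1"
)

bad_params <- find_non_numeric(
  params,
  c("smooth_level", "smooth_trend", "smooth_seasonal")
)
print(bad_params)
```

Running such a check on each grid row before the fit would at least show which definition carries the wrong type.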
Interesting, I will observe it further. In general I can say that smooth-ets incl. xregs works as a submodel in a modeltime ensemble. But I have no idea if it works as intended :-). I had to implement some fallback param combinations to avoid code breaks. Measurements look 'normal'. I also have to mention that my system is getting old now (Ubuntu 16.x). So nothing is representative here. :-)
Update: Since I can replicate the error now, it seems to be related to the call to es(). The third-party functions doing this are here... https://business-science.github.io/modeltime/reference/exp_smoothing.html To avoid the error, we have to choose the right call.
If we keep choosing a wrong call while training (often a random process), we should have one call as a 'fallback' which never fails, to avoid code breaks. If we choose anova as the training method, we should provide two working 'fallback calls'.
This can be done by injecting working param combinations for the call to es() into the search grid. This way we can iterate day in, day out :-).
Another way is to restrict the params directly via modeltime's update functions (dials extensions), in relation to the individual data structure.
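The first strategy (injecting fallback combinations into the search grid) could look roughly like this in base R. It is a sketch under assumptions: the candidate grid is a random stand-in for whatever grid function is used, and the two "safe" definitions are placeholders, since which definitions never fail depends on the data.

```r
# Randomly sampled candidate grid (stand-in for a Latin hypercube or similar).
set.seed(1)
candidates <- data.frame(
  error = sample(c("additive", "multiplicative"), 5, replace = TRUE),
  trend = sample(c("additive", "multiplicative", "none"), 5, replace = TRUE),
  season = sample(c("additive", "multiplicative", "none"), 5, replace = TRUE),
  smooth_level = runif(5),
  stringsAsFactors = FALSE
)

# Assumed fallback definitions; two rows, as suggested for anova-style training.
fallbacks <- data.frame(
  error = c("additive", "additive"),
  trend = c("none", "additive"),
  season = c("none", "none"),
  smooth_level = c(0.1, 0.3),
  stringsAsFactors = FALSE
)

# Append the fallbacks and drop exact duplicates.
grid_with_fallbacks <- unique(rbind(candidates, fallbacks))
print(grid_with_fallbacks)
```

The combined data frame can then be passed on as the tuning grid, so every resample sees at least the fallback rows.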
I guess that this then relates to the third-party function, not to es(). If you have a specific command for es() that produces an error, I can look into it to make it more robust and return something.
Yes, it is a third-party problem. I think the calls are constructed the way we would make them manually, according to your documentation.
You know, some definitions may make no sense, depending on the individual data structure.
Since modeltime-ensemble does not handle failed model fits/builds while training, a failure will always break the code, regardless of the output of smooth (warnings and errors). This kind of stuff has to be handled by the user.
The text above describes two strategies to get around it while training. :-)
Unfortunately, we can only display pre-selected model definitions (print the training grid) and successful definitions (print the tune_grid() export), not explicitly failed definitions (you could do so in your smooth error messages; they will be routed to the screen via the verbosity settings of tune::tune_grid()).
The extract-function of tune::tune_grid() provides the model definitions after successful model-fits...
ret$res <- do.call(tune_grid,list(
  object = trainObj$modelObj # needed a name!
  ,preprocessor = trainSetupList$recipe_spec
  ,resamples = foldsEn
  ,grid = trainObj$finalParamsGrid
  ,param_info = trainObj$finalParams
  ,metrics = model_metrics
  ,control = control_grid(verbose=TRUE,save_pred = FALSE,extract = function (x) extract_fit_parsnip(x))
))
print(ret$res$.extracts)
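Since only successful fits show up in the extracts, failed definitions could be logged separately before tuning. The following base R sketch shows the idea with `tryCatch()`. The function `fit_stub` is a made-up stand-in that fails on purpose for one combination; in practice it would be replaced by a real model fit.

```r
# Stand-in for a model fit; fails deliberately for one combination so the
# logging can be demonstrated.
fit_stub <- function(error, trend) {
  if (error == "multiplicative" && trend == "additive") {
    stop("simulated fit failure")
  }
  paste(error, trend, sep = "_")
}

grid <- expand.grid(
  error = c("additive", "multiplicative"),
  trend = c("additive", "none"),
  stringsAsFactors = FALSE
)

# Try every definition and record status and error message per row.
grid$status <- NA_character_
grid$message <- NA_character_
for (i in seq_len(nrow(grid))) {
  res <- tryCatch(
    {
      fit_stub(grid$error[i], grid$trend[i])
      list(status = "ok", message = NA_character_)
    },
    error = function(e) list(status = "failed", message = conditionMessage(e))
  )
  grid$status[i] <- res$status
  grid$message[i] <- res$message
}

failed <- grid[grid$status == "failed", ]
print(failed)
```

The `failed` rows can be printed or dropped from the grid before it is handed to the tuning step.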
There is some more output I have to investigate, but some models make their way to the ensemble...
Some things I noticed (Part I):
check <- modelInfo %>% names %>% str_detect('damping') %>% any()
if(check==TRUE){
if(!is.character(modelInfo$damping)){
modelInfo <- modelInfo %>% dplyr::rename(smooth_damping=damping)
}
}
Trying to make a grid, that makes sense in regard to both, smooth and modeltime (currently without damping()) ...
` persistenceGrid <- grid_max_entropy(
  smooth_trend()
  ,smooth_level()
  ,smooth_seasonal()
  ,size = 10
)
smoothEtsGrid <- expand.grid(
  error=c("auto","additive","multiplicative")
  ,trend=c("auto","additive","multiplicative","additive_damped","multiplicative_damped","none")
  ,season=c("auto","additive","multiplicative","none")
)
finalGrid <- tidyr::crossing(persistenceGrid,smoothEtsGrid,.name_repair="minimal")
finalGrid <- finalGrid %>% dplyr::mutate(modelDef = paste0(error,'_',trend,'_',season))
finalGrid <- finalGrid %>% dplyr::mutate(smooth_level = case_when(
  error=='none' ~ as.numeric(NA)
  ,error=='auto' ~ as.numeric(NA)
  ,TRUE ~ smooth_level
))
finalGrid <- finalGrid %>% dplyr::mutate(smooth_trend = case_when(
  trend=='none' ~ as.numeric(NA)
  ,trend=='auto' ~ as.numeric(NA)
  ,TRUE ~ smooth_trend
))
finalGrid <- finalGrid %>% dplyr::mutate(smooth_seasonal = case_when(
  season=='none' ~ as.numeric(NA)
  ,season=='auto' ~ as.numeric(NA)
  ,TRUE ~ smooth_seasonal
))
print(class(finalGrid))
View(finalGrid)
stop() `
Update: damping_smooth()-fix works...but this is my first PR since the ending of WWII. I forgot too much GitHub stuff. ;-).... 'This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.' https://github.com/business-science/modeltime/commit/dcda853542da2d8509fee048b50e355e99748ecc
Update: Finally, pull request established: https://github.com/business-science/modeltime/pulls
Cool stuff! I don't do that often myself :).
Some things I noticed (Part II):
smoothEtsGrid <- expand.grid(
error=c("Z","A","M")
,trend=c("Z","A","M","Ad","Md","N")
,season=c("Z","A","M","N")
)
smoothEtsGrid <- smoothEtsGrid %>% dplyr::mutate(modelDef=paste0(error,trend,season))
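The two grids in this thread use different vocabularies: long names such as "additive" in one, letter codes such as "A" in the other. A small lookup can translate between them. The correspondence below is an assumption read off the parallel ordering of the two grids, and `to_model_def` is a hypothetical helper.

```r
# Assumed correspondence between long component names and letter codes.
ets_codes <- c(
  auto = "Z",
  additive = "A",
  multiplicative = "M",
  additive_damped = "Ad",
  multiplicative_damped = "Md",
  none = "N"
)

# Hypothetical helper: builds the model definition string from long names and
# stops on names that are not in the lookup.
to_model_def <- function(error, trend, season) {
  unknown <- setdiff(c(error, trend, season), names(ets_codes))
  if (length(unknown) > 0) {
    stop("Unknown component name: ", paste(unknown, collapse = ", "))
  }
  paste0(ets_codes[error], ets_codes[trend], ets_codes[season])
}

model_defs <- to_model_def(
  error = c("additive", "auto"),
  trend = c("additive_damped", "none"),
  season = c("none", "multiplicative")
)
print(model_defs)
```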
...model-test corresponding to individual data...
https://rpubs.com/JiahaoLin/639763
Note: With tidymodels workflows, the actual model can be deeply nested...
myModel<-myModel[['fit']][['fit']][['fit']][['models']][['model_1']]
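A long chain of `[[` fails with an unhelpful error when one level is missing. A sketch of a safer walk in base R is given here. The mock object only mimics the nesting quoted above, and `pluck_path` is a hypothetical helper (purrr::pluck() offers similar behaviour if purrr is loaded).

```r
# Mock object that mimics the nesting quoted above; a real workflow object
# holds fitted models rather than a string.
myModel <- list(fit = list(fit = list(fit = list(
  models = list(model_1 = "smooth model placeholder")
))))

# Hypothetical helper: walks a path of names and reports the missing level.
pluck_path <- function(x, path) {
  for (key in path) {
    if (!is.list(x) || is.null(x[[key]])) {
      stop("No element '", key, "' on the path")
    }
    x <- x[[key]]
  }
  x
}

inner <- pluck_path(myModel, c("fit", "fit", "fit", "models", "model_1"))
print(inner)
```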
I don't know if this is intended... If I say: 'no' to holdouts:
model_spec <- exp_smoothing() %>%
set_engine("smooth_es",holdout=FALSE)
... I still get this message from the smooth library: xreg did not contain values for the holdout, so we had to predict missing values.
To avoid specific 'smooth-xRegs-errors', some recipe steps of the tidyverse come in handy:
rec<-rec %>% step_lincomb(all_numeric(), -matches("date|value|id")) %>%
step_nzv(all_numeric(),-all_outcomes(),freq_cut = 70/30, unique_cut = 30) %>%
step_zv(all_predictors(),-matches("date|value|id")) %>%
step_corr(all_numeric(), -all_outcomes(), threshold = 0.95)
- There could be more interesting training targets than the smoothing parameters, for example loss function, sample size and distribution
_es() uses derivative-free optimisation (Nelder-Mead algorithm). It finds optimum by changing values of parameters in the specific range and minimising a loss function (by default it is negative likelihood). So, there's no need to try out all values. The result is typically fine, although occasionally it's not a proper global minimum. (Ivan Svetunkov)_
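The quoted explanation can be illustrated with base R. This is only a sketch: it uses stats::optim() with method = "Nelder-Mead" on a hand-written simple exponential smoothing loss (sum of squared one-step errors), not the likelihood-based loss or the optimiser setup inside es(). The function `ses_sse` and the simulated series are made up for illustration.

```r
# Sum of squared one-step-ahead errors of simple exponential smoothing.
# par[1] is the smoothing parameter alpha, par[2] the initial level.
ses_sse <- function(par, y) {
  alpha <- par[1]
  if (alpha <= 0 || alpha >= 1) return(1e12)  # penalty keeps alpha in (0, 1)
  level <- par[2]
  sse <- 0
  for (t in seq_along(y)) {
    err <- y[t] - level
    sse <- sse + err^2
    level <- level + alpha * err
  }
  sse
}

# Simulated series: a random walk observed with noise.
set.seed(42)
y <- 10 + cumsum(rnorm(100, sd = 0.5)) + rnorm(100)

# Derivative-free search; no grid over alpha is needed.
start <- c(0.5, y[1])
fit <- optim(par = start, fn = ses_sse, y = y, method = "Nelder-Mead")
print(fit$par)
```

The optimiser moves alpha and the initial level on its own, which is why grid-searching the smoothing parameters on top of it adds little.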
With this in mind, I switch to ADAM (for modeltime ensembles), using the experiences with smooth::es() :-).
https://business-science.github.io/modeltime/reference/adam_reg.html
R latest, smooth latest, modeltime latest
Hello,
is this something, I should be aware of?...
Error in initparams(Etype, Ttype, Stype, dataFreq, obsInSample, obsAll, : Not compatible with requested type: [type=character; target=double].