ecpolley / SuperLearner

Current version of the SuperLearner R package

SuperLearner also for Time Series? #120

Open RAP1989 opened 5 years ago

RAP1989 commented 5 years ago

Hi,

Since SuperLearner integrates with caret, I looked for a way to use the "timeslice" resampling method (for time series) instead of cross-validation. However, it seems that only CV is possible in SuperLearner, and I haven't found any way to change it. SuperLearner.control and SuperLearner.CV.control, for example, don't seem to offer this option.

Have I overlooked something? If the resampling method cannot be changed, are there any plans to implement timeslice in SuperLearner? Thanks.

ecpolley commented 5 years ago

Thanks, I'm not familiar with the timeslice method from caret. Does it create mutually exclusive blocks of observations (rows)? If so, you could use the validRows list within the SuperLearner.CV.control function to set the data splits.

RAP1989 commented 5 years ago

Thank you so much for the response. Unfortunately, the data splits are not mutually exclusive: as shown in https://topepo.github.io/caret/data-splitting.html#data-splitting-for-time-series, the indices for the training data overlap.
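To make the overlap concrete, here is a minimal base R sketch of rolling-origin splits in the spirit of caret's timeslice scheme. The helper `rolling_slices` and its arguments are hypothetical names made up for this illustration; it is not caret's implementation and not part of SuperLearner.

```r
# Hypothetical helper (not part of caret or SuperLearner): rolling-origin
# splits with a fixed-width training window followed by a test horizon.
rolling_slices <- function(n, window, horizon) {
  # Each split starts `horizon` rows after the previous one.
  starts <- seq(1, n - window - horizon + 1, by = horizon)
  list(
    train = lapply(starts, function(s) s:(s + window - 1)),
    test  = lapply(starts, function(s) (s + window):(s + window + horizon - 1))
  )
}

slices <- rolling_slices(n = 20, window = 10, horizon = 5)

slices$train  # rows 1:10, then rows 6:15
slices$test   # rows 11:15, then rows 16:20

# Consecutive training windows share rows 6:10, so they are not
# mutually exclusive, while the test blocks themselves are disjoint.
intersect(slices$train[[1]], slices$train[[2]])
```

The test blocks alone could serve as a validRows list, but, as far as I can tell from the package source, SuperLearner then trains on all remaining rows of each fold rather than only on the preceding window, so this would not reproduce the timeslice training sets.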

bm2609 commented 5 years ago

Hello ecpolley, I also want to create blocks of rows from a series, and I tried the SuperLearner.CV.control function. Let's assume my series has 250 entries (rows) and I want to create blocks from 1:50, 51:100, 101:150, 151:200, and 201:250. So I created a list with 5 entries, where each entry corresponds to one of the aforementioned sequences. However, after using CV.SuperLearner, I got NA as the se for every algorithm in the summary. Did I do something wrong?
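The block setup described here could look roughly like the sketch below. The data are simulated purely for illustration, and the learner library (`SL.mean`, `SL.glm`) is an arbitrary choice, not something taken from the thread. The SuperLearner call is guarded so the snippet still runs when the package is not installed.

```r
# Toy data: 250 time-ordered rows (simulated here purely for illustration).
set.seed(1)
n <- 250
X <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
Y <- X$x1 + rnorm(n)

# Five contiguous validation blocks: 1:50, 51:100, ..., 201:250.
valid_rows <- unname(split(seq_len(n), rep(1:5, each = 50)))

# Sanity checks: the blocks are mutually exclusive and cover every row.
stopifnot(!anyDuplicated(unlist(valid_rows)),
          setequal(unlist(valid_rows), seq_len(n)))

if (requireNamespace("SuperLearner", quietly = TRUE)) {
  library(SuperLearner)
  fit <- SuperLearner(
    Y = Y, X = X, family = gaussian(),
    SL.library = c("SL.mean", "SL.glm"),
    # V has to match the number of blocks supplied in validRows.
    cvControl = list(V = 5L, validRows = valid_rows)
  )
  print(fit)
}
```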

ecpolley commented 5 years ago

@bm2609 Does it work with SuperLearner?

CV.SuperLearner is more complicated, as it requires specification of the nested cross-validation folds.
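A rough sketch of what specifying the nested folds might involve, assuming the outer folds go in `cvControl` and one control list per outer fold goes in `innerCvControl`. It also assumes that the inner validRows index positions within each outer training set (200 rows here) rather than rows of the full data. That is my reading of how CV.SuperLearner passes data to the inner SuperLearner fits, so please check it against the package documentation. The simulated data and the learner library are illustrative only.

```r
n <- 250

# Outer validation blocks: 1:50, 51:100, ..., 201:250.
outer_valid <- unname(split(seq_len(n), rep(1:5, each = 50)))

# Each outer training set keeps 200 rows. The inner blocks index
# positions 1..200 within that training set (assumption, see lead-in).
n_train <- n - 50
inner_valid <- unname(split(seq_len(n_train), rep(1:4, each = 50)))

# One inner control list per outer fold.
inner_controls <- rep(list(list(V = 4L, validRows = inner_valid)), 5)

if (requireNamespace("SuperLearner", quietly = TRUE)) {
  library(SuperLearner)
  set.seed(1)
  X <- data.frame(x1 = rnorm(n), x2 = rnorm(n))  # simulated toy data
  Y <- X$x1 + rnorm(n)
  cv_fit <- CV.SuperLearner(
    Y = Y, X = X, family = gaussian(),
    SL.library = c("SL.mean", "SL.glm"),
    cvControl = list(V = 5L, validRows = outer_valid),
    innerCvControl = inner_controls
  )
  print(summary(cv_fit))
}
```

Because the inner blocks are positions within the training subset, inner block 1 refers to different original rows in each outer fold. For example, in outer fold 1 it corresponds to original rows 51:100.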

bm2609 commented 5 years ago

Yes, it works with SuperLearner. Thank you!