Open geraldine28 opened 2 years ago
You can define your target variable (y) and predictors (x) separately.
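library(caret)

# Keep dat as a data frame: `$` does not work on a matrix.
dat <- as.data.frame(dat)
# Negative index: drop column 3 (assumed here to be y) from the predictors.
drop.col <- -c(3)
model <- train(
  y = dat$y,
  x = dat[, drop.col],
  # No `data =` argument: the x/y interface takes the data directly.
  tuneLength = 5,
  method = "ranger",
  trControl = trainControl(
    method = "timeslice",
    initialWindow = 5,
    horizon = 2,
    allowParallel = TRUE,
    verboseIter = TRUE,
    seeds = NULL
  ),
  metric = "RMSE"
)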
NOTE: id needs to be encoded as a factor, e.g. with factor().
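For illustration, here is a minimal sketch of that encoding step in base R. The toy panel and its column names (id, year, y) are made up for the example and are not the poster's data.

```r
# Hypothetical toy panel: 3 units observed over 4 years (made-up values).
dat <- data.frame(
  id   = rep(c(101, 102, 103), each = 4),
  year = rep(2001:2004, times = 3),
  y    = c(1.2, 1.4, 1.1, 1.6, 2.0, 2.3, 2.1, 2.5, 0.7, 0.9, 0.8, 1.0)
)

# A numeric id would be treated as a quantity; a factor marks it as a label.
dat$id <- factor(dat$id)

str(dat$id)
nlevels(dat$id)
```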
I am new to caret and have a beginner's question regarding the 'timeslice' resampling method in caret's 'train' function.
I originally have a balanced panel data set with 22 years and 37,442 unique cross-sectional units. Here is an example data set illustrating the structure of the data:
I tried to use 'train' to run a simple random forest model on the data with a fixed time window of 5 years and a horizon of 2 years:
However, this gives the following error:
I presume this error occurs because the data is not a single time series but a longitudinal (panel) data set. How can this be handled with 'timeslice'?
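One possible way to approach this, sketched under assumptions: build the rolling windows over the unique years rather than over row positions, then expand each window to the rows that fall in those years, so that all units observed in a given year move together. The helper `panel_time_slices()` below is hypothetical (it is not part of caret), the toy panel is made up, and the column name `year` is assumed.

```r
# Hypothetical helper (not part of caret): rolling-origin slices over years,
# expanded to row indices so every unit observed in a year moves together.
panel_time_slices <- function(years, initial_window, horizon) {
  periods  <- sort(unique(years))
  n_slices <- length(periods) - initial_window - horizon + 1
  stopifnot(n_slices >= 1)

  index     <- vector("list", n_slices)
  index_out <- vector("list", n_slices)
  for (s in seq_len(n_slices)) {
    # Fixed-length training window followed directly by the test window.
    train_years <- periods[s:(s + initial_window - 1)]
    test_years  <- periods[(s + initial_window):(s + initial_window + horizon - 1)]
    index[[s]]     <- which(years %in% train_years)
    index_out[[s]] <- which(years %in% test_years)
  }
  names(index) <- names(index_out) <- paste0("Slice", seq_len(n_slices))
  list(index = index, indexOut = index_out)
}

# Toy balanced panel: 3 units x 9 years (made-up data).
toy <- expand.grid(year = 2001:2009, id = factor(1:3))
slices <- panel_time_slices(toy$year, initial_window = 5, horizon = 2)

length(slices$index)                          # number of slices
sort(unique(toy$year[slices$index[[1]]]))     # training years of slice 1
sort(unique(toy$year[slices$indexOut[[1]]]))  # test years of slice 1
```

trainControl() has `index` and `indexOut` arguments for user-defined resampling rows, so lists like these could presumably be supplied there (`index = slices$index, indexOut = slices$indexOut`) instead of relying on `method = "timeslice"` alone. I have not run this against the data described above, so treat it as a starting point rather than a confirmed fix.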