rstudio / keras3

R Interface to Keras
https://keras3.posit.co/

Question: Neural Networks for time-series regression #950

Open AlexSiormpas opened 4 years ago

AlexSiormpas commented 4 years ago

I'm learning Keras in R and want to test a NN regression model on a multivariate time-series data set. The goal is to predict Y as a function of its own lags and of all the other X variables and their corresponding lags. The dataset has ~250 variables and ~4000 daily observations in the following format:

Y   Y-1day  Y-2days .. Y-50days   X1-1day .. X1-2days ...  X5-50days
...................................................................
56  49      42         98         49      .. 134      ...  345
78  56      49         102        67      .. 155      ...  497
89  78      56         134        88      .. 161      ...  412
...................................................................

As a first step, I split the dataset into a Predictors matrix (Y-1day ... X5-50days) and a Predict matrix (only Y). Then I min-max normalize the Predictors to the [0, 1] range. Finally, I pass the two matrices to the Keras model with the code below:
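That preprocessing step roughly looks like this (a minimal sketch; df is an illustrative name for the raw data frame, assumed to have Y in its first column):

#Split into target and features
Predict    <- as.matrix(df[, 1, drop = FALSE])  # Y only
Predictors <- as.matrix(df[, -1])               # all lagged columns

#Column-wise min-max scaling to [0, 1]
mins   <- apply(Predictors, 2, min)
ranges <- apply(Predictors, 2, max) - mins
Predictors <- sweep(sweep(Predictors, 2, mins, "-"), 2, ranges, "/")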

library(keras)

#Create the model
build_model <- function() {
    model <- keras_model_sequential() %>%
        layer_dense(units = 5, activation = "relu", input_shape = dim(Predictors)[2]) %>%
        layer_dense(units = 5, activation = "relu") %>%
        layer_dense(units = 1)
    model %>% compile(
        loss = "mean_absolute_percentage_error",
        optimizer = optimizer_rmsprop(),
        metrics = list("mean_absolute_percentage_error")
    )
    model  # return the compiled model
}
model <- build_model()
model %>% summary()

#Fit and train the model
print_dot_callback <- callback_lambda(
    on_epoch_end = function(epoch, logs) {
        if (epoch %% 80 == 0) cat("\n")  # line break every 80 epochs
        cat(".")
    }
)

epochs <- 10000
early_stop <- callback_early_stopping(monitor = "val_loss", patience = 500)
model <- build_model()
history <- model %>% fit(
    Predictors,
    Predict,
    epochs = epochs,
    validation_split = 0.3,
    verbose = 0,
    callbacks = list(early_stop, print_dot_callback))
#plot() on a training history returns a ggplot object, so ggplot2 must be loaded for coord_cartesian()
library(ggplot2)
plot(history, metrics = "mean_absolute_percentage_error", smooth = FALSE) +
    coord_cartesian(xlim = c(0, 10000), ylim = c(0, 100))

After training is done, the last code line plots the MAPE for training and validation, as shown below. We can see that the validation MAPE decreased to ~1.3%, which looks great. However, I am a bit skeptical, as I don't see the validation loss increase despite the high number of epochs. Does this indicate a hidden error or an overlooked detail in my implementation? How should I interpret this?

[plot: training and validation mean_absolute_percentage_error over epochs]
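For reference, one way to sanity-check a MAPE that low is to compare it against a naive persistence forecast; a minimal sketch, assuming the unscaled data frame df (an illustrative name, not from the code above) has columns Y and Y-1day:

#Naive persistence baseline: predict today's Y with yesterday's Y
#(df, Y and Y-1day are illustrative names for the raw data)
naive_mape <- mean(abs((df$Y - df[["Y-1day"]]) / df$Y)) * 100
naive_mape  # the network's ~1.3% is only impressive if it beats this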

nba2020 commented 4 years ago

Would any more clarifying info help? :)

atroiano commented 4 years ago

Not sure I totally understand the dataset; if you can provide an example, I might be able to give you more guidance.

That being said, if your observations contain features that also appear in other observations (as lagged values do here), you probably want to train without shuffling; otherwise you are leaking information across the train/validation split. A chronological split avoids this.
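A minimal sketch of such a split, reusing the Predictors/Predict names from the post above (rows are assumed to already be in time order):

#Chronological 70/30 split: validate only on the most recent rows
n_train <- floor(0.7 * nrow(Predictors))
x_train <- Predictors[1:n_train, ]
y_train <- Predict[1:n_train, ]
x_val   <- Predictors[(n_train + 1):nrow(Predictors), ]
y_val   <- Predict[(n_train + 1):nrow(Predictors), ]

history <- model %>% fit(
    x_train, y_train,
    epochs = epochs,
    shuffle = FALSE,  # preserve time order within training
    validation_data = list(x_val, y_val),
    verbose = 0,
    callbacks = list(early_stop, print_dot_callback))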

Ideally, you'd want to do the final evaluation on a third set of data that is not used in training or validation (see the sketch below). If that still looks good, test it against production data and see how it performs on truly new data (if that's an option).
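A sketch of that final hold-out evaluation, assuming x_test/y_test (illustrative names) are the most recent rows, set aside before any training:

#Final check on data the model has never seen
model %>% evaluate(x_test, y_test, verbose = 0)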