Closed swaheera closed 3 years ago
Hello swaheera,
thank you very much for your feedback and your interest into the echos package.
I took a closer look at your code and made a few adjustments. In some places I wasn't sure what exactly you want to do, so I just ran through the code the way I think it should be done.
Notes to your case:
And here's the adjusted code:
# Load packages
library(echos)
library(tscv)
library(dplyr)
library(tsibble)
library(fabletools)
library(fable)
Sys.setlocale("LC_TIME", "C")
# Create artificial daily time series data
date <- seq(
from = as.Date("1800-01-01"),
to = as.Date("2020-01-01"),
by = "day")
damage <- rnorm(
n = 80354,
mean = 100,
sd = 10)
# Prepare daily time series data as tsibble
data <- tsibble(
date = date,
damage = damage,
index = date) %>%
pivot_longer(
cols = damage,
names_to = "Variable",
values_to = "Value")
# Aggregate daily to monthly time series data
data <- data %>%
group_by_key() %>%
index_by(year_month = ~ yearmonth(.)) %>%
summarise(Value = sum(Value, na.rm = TRUE)) %>%
rename(Date = year_month) %>%
select(Date, Variable, Value)
data
# Example proceeds with monthly time series data
# Setup for time series cross validation
n_init <- 200 # size for training window
n_ahead <- 12 # size for testing window (i.e. 12 month ahead forecast)
mode <- "slide" # fixed window approach (for such a long time series this is fine)
n_skip <- 11 # skip 11 month between windows
n_lag <- 0 # no lag
data <- data %>%
tscv::split_data(
n_init = n_init,
n_ahead = n_ahead,
mode = mode,
n_skip = n_skip,
n_lag = n_lag)
# For this example, only the first split is considered
data <- data %>%
filter(split == 1)
models <- data %>%
filter(sample == "train") %>% # only training data should be used for model training
model(
"ESN" = ESN(
Value,
inf_crit = "bic", # here's the error: the argument changed from upper case to lower case ("BIC" -> "bic") and forgot to adjust the README
max_lag = 3,
n_initial = 50,
n_res = 200,
scale_inputs = c(-1, 1)),
"Naive" = NAIVE(Value))
# Detailed report of ESN
models %>%
select(ESN) %>%
report()
# Forecast models
fcsts <- models %>%
forecast(h = n_ahead)
fcsts
# Visualize the forecast and the actual values
actuals <- data %>%
filter_index("1816 Jan" ~ .)
fcsts %>%
autoplot(
actuals,
level = NULL,
size = 1)
I hope I could help you a little bit with the code above.
Disclaimer: The echos package is highly experimental and it is very likely that there will be (substantial) changes in the near future. These changes will probably affect the interface (e.g. arguments within ESN()) and the underlying modeling procedure. Just as a friendly warning ;)
I wish you a Merry Christmas and a Happy New Year!
Best, Alex
Sorry, the issue was closed. I reopened it if you have any further questions.
Best, Alex
Hallo Dr. Haeusser! Danke für Ihre Antwort!
It is very kind of you to reply, thank you. Your new code runs perfectly! I would suggest that you add "library(tidyverse)", so that the pivot_longer() function can be loaded.
Just to clarify, in this picture above: the red line corresponding to the predictions made by the ESN ....the ESN has not "seen" data slightly after July 1816 - July 1817, is this correct?
How has your experience been using neural networks (e.g. echo state networks) for forecasting (univariate) time series? In reality, I am dealing with a highly chaotic and volatile time series. I took the natural logarithm of my time series : the predictions look good , but suppose the actual value of a future observation is 17 and the predicted value is 16 .... (e^16)/(e^17) is a lot more different than 16/17. I am hoping that the echo state network might be able to improve the quality of the prediction.
Do you have any idea, how many future points the echo state network is meant to forecast? I find with ARIMA, after 2-3 points the quality of prediction significantly drops.
In the future, if you have time, it would be very interesting if a function could be written that "trains" the echo state network you created. This way, different combinations of max_lag and n_res can be tried simultaneously.
I can not thank you enough for all your help!
Merry Christmas and a Happy New Year!
Over here, you can see another version of Echo State Networks for (univariate) Time Series written in base R: https://mantas.info/code/simple_esn/
# A minimalistic Echo State Networks demo with Mackey-Glass (delay 17) data
# in "plain" R.
# by Mantas Lukosevicius 2012-2018
# http://mantas.info
myfile <- read.table(url("https://mantas.info/wp/wp-content/uploads/simple_esn/MackeyGlass_t17.txt"))
# load the data
trainLen = 2000
testLen = 2000
initLen = 100
data = as.matrix(myfile)
# plot some of it
while( dev.cur() != 1 ) dev.off() # close all previous plots
dev.new()
plot(data[1:1000],type='l')
title(main='A sample of data')
# generate the ESN reservoir
inSize = outSize = 1
resSize = 1000
a = 0.3 # leaking rate
set.seed(42)
Win = matrix(runif(resSize*(1+inSize),-0.5,0.5),resSize)
W = matrix(runif(resSize*resSize,-0.5,0.5),resSize)
# normalizing and setting spectral radius
cat('Computing spectral radius...')
rhoW = abs(eigen(W,only.values=TRUE)$values[1])
print('done.')
W = W * 1.25 / rhoW
# allocated memory for the design (collected states) matrix
X = matrix(0,1+inSize+resSize,trainLen-initLen)
# set the corresponding target matrix directly
Yt = matrix(data[(initLen+2):(trainLen+1)],1)
# run the reservoir with the data and collect X
x = rep(0,resSize)
for (t in 1:trainLen){
u = data[t]
x = (1-a)*x + a*tanh( Win %*% rbind(1,u) + W %*% x )
if (t > initLen)
X[,t-initLen] = rbind(1,u,x)
}
# train the output
reg = 1e-8 # regularization coefficient
X_T = t(X)
Wout = Yt %*% X_T %*% solve( X %*% X_T + reg*diag(1+inSize+resSize) )
# run the trained ESN in a generative mode. no need to initialize here,
# because x is initialized with training data and we continue from there.
Y = matrix(0,outSize,testLen)
u = data[trainLen+1]
for (t in 1:testLen){
x = (1-a)*x + a*tanh( Win %*% rbind(1,u) + W %*% x )
y = Wout %*% rbind(1,u,x)
Y[,t] = y
# generative mode:
u = y
# this would be a predictive mode:
#u = data[trainLen+t+1]
}
# compute MSE for the first errorLen time steps
errorLen = 500
mse = ( sum( (data[(trainLen+2):(trainLen+errorLen+1)] - Y[1,1:errorLen])^2 )
/ errorLen )
print( paste( 'MSE = ', mse ) )
# plot some signals
dev.new()
plot( data[(trainLen+1):(trainLen+testLen+1)], type='l', col='green' )
lines( c(Y), col='blue' )
title(main=expression(paste('Target and generated signals ', bold(y)(italic(n)),
' starting at ', italic(n)==0 )))
legend('bottomleft',legend=c('Target signal', 'Free-running predicted signal'),
col=c('green','blue'), lty=1, bty='n' )
dev.new()
matplot( t(X[(1:20),(1:200)]), type='l' )
title(main=expression(paste('Some reservoir activations ', bold(x)(italic(n)))))
dev.new()
barplot( Wout )
title(main=expression(paste('Output weights ', bold(W)^{out})))
Hello, great job creating this package for echo state networks in R!
I an trying to use your code for a univariate time series I generated and I am running into some problems relating to "missing values not supported". Could you please take a look at this if you have time?
Here is the error:
I tried doing as the error message suggested:
But the same error happens. This error is preventing me from running the rest of the code:
Could you please take a look at this if you have time? Perhaps I am doing something wrong and you could kindly show me how to fix this problem?
Thanks you so much!