ankkhedia opened this issue 5 years ago
Have you tried running TimeSeries_CPU.Rmd from the docs_make folder?
There's another graph definition at line 110 in the above file which sets seq_length to match the data, rather than the seq_length of 2 shown in the html. The reason was to have a reasonably sized graph to display, but the html failed to make that clear. Let me know if this was the root issue and I'll update the doc.
Since symbol.RNN now supports CPU, it might be worth updating that part; it would make the demo simpler.
@ankkhedia The html has just been updated to show all the code and provide a little more guidance on the different steps.
@jeremiedb Thanks a lot for fixing the html. Looks good now.
@jeremiedb I am trying to build a multivariate LSTM example following the CPU time series example. The training phase works fine. However, when trying to run inference, I get the following error: Error in symbol$infer.shape(c(input.shape, init_states_shapes)) : Error in operator loss: Shape inconsistent, Provided=[300], inferred shape=[100,1]
Please find my code below.
library("readr")
library("dplyr")
library("plotly")
library("mxnet")
library("abind")
# number of timestamps
seq_len = 100
# number of samples
n = 500
# return a random starting point of the time series
set.seed(12)
seeds <- runif(n, min = 0, max = 24)
# generate the time series of seq_length for each starting point
pts <- sapply(seeds, function(x) sin(x + pi/12 * (0:(seq_len))))
# build the features matrix
x <- pts[1:seq_len, ]
x <- matrix(x, nrow = seq_len)
x1 <- 10 * x
x2 <- 100 * x
x <- abind(x, x1, x2, along = 0)
# build the target array - same as feature but one timestep forward
y <- pts[-1, ]
batch.size = 32
# take first 400 samples for train - remaining 100 for evaluation
train_ids <- 1:400
train.data <- mx.io.arrayiter(data = x[,,train_ids, drop = F], label = y[, train_ids],
batch.size = batch.size, shuffle = TRUE)
eval.data <- mx.io.arrayiter(data = x[,,-train_ids, drop = F], label = y[, -train_ids],
batch.size = batch.size, shuffle = FALSE)
symbol <- rnn.graph.unroll(seq_len = 100,
num_rnn_layer = 1,
num_hidden = 50,
input_size = NULL,
num_embed = NULL,
num_decode = 1,
masking = F,
loss_output = "linear",
dropout = 0.2,
ignore_label = -1,
cell_type = "lstm",
output_last_state = F,
config = "one-to-one")
mx.metric.mse.seq <- mx.metric.custom("MSE", function(label, pred) {
label = mx.nd.reshape(label, shape = -1)
label = as.array(label)
pred = as.array(pred)
res <- mean((label-pred)^2)
return(res)
})
ctx <- mx.cpu()
initializer <- mx.init.Xavier(rnd_type = "gaussian",
factor_type = "avg",
magnitude = 2.5)
optimizer <- mx.opt.create("adadelta", rho = 0.9, eps = 1e-5, wd = 0,
clip_gradient = 1, rescale.grad = 1/batch.size)
logger <- mx.metric.logger()
epoch.end.callback <- mx.callback.log.train.metric(period = 10, logger = logger)
system.time(
model <- mx.model.buckets(symbol = symbol,
train.data = train.data,
eval.data = eval.data,
num.round = 10, ctx = ctx, verbose = TRUE,
metric = mx.metric.mse.seq,
initializer = initializer, optimizer = optimizer,
batch.end.callback = NULL,
epoch.end.callback = epoch.end.callback)
)
ctx <- mx.cpu()
## inference
pred_length = 80
data = mx.nd.array(x[, , 1, drop = F])
infer_length_ini <- dim(data)[2]
symbol.infer.ini <- rnn.graph.unroll(seq_len = infer_length_ini,
num_rnn_layer = 1,
num_hidden = 50,
input_size = NULL,
num_embed = NULL,
num_decode = 1,
masking = F,
loss_output = "linear",
dropout = 0,
ignore_label = -1,
cell_type = "lstm",
output_last_state = T,
config = "one-to-one")
symbol.infer <- rnn.graph.unroll(seq_len = 1,
num_rnn_layer = 1,
num_hidden = 50,
input_size = NULL,
num_embed = NULL,
num_decode = 1,
masking = F,
loss_output = "linear",
dropout = 0,
ignore_label = -1,
cell_type = "lstm",
output_last_state = T,
config = "one-to-one")
predict <- numeric()
infer <- mx.infer.rnn.one.unroll(infer.data = data,
symbol = symbol.infer.ini,
num_hidden = 50,
arg.params = model$arg.params,
aux.params = model$aux.params,
init_states = NULL,
ctx = ctx)
pred = mx.nd.array(y[seq_len, 1, drop = F])
real = sin(seeds[1] + pi/12 * (seq_len+1):(seq_len+pred_length))
for (i in 1:pred_length) {
data = mx.nd.reshape(pred, shape = c(1,1,1))
infer <- mx.infer.rnn.one.unroll(infer.data = data,
symbol = symbol.infer,
num_hidden = 50,
arg.params = model$arg.params,
aux.params = model$aux.params,
#init_states = infer[-1],
init_states = NULL,
ctx = ctx)
pred <- infer[[1]]
predict <- c(predict, as.numeric(as.array(pred)))
}
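As a quick sanity check on the data construction above, here is a base-R-only sketch (no mxnet or abind needed; abind(..., along = 0) is reproduced with array() plus aperm()) that confirms the shapes the script relies on. The names seq_len, n, pts, and x are the ones from the code above:

```r
# Base-R sanity check of the array shapes built in the script above.
seq_len <- 100
n <- 500
set.seed(12)
seeds <- runif(n, min = 0, max = 24)
# each seed yields seq_len + 1 points, so pts is (seq_len + 1) x n
pts <- sapply(seeds, function(x) sin(x + pi / 12 * (0:seq_len)))
stopifnot(identical(dim(pts), c(101L, 500L)))
x <- pts[1:seq_len, ]
# stack the three scaled copies along a new first axis,
# equivalent to abind(x, 10 * x, 100 * x, along = 0)
x_all <- aperm(array(c(x, 10 * x, 100 * x), dim = c(seq_len, n, 3)),
               c(3, 1, 2))
# features x seq_len x samples: the layout fed to mx.io.arrayiter above
stopifnot(identical(dim(x_all), c(3L, 100L, 500L)))
stopifnot(all(x_all[2, , ] == 10 * x))
```

If these assertions pass, the data side is consistent, which points at the graph/label shape inference rather than the array construction.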
The error occurs at this call:
infer <- mx.infer.rnn.one.unroll(infer.data = data,
symbol = symbol.infer.ini,
num_hidden = 50,
arg.params = model$arg.params,
aux.params = model$aux.params,
init_states = NULL,
ctx = ctx)
Could you help me figure out what is going wrong or what is being missed? I'd appreciate your help. The code closely follows the univariate time series tutorial.
@jeremiedb I also tried the GPU time series example. For the univariate case it doesn't work as-is, but it works after setting the metric to NULL.
When I tried to extend it to a multi-dimensional time series, it crashed the R kernel; I'm not sure what is going wrong.
@ankkhedia I just pushed the fix. The issue was with the custom metric, which hadn't been updated to handle nd.array rather than R array inputs. Thanks for pointing it out! I will review the other tutorials this weekend; there might be remaining issues with the custom eval metrics elsewhere.
@jeremiedb Thanks for the quick turnaround. I would really appreciate it if you could help with the multivariate example I posted above.
Looks like the shape inference from the unrolled graph is shaky. I'll need some time to dig further. Have you tried the symbol.RNN approach (as used in the GPU example)? It should be easier to get it working.
Will it work on CPU too? The spec mentions that it is only for CUDA.
CPU appears fully supported by symbol.RNN since 1.3.1 (prior to 1.3.1, dropout wasn't supported).
I am trying to use the time series example from https://jeremiedb.github.io/mxnet_R_bucketing/TimeSeries_CPU with the latest MXNetR.
However, when I follow the tutorial, it fails.
The error message is:
Error in mxnet:::mx.symbol.bind(symbol = symbol, arg.arrays = c(dlist, : executor.cc:124: RCheck failed: Rcpp::is(source_array[i]) Expect input arg_arrays to be list of MXNDArray
Could you help me with how to use this API and what may be going wrong?