jeremiedb / mxnet_R_bucketing

R model API to support bucketing and masking

Using time series example - CPU #5

Open ankkhedia opened 5 years ago

ankkhedia commented 5 years ago

I am trying to use time series example from https://jeremiedb.github.io/mxnet_R_bucketing/TimeSeries_CPU with latest MXNetR

However, when I follow the tutorial, it fails at:

infer <- mx.infer.rnn.one.unroll(infer.data = data,
                                 symbol = symbol.infer.ini,
                                 num_hidden = 50,
                                 arg.params = model$arg.params,
                                 aux.params = model$aux.params,
                                 init_states = NULL,
                                 ctx = ctx)

The error message is:

Error in mxnet:::mx.symbol.bind(symbol = symbol, arg.arrays = c(dlist, : executor.cc:124: RCheck failed: Rcpp::is(source_array[i]) Expect input arg_arrays to be list of MXNDArray

Could you help me with how to use this API and what may be going wrong?

jeremiedb commented 5 years ago

Have you tried running TimeSeries_CPU.Rmd from the docs_make folder? There's another graph definition at line 110 in that file which defines a seq_length that matches the data, rather than the seq_length of 2 shown in the html. The reason was to have a reasonably sized graph to display, but the html failed to make that clear. Let me know if this was the root issue and I'll update the doc.
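
Roughly, the contrast is the following (a sketch using the arguments that appear later in this thread; see TimeSeries_CPU.Rmd for the exact calls):

# graph rendered in the html: seq_len = 2, kept tiny so the plot stays readable
symbol_plot <- rnn.graph.unroll(seq_len = 2, num_rnn_layer = 1, num_hidden = 50,
                                input_size = NULL, num_embed = NULL, num_decode = 1,
                                masking = F, loss_output = "linear", dropout = 0.2,
                                ignore_label = -1, cell_type = "lstm",
                                output_last_state = F, config = "one-to-one")

# graph to actually train on: seq_len must match the 100 timesteps in the data
symbol <- rnn.graph.unroll(seq_len = 100, num_rnn_layer = 1, num_hidden = 50,
                           input_size = NULL, num_embed = NULL, num_decode = 1,
                           masking = F, loss_output = "linear", dropout = 0.2,
                           ignore_label = -1, cell_type = "lstm",
                           output_last_state = F, config = "one-to-one")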

Since symbol.RNN now supports CPU, it might be worth updating that part; it would make the demo simpler.

jeremiedb commented 5 years ago

@ankkhedia The html has just been updated to show all the code and provide a little more guidance on the different steps.

ankkhedia commented 5 years ago

@jeremiedb Thanks a lot for fixing the html. Looks good now.

ankkhedia commented 5 years ago

@jeremiedb I am trying to work on a multivariate LSTM example following the CPU time series example. The training phase works fine. However, when trying to run inference, I get the following error:

Error in symbol$infer.shape(c(input.shape, init_states_shapes)) : Error in operator loss: Shape inconsistent, Provided=[300], inferred shape=[100,1]

Please find my code below.


library("readr")
library("dplyr")
library("plotly")
library("mxnet")
library("abind")
# number of timestamps
seq_len = 100

# number of samples
n = 500

# return a random starting point of the time series
set.seed(12)
seeds <- runif(n, min = 0, max = 24)

# generate the time series of seq_length for each starting point
pts <- sapply(seeds, function(x) sin(x + pi/12 * (0:(seq_len))))

# build the features matrix
x <- pts[1:seq_len, ]
x <- matrix(x, nrow = 100)
# scaled copies of the series serve as two extra input features
x1 <- 10 * x
x2 <- 100 * x
# stack into a 3 x seq_len x samples array
x <- abind(x, x1, x2, along = 0)
# build the target array - same as feature but one timestep forward
y <- pts[-1, ]

batch.size = 32

# take first 400 samples for train - remaining 100 for evaluation
train_ids <- 1:400

train.data <- mx.io.arrayiter(data = x[,,train_ids, drop = F], label = y[, train_ids],
                              batch.size = batch.size, shuffle = TRUE)

eval.data <- mx.io.arrayiter(data = x[,,-train_ids, drop = F], label = y[, -train_ids],
                              batch.size = batch.size, shuffle = FALSE)

symbol <- rnn.graph.unroll(seq_len = 100,
                           num_rnn_layer =  1,
                           num_hidden = 50,
                           input_size = NULL,
                           num_embed = NULL,
                           num_decode = 1,
                           masking = F,
                           loss_output = "linear",
                           dropout = 0.2,
                           ignore_label = -1,
                           cell_type = "lstm",
                           output_last_state = F,
                           config = "one-to-one")
mx.metric.mse.seq <- mx.metric.custom("MSE", function(label, pred) {
  label = mx.nd.reshape(label, shape = -1)
  label = as.array(label)
  pred = as.array(pred)
  res <- mean((label-pred)^2)
  return(res)
})

ctx <- mx.cpu()

initializer <- mx.init.Xavier(rnd_type = "gaussian",
                              factor_type = "avg",
                              magnitude = 2.5)

optimizer <- mx.opt.create("adadelta", rho = 0.9, eps = 1e-5, wd = 0,
                           clip_gradient = 1, rescale.grad = 1/batch.size)

logger <- mx.metric.logger()
epoch.end.callback <- mx.callback.log.train.metric(period = 10, logger = logger)

system.time(
  model <- mx.model.buckets(symbol = symbol,
                            train.data = train.data,
                            eval.data = eval.data,
                            num.round = 10, ctx = ctx, verbose = TRUE,
                            metric = mx.metric.mse.seq,
                            initializer = initializer, optimizer = optimizer,
                            batch.end.callback = NULL,
                            epoch.end.callback = epoch.end.callback)
)

ctx <- mx.cpu()

## inference
pred_length = 80
data = mx.nd.array(x[, , 1, drop = F])
infer_length_ini <- dim(data)[2]

symbol.infer.ini <- rnn.graph.unroll(seq_len = infer_length_ini,
                                     num_rnn_layer = 1,
                                     num_hidden = 50,
                                     input_size = NULL,
                                     num_embed = NULL,
                                     num_decode = 1,
                                     masking = F,
                                     loss_output = "linear",
                                     dropout = 0,
                                     ignore_label = -1,
                                     cell_type = "lstm",
                                     output_last_state = T,
                                     config = "one-to-one")

symbol.infer <- rnn.graph.unroll(seq_len = 1,
                                 num_rnn_layer = 1,
                                 num_hidden = 50,
                                 input_size = NULL,
                                 num_embed = NULL,
                                 num_decode = 1,
                                 masking = F,
                                 loss_output = "linear",
                                 dropout = 0,
                                 ignore_label = -1,
                                 cell_type = "lstm",
                                 output_last_state = T,
                                 config = "one-to-one")
predict <- numeric()

infer <- mx.infer.rnn.one.unroll(infer.data = data,
                                 symbol = symbol.infer.ini,
                                 num_hidden = 50,
                                 arg.params = model$arg.params,
                                 aux.params = model$aux.params,
                                 init_states = NULL,
                                 ctx = ctx)

pred = mx.nd.array(y[seq_len, 1, drop = F])
real = sin(seeds[1] + pi/12 * (seq_len+1):(seq_len+pred_length))

for (i in 1:pred_length) {

  data = mx.nd.reshape(pred, shape = c(1,1,1))

  infer <- mx.infer.rnn.one.unroll(infer.data = data,
                                   symbol = symbol.infer,
                                   num_hidden = 50,
                                   arg.params = model$arg.params,
                                   aux.params = model$aux.params,
                                   #init_states = infer[-1],
                                   init_states = NULL,
                                   ctx = ctx)

  pred <- infer[[1]]
  predict <- c(predict, as.numeric(as.array(pred)))
}

I hit the error at this call:

infer <- mx.infer.rnn.one.unroll(infer.data = data,
                                 symbol = symbol.infer.ini,
                                 num_hidden = 50,
                                 arg.params = model$arg.params,
                                 aux.params = model$aux.params,
                                 init_states = NULL,
                                 ctx = ctx)

Could you help me figure out what is going wrong or what is being missed? I'd appreciate your help. The code is almost identical to the univariate time series tutorial.

ankkhedia commented 5 years ago

@jeremiedb I also tried the GPU time series example. For the univariate time series it doesn't work as is, but works after setting the metric to NULL.

When I tried to extend it to a multi-dimensional time series, it crashed the R kernel; I'm not sure what is going wrong.
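
For reference, the univariate workaround was simply dropping the custom metric from the training call shown in the tutorial:

model <- mx.model.buckets(symbol = symbol,
                          train.data = train.data,
                          eval.data = eval.data,
                          num.round = 10, ctx = ctx, verbose = TRUE,
                          metric = NULL,  # bypass the custom MSE metric
                          initializer = initializer, optimizer = optimizer,
                          batch.end.callback = NULL,
                          epoch.end.callback = epoch.end.callback)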

jeremiedb commented 5 years ago

@ankkhedia I just pushed the fix. The issue was with the custom metric, which hadn't been updated to handle an nd.array rather than an R array. Thanks for pointing it out! I will review the other tutorials this weekend; there might be remaining issues with the custom eval metrics elsewhere.
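
For anyone landing here, the fix is along these lines (a sketch, not the exact commit): label and pred now arrive as NDArrays, so both need to be pulled back into R arrays before computing the MSE:

mx.metric.mse.seq <- mx.metric.custom("MSE", function(label, pred) {
  # label and pred are NDArrays here; flatten and convert to R arrays
  label <- as.array(mx.nd.reshape(label, shape = -1))
  pred <- as.array(mx.nd.reshape(pred, shape = -1))
  mean((label - pred)^2)
})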

ankkhedia commented 5 years ago

@jeremiedb Thanks for the quick turnaround. I would really appreciate it if you could help me with the multivariate example posted above.

jeremiedb commented 5 years ago

Looks like the shape inference on the unrolled graph is shaky. I'll need some time to dig further. Have you tried the symbol.RNN approach (as used in the GPU example)? It should be easier to get it working.
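
For the fused approach, the graph definition would look roughly like this (a sketch assuming rnn.graph takes the same arguments as rnn.graph.unroll minus seq_len; see the GPU tutorial for the exact call):

symbol <- rnn.graph(num_rnn_layer = 1,
                    num_hidden = 50,
                    input_size = NULL,
                    num_embed = NULL,
                    num_decode = 1,
                    masking = F,
                    loss_output = "linear",
                    dropout = 0.2,
                    ignore_label = -1,
                    cell_type = "lstm",
                    output_last_state = F,
                    config = "one-to-one")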

ankkhedia commented 5 years ago

Will it work for CPU too? It's mentioned in the spec that it is only for CUDA.

jeremiedb commented 5 years ago

CPU appears to be fully supported by symbol.RNN since 1.3.1 (prior to 1.3.1, dropout wasn't supported).
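
You can check which build you're running from R:

packageVersion("mxnet")  # dropout with symbol.RNN requires >= 1.3.1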