rstudio / keras3

R Interface to Keras
https://keras3.posit.co/

Modeling multivariate time-series data with FCN #664

Closed · smorford closed this 5 years ago

smorford commented 5 years ago

Hi all, I'm looking for some assistance with implementing the FCN architecture on multivariate time-series data. I've attempted to translate the FCN architecture from this Python implementation (the link is in the code comments below). I'm new to using deep learning frameworks, so it's very likely I'm implementing this incorrectly.

My input is a 3-dimensional array [samples, time steps, features]. My response variable is a 1-d array with five classes, encoded as integers 0 through 4.

I can train the model, but predict() returns (roughly) the overall class proportions of the data for every sample, i.e. essentially the same probability vector regardless of the input. I'm wondering if anyone can point me to what I'm doing wrong.

The code segment below will download the training data from a Google Cloud bucket (~111 MB, beware!).

Any help would be greatly appreciated.
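
(If you'd rather not pull the 111 MB file, a minimal stand-in with random arrays of the same shapes lets you run the rest of the code; the sample counts below are placeholders, not the real dataset sizes.)

# optional synthetic stand-in for the downloaded data, same shapes
# n_train / n_val are placeholder sizes, not the real ones
n_train <- 1000
n_val <- 200
x_train <- array(rnorm(n_train * 66 * 32), dim = c(n_train, 66, 32))
x_val <- array(rnorm(n_val * 66 * 32), dim = c(n_val, 66, 32))
y_train <- sample(0:4, n_train, replace = TRUE)
y_val <- sample(0:4, n_val, replace = TRUE)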

library(keras)

# set working dir; change this to a scratch location of your choosing
workingDir <- "/nfs/scratch/fcn"

# specify the destination of the file
dest.file <- file.path(workingDir, "trainingData.RData")

# download and load the data (file is ~111 MB)
download.file("http://data.terra-analytics.net/train_data.RData", dest.file)
load(dest.file)

# 'x' data is a 3-dimensional array with dimensions [n, 66, 32], where
# dim 1 indexes the sample,
# dim 2 the time step, and
# dim 3 the feature

dim(x_train)
dim(x_val)

# 'y' data is the output classes (5 classes) encoded as a 1-d array
# with integer values 0 to 4
dim(y_train)
dim(y_val)
str(y_train)
unique(y_train)

# implement the FCN architecture based on
# https://github.com/hfawaz/dl-4-tsc/blob/master/classifiers/fcn.py

get_fcn <- function(input_shape = c(66,32),
                    num_classes = 5) {

  inputs <- layer_input(shape = input_shape)

  conv1 <- inputs %>%
    layer_conv_1d(filters = 128, kernel_size = 8, padding = "same") %>%
    layer_batch_normalization() %>%
    layer_activation("relu")

  conv2 <- conv1 %>%
    layer_conv_1d(filters = 256, kernel_size = 5, padding = "same") %>%
    layer_batch_normalization() %>%
    layer_activation("relu")

  conv3 <- conv2 %>%
    layer_conv_1d(filters = 128, kernel_size = 3, padding = "same") %>%
    layer_batch_normalization() %>%
    layer_activation("relu")

  gap_layer <- conv3 %>% 
    layer_global_average_pooling_1d()

  output_layer <- gap_layer %>%
    layer_dense(units = num_classes, activation = "softmax")      

  model <- keras_model(
    inputs = inputs,
    outputs = output_layer
  )

  model %>% compile(
    optimizer = optimizer_adam(),
    loss = 'sparse_categorical_crossentropy',
    metrics = 'accuracy'
  )

  return(model)
}
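
Note that sparse_categorical_crossentropy expects exactly these integer labels. An equivalent alternative (a sketch, not something I ran here) is to one-hot encode the targets and switch the loss:

# equivalent alternative: one-hot encode the integer labels with
# to_categorical() and compile with loss = 'categorical_crossentropy'
y_train_oh <- to_categorical(y_train, num_classes = 5)
y_val_oh <- to_categorical(y_val, num_classes = 5)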

callbacks <- list(
  callback_reduce_lr_on_plateau(monitor = "loss", factor = 0.5,
                                patience = 50, min_lr = 0.0001),
  callback_model_checkpoint(filepath = "satmodel_test.hd5",
                            monitor = "loss", save_best_only = TRUE)
)

model <- get_fcn()

train_history <- model %>% fit(
  x = x_train,
  y = y_train,
  validation_data = list(x_val, y_val),
  callbacks = callbacks,
  epochs = 50,
  batch_size = 100)

plot(train_history)

out_y <- predict(model, x_val)

out_y[1,]
#[1] 0.68179649 0.12124907 0.12744266 0.04772407 0.02178771

out_y[1000,]
#[1] 0.68179649 0.12124907 0.12744266 0.04772407 0.02178771

sum(y_val == 0)/length(y_val)
#[1] 0.6845466

sum(y_val == 1)/length(y_val)
#[1] 0.1272882
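
A quick check that the model has collapsed to predicting the class priors for every input (a sketch using base R; which.max gives the per-row argmax):

# predicted class per sample (which.max is 1-based, labels are 0-based)
pred_class <- apply(out_y, 1, which.max) - 1
# if every row of out_y is identical, all mass lands on a single class
table(pred_class)
# compare with the empirical class proportions in y_val
table(y_val) / length(y_val)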
skeydan commented 5 years ago

The R code itself looks good to me. Did you try running the Python version to see whether you get the same results? And is the original Python version supposed to work with multivariate data?

smorford commented 5 years ago

Thanks skeydan. I found that if I reduced the number of features in the training data, the model worked correctly. This can be closed.
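
For later readers, a minimal sketch of what subsetting the feature dimension of a 3-d array looks like; keeping the first 16 of the 32 features is a hypothetical choice, as the actual subset used was not posted:

# hypothetical feature subset: keep the first 16 of 32 features
x_train_sub <- x_train[, , 1:16, drop = FALSE]
x_val_sub <- x_val[, , 1:16, drop = FALSE]
# rebuild the model with the matching input shape
model <- get_fcn(input_shape = c(66, 16))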