Closed lauritzd closed 4 years ago
Please check your EEG_Class function. Try running it outside of the model. Make sure it raises no errors, that all inputs have the same dimensions, and that no one-hot vector is all zeros. Your model build, compile, and fit steps look good. Next time, please use StackOverflow for coding questions; the GitHub issues section is meant for build issues and feature requests.
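One way to act on the "no all-zero one-hots" advice is sketched below in base R, so it runs without TensorFlow. It assumes that tf$one_hot() treats indices as 0-based and returns an all-zero row for any index outside [0, depth). The helper one_hot() and the label values are hypothetical stand-ins, not part of the original code.

```r
# Mimic tf$one_hot(): indices are 0-based, and any index outside
# [0, depth) yields an all-zero row instead of an error.
one_hot <- function(labels, depth) {
  out <- matrix(0, nrow = length(labels), ncol = depth)
  ok <- !is.na(labels) & labels >= 0 & labels < depth
  out[cbind(which(ok), labels[ok] + 1)] <- 1
  out
}

# Hypothetical labels coded 1..4 instead of 0..3
emo <- c(1L, 2L, 3L, 4L, 2L)
encoded <- one_hot(emo, depth = 4L)

# Rows that are entirely zero mark labels the encoding could not represent
bad_rows <- which(rowSums(encoded) == 0)
print(bad_rows)
print(dim(encoded))
```

With these made-up labels, the row for label 4 comes back all zeros. If the Emo column were coded 1..4, subtracting 1 before encoding would avoid that. Whether this applies to the actual data is not known from the thread.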
I sincerely apologize for posting in the wrong place. Thank you alexanderbeatson for the suggestions and the time you took to respond. I have yet to find a solution, but you have given me some ideas. Thanks again.
I have been trying to use TFdatasets to load batches from a collection of 133 .csv files into a keras GRU classification model, but have met several obstacles. Code, error messages, and config details from my best attempt follow. The code is adapted (poorly) from several examples on the rstudio tensorflow page (e.g., "R interface to TensorFlow Dataset API").

Data summary: The bulk of the data are time samples from 57 electrodes. The target/DV is a classification stored as an integer column (Emo).

In this attempt, a function uses make_csv_dataset() to import from a list of files, then applies one-hot encoding (4 levels) before ending with dataset_prepare(). The output shapes look reasonable, x = (Batch, 57) and y = (Batch, 4), but keras produces Error in py_call_impl(callable, dots$args, dots$keywords) : KeyError (see below for the full message) at my first layer. I am trying to get this code to cycle through a list of .csv files, but I have selected only a small amount of data for this example.

Apparently I am missing something important, but I haven't been able to determine what I need to do. I have some R experience and have worked with keras models, but this is my first tfdatasets attempt. I wouldn't ask unless I had exhausted all my options, so I humbly thank you in advance for any suggestions.
CODE:
#################figure out how to use TFdatasets
#load 133 csv (each with +1000 time samples 57 electrodes)
#into a keras model for classification.
install.packages("devtools")
devtools::install_github(c("rstudio/reticulate"))
library(reticulate)
devtools::install_github(c("rstudio/tfdatasets"))
devtools::install_github(c("rstudio/tensorflow"))
devtools::install_github(c("rstudio/keras"))
library("tensorflow")
library("tfdatasets")
library("keras")
library("tidyverse")

rm(list=ls())
SAVE_Dir <- paste("F:/Peri_ERN/R/SubCat/Val40/OutModels", sep = "")
EmoKey <- c("Val")
NumCatz <- 4 #Number of categories
NofTrainTime <- 128 #For a small size test
NofValidTimes <- 1 #ignore for now
NofEpochs <- 1
BatchN <- 32
EmoDir <- paste("F:/Peri_ERN/R/SubCat/Val40/Part/", sep = "") #To get CSV
setwd(EmoDir)
filesTE <- list.files(path = paste(EmoDir), full.names = TRUE)
#Get the dataset, change Emo to a categorical variable, and output to keras
EEG_Class <- function(filesTE) {
  Clz <- make_csv_dataset(list(filesTE), batch_size = BatchN, header = TRUE,
                          num_epochs = NofEpochs, shuffle = FALSE,
                          prefetch_buffer_size = BatchN, #Recommended value is the number of batches consumed per training step
                          sloppy = FALSE, num_rows_for_inference = 100) %>%
    dataset_map(function(record) {
      record$Emo <- tf$one_hot(record$Emo, 4L)
      record
    }) %>%
    dataset_prepare(x = -Emo, y = Emo, drop_remainder = TRUE,
                    named_features = FALSE, named = TRUE) #%>%
  Clz %>% dataset_repeat()
} #end function EEG_Class
output_shapes(Clz) #results in $x (32, 57), $y (32, 4)
output_types(Clz) #both <dtype: 'float32'>
Clz %>% reticulate::as_iterator() %>% reticulate::iter_next() %>% reticulate::py_to_r()
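A note on the shapes reported above: Keras recurrent layers such as layer_gru() take 3-D input of shape (batch, timesteps, features), while this pipeline yields x = (Batch, 57), one time sample per row. The base R sketch below shows one possible way to group consecutive samples into fixed-length windows. The helper make_windows(), the window length of 32, and the random placeholder data are all assumptions for illustration. It does not address the KeyError itself, and it leaves open how a label would be assigned to each window.

```r
# Group consecutive time samples into fixed-length windows.
# samples: numeric matrix, one row per time sample, one column per electrode.
# Returns an array of shape (n_windows, timesteps, n_features);
# leftover rows that do not fill a whole window are dropped.
make_windows <- function(samples, timesteps) {
  n_windows <- nrow(samples) %/% timesteps
  n_features <- ncol(samples)
  out <- array(0, dim = c(n_windows, timesteps, n_features))
  for (w in seq_len(n_windows)) {
    rows <- ((w - 1) * timesteps + 1):(w * timesteps)
    out[w, , ] <- samples[rows, , drop = FALSE]
  }
  out
}

# Placeholder data: 1000 time samples from 57 electrodes
samples <- matrix(rnorm(1000 * 57), nrow = 1000, ncol = 57)
windows <- make_windows(samples, timesteps = 32)
print(dim(windows))
```

An array shaped this way matches input_shape = list(NULL, 57) in the model below, since the second dimension is the number of timesteps and the third is the 57 features.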
##########################model
#GRU2
model <- NULL
model <- keras_model_sequential()
model %>%
  layer_gru(units = 64, dropout = 0.25, recurrent_dropout = 0.25,
            return_sequences = TRUE, batch_size = BatchN,
            reset_after = TRUE, recurrent_activation = "sigmoid",
            input_shape = list(NULL, 57)
  ) %>% #End GRU1
  layer_gru(units = 64, dropout = 0.25, recurrent_dropout = 0.25,
            batch_size = BatchN, reset_after = TRUE,
            recurrent_activation = "sigmoid"
  ) %>% #End GRU2
  layer_dense(NumCatz, activation = "softmax", batch_size = BatchN)

model %>% compile(
  optimizer = optimizer_adam(),
  metrics = 'accuracy',
  loss = list("categorical_crossentropy")
) #End compile
summary(model)
#######Run this model
history <- model %>% fit(EEG_Class(filesTE),
  steps_per_epoch = NofTrainTime/BatchN,
  epochs = NofEpochs, #, #For this test
  verbose = 1,
  callbacks = callbacks_list
) #End run model
############################################################
OUTPUT and ERROR MESSAGE
Train for 40 steps
1/40 [..............................] - ETA: 2s
Error in py_call_impl(callable, dots$args, dots$keywords) :
  KeyError: 'gru_input'

Detailed traceback:
  File "C:\Users\Lauritz\Anaconda3\envs\r-reticulate\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 728, in fit
    use_multiprocessing=use_multiprocessing)
  File "C:\Users\Lauritz\Anaconda3\envs\r-reticulate\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 324, in fit
    total_epochs=epochs)
  File "C:\Users\Lauritz\Anaconda3\envs\r-reticulate\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 123, in run_one_epoch
    batch_outs = execution_function(iterator)
  File "C:\Users\Lauritz\Anaconda3\envs\r-reticulate\lib\site-packages\tensorflow_core\python\keras\engine\training_v2_utils.py", line 86, in execution_function
    distributed_function(input_fn))
  File "C:\Users\Lauritz\Anaconda3\envs\r-reticulate\lib\site-packages\tensorflow_core\python\eager\def_function.py", line 457, in __call__
    result = self._call(*args, **kwds)
  File "C:\Users\Lauritz
#############################################################
CONFIG DETAILS
python versions found: C:/Users/Lauritz/Anaconda3/envs/r-reticulate/python.exe C:/Users/Lauritz/Anaconda3/python.exe