rstudio / tensorflow

TensorFlow for R
https://tensorflow.rstudio.com
Apache License 2.0

Unable to import TFdatasets into Keras #393

Closed lauritzd closed 4 years ago

lauritzd commented 4 years ago

I have been trying to use tfdatasets to load batches from a collection of 133 .csv files into a keras GRU classification model, but have met several obstacles. Code, error messages, and config details from my best attempt follow. The code is adapted (poorly) from several examples on the RStudio TensorFlow page (e.g., "R interface to TensorFlow Dataset API").

Data summary: The bulk of the data are time samples from 57 electrodes. The target/DV is a classification stored as an integer column (Emo).

In this attempt, a function uses make_csv_dataset() to import from a list of files, applies one-hot encoding (4 levels), and ends with dataset_prepare(). The output shape looks reasonable, x = (Batch, 57) and y = (Batch, 4), but keras produces "Error in py_call_impl(callable, dots$args, dots$keywords) : KeyError" (see below for the full message) for my first layer. I am trying to get this code to cycle through a list of .csv files, but I have selected only a small amount of data for this example.

Apparently I am missing something important, but I haven't been able to determine what I need to do. I have some R experience and have worked with keras models, but this is my first tfdatasets attempt. I wouldn't ask unless I had exhausted all my options, so I humbly thank you in advance for any suggestions.

CODE:

################# figure out how to use TFdatasets
# load 133 csv (each with +1000 time samples 57 electrodes)
# into a keras model for classification.
# install.packages("devtools")
# devtools::install_github(c("rstudio/reticulate"))
# library(reticulate)
# devtools::install_github(c("rstudio/tfdatasets"))
# devtools::install_github(c("rstudio/tensorflow"))
# devtools::install_github(c("rstudio/keras"))
# library("tensorflow")
library("tensorflow")
library("tfdatasets")
library("keras")
library("tidyverse")

rm(list=ls())
SAVE_Dir <- paste("F:/Peri_ERN/R/SubCat/Val40/OutModels", sep = "")
EmoKey <- c("Val")
NumCatz <- 4 #Number of categories
NofTrainTime <- 128 #For a small size test
# NofValidTimes <- 1 #ignore for now
NofEpochs <- 1
BatchN <- 32
EmoDir <- paste("F:/Peri_ERN/R/SubCat/Val40/Part/", sep = "") #To get CSV
setwd(EmoDir)
filesTE <- list.files(path = paste(EmoDir), full.names = TRUE)

# Get the dataset, change Emo to a categorical variable, and output to keras
EEG_Class <- function(filesTE) {
  Clz <- make_csv_dataset(list(filesTE), batch_size = BatchN, header = TRUE, num_epochs = NofEpochs,
                          shuffle = FALSE,
                          prefetch_buffer_size = BatchN, #Recommended value is the number of batches consumed per training step
                          sloppy = FALSE, num_rows_for_inference = 100) %>%
    dataset_map(
      function(record) {
        record$Emo <- tf$one_hot(record$Emo, 4L)
        record
      }) %>%
    dataset_prepare(x = -Emo, y = Emo, drop_remainder = TRUE, named_features = FALSE, named = TRUE) #%>%
  Clz %>% dataset_repeat()
} #end function EEG_Class

# output_shapes(Clz) #results in $x (32, 57), $y (32, 4)
# output_types(Clz) #both <dtype: 'float32'>
# Clz %>% reticulate::as_iterator() %>% reticulate::iter_next() %>% reticulate::py_to_r()

########################## model
# GRU2
model <- NULL
model <- keras_model_sequential()
model %>%
  layer_gru(units = 64, dropout = 0.25, recurrent_dropout = 0.25, return_sequences = TRUE, batch_size = BatchN,
            reset_after = TRUE, recurrent_activation = "sigmoid",
            input_shape = list(NULL, 57)
  ) %>% #End GRU1
  layer_gru(units = 64, dropout = 0.25, recurrent_dropout = 0.25, batch_size = BatchN,
            reset_after = TRUE, recurrent_activation = "sigmoid"
  ) %>% #End GRU2
  layer_dense(NumCatz, activation = "softmax", batch_size = BatchN)

model %>% compile(
  optimizer = optimizer_adam(),
  metrics = 'accuracy',
  loss = list("categorical_crossentropy")
) #End compile
summary(model)

####### Run this model
history <- model %>% fit(EEG_Class(filesTE),
  steps_per_epoch = NofTrainTime/BatchN,
  epochs = NofEpochs, #, #For this test
  verbose = 1,
  # callbacks = callbacks_list
) #End run model

############################################################
# OUTPUT and ERROR MESSAGE

Train for 40 steps
1/40 [..............................] - ETA: 2s
Error in py_call_impl(callable, dots$args, dots$keywords) :
  KeyError: 'gru_input'

Detailed traceback:
  File "C:\Users\Lauritz\Anaconda3\envs\r-reticulate\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 728, in fit
    use_multiprocessing=use_multiprocessing)
  File "C:\Users\Lauritz\Anaconda3\envs\r-reticulate\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 324, in fit
    total_epochs=epochs)
  File "C:\Users\Lauritz\Anaconda3\envs\r-reticulate\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 123, in run_one_epoch
    batch_outs = execution_function(iterator)
  File "C:\Users\Lauritz\Anaconda3\envs\r-reticulate\lib\site-packages\tensorflow_core\python\keras\engine\training_v2_utils.py", line 86, in execution_function
    distributed_function(input_fn))
  File "C:\Users\Lauritz\Anaconda3\envs\r-reticulate\lib\site-packages\tensorflow_core\python\eager\def_function.py", line 457, in __call__
    result = self._call(*args, **kwds)
  File "C:\Users\Lauritz

############################################################# CONFIG DETAILS

tensorflow::tf_config()
TensorFlow v2.0.0 (C:\Users\Lauritz\ANACON~1\envs\R-RETI~1\lib\site-packages\tensorflow\__init__.p)
Python v3.6 (C:/Users/Lauritz/Anaconda3/envs/r-reticulate/python.exe)

reticulate::py_config()
python:         C:/Users/Lauritz/Anaconda3/envs/r-reticulate/python.exe
libpython:      C:/Users/Lauritz/Anaconda3/envs/r-reticulate/python36.dll
pythonhome:     C:/Users/Lauritz/Anaconda3/envs/r-reticulate
version:        3.6.9 |Anaconda, Inc.| (default, Jul 30 2019, 14:00:49) [MSC v.1915 64 bit (AMD64)]
Architecture:   64bit
numpy:          C:/Users/Lauritz/Anaconda3/envs/r-reticulate/Lib/site-packages/numpy
numpy_version:  1.17.3
tensorflow:     C:\Users\Lauritz\ANACON~1\envs\R-RETI~1\lib\site-packages\tensorflow\__init__.p

python versions found:
 C:/Users/Lauritz/Anaconda3/envs/r-reticulate/python.exe
 C:/Users/Lauritz/Anaconda3/python.exe

version
               _
platform       x86_64-w64-mingw32
arch           x86_64
os             mingw32
system         x86_64, mingw32
status
major          3
minor          6.2
year           2019
month          12
day            12
svn rev        77560
language       R
version.string R version 3.6.2 (2019-12-12)
nickname       Dark and Stormy Night

packageVersion("tensorflow")
[1] ‘2.0.0’

alexanderbeatson commented 4 years ago

Please check your EEG_Class function. Try running it outside of the model. Make sure it raises no errors, that every input has the same dimensions, and that there are no all-zero one-hot rows. Your model build, compile, and fit calls look good. Next time, please use Stack Overflow for coding questions. The GitHub issues section is meant for build issues and feature requests.
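The checks suggested above can be sketched in base R. This is only a rough illustration, assuming one batch has already been pulled out of the dataset and converted to an R list holding the matrices `x` and `y` (for instance with the `as_iterator()` / `iter_next()` / `py_to_r()` line from the original post). The helper name `check_batch` and its arguments are hypothetical and are not part of tfdatasets or keras; the toy matrices stand in for real data.

```r
# Hypothetical helper: validate one batch that has already been converted to R.
# `batch` is assumed to be a list with numeric matrices `x` (features) and
# `y` (one-hot labels), matching the shapes reported in the original post.
check_batch <- function(batch, n_features, n_classes) {
  x <- batch$x
  y <- batch$y
  problems <- character(0)

  if (!is.matrix(x) || ncol(x) != n_features) {
    problems <- c(problems, "x does not have the expected number of feature columns")
  }
  if (!is.matrix(y) || ncol(y) != n_classes) {
    problems <- c(problems, "y does not have the expected number of class columns")
  }
  if (is.matrix(x) && is.matrix(y) && nrow(x) != nrow(y)) {
    problems <- c(problems, "x and y have different numbers of rows")
  }
  if (is.matrix(x) && anyNA(x)) {
    problems <- c(problems, "x contains missing values")
  }
  if (is.matrix(y)) {
    # A valid one-hot row sums to exactly 1; a row of zeros means the label
    # fell outside the encoded range.
    bad_rows <- which(rowSums(y) != 1)
    if (length(bad_rows) > 0) {
      problems <- c(problems,
                    sprintf("y has %d row(s) that are not one-hot", length(bad_rows)))
    }
  }
  problems
}

# Toy data standing in for a real batch: 3 samples, 57 features, 4 classes.
good <- list(x = matrix(0.5, nrow = 3, ncol = 57),
             y = diag(4)[c(1, 2, 4), ])
bad  <- list(x = matrix(0.5, nrow = 3, ncol = 57),
             y = rbind(diag(4)[1:2, ], rep(0, 4)))

print(check_batch(good, n_features = 57, n_classes = 4))
print(check_batch(bad,  n_features = 57, n_classes = 4))
```

One way an all-zero row can arise: `tf$one_hot()` uses zero-based indices, so with a depth of 4 only the values 0 to 3 are encoded, and any other value produces a row of zeros. If the Emo column happened to be coded 1 to 4 (an assumption, since the post does not say), every label of 4 would be encoded as all zeros.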

lauritzd commented 4 years ago

I sincerely apologize for posting in the wrong place. Thank you alexanderbeatson for the suggestions and the time you took to respond. I have yet to find a solution, but you have given me some ideas. Thanks again.