rstudio / keras3

R Interface to Keras
https://keras3.posit.co/

Allaire Audio Classification Error #586

Closed bmitre closed 5 years ago

bmitre commented 6 years ago

I have an error when trying to run the following code (see very bottom) from Allaire's GitHub page. I would appreciate all the help I can get. Thank you.

After downloading the speech commands dataset, I first run the following:

https://github.com/jjallaire/optimizing-audio-classification/blob/master/util/splits.R

I then run the following:

https://github.com/jjallaire/optimizing-audio-classification/blob/master/train.R

The first error I get is after sess$run(tf$tables_initializer()). See the following:

sess$run(tf$tables_initializer())
2018-11-03 11:06:41.031782: W T:\src\github\tensorflow\tensorflow\core\framework\op_kernel.cc:1275] OP_REQUIRES failed at lookup_table_init_op.cc:83 : Failed precondition: Table already initialized.
(the same warning repeated five more times with slightly different timestamps)
Error in py_call_impl(callable, dots$args, dots$keywords) :
  FailedPreconditionError: Table already initialized.
  [[Node: string_to_index/hash_table/table_init = InitializeTableV2[Tkey=DT_STRING, Tval=DT_INT64, _device="/job:localhost/replica:0/task:0/device:CPU:0"](string_to_index/hash_table, Const, string_to_index/ToInt64)]]

Caused by op 'string_to_index/hash_table/table_init', defined at:
  File "C:\Users\bgaucher\AppData\Local\CONTIN~1\ANACON~1\envs\R-TENS~1\lib\site-packages\tensorflow\contrib\lookup\lookup_ops.py", line 138, in index_table_from_tensor
    name=name)
  File "C:\Users\bgaucher\AppData\Local\CONTIN~1\ANACON~1\envs\R-TENS~1\lib\site-packages\tensorflow\python\ops\lookup_ops.py", line 1127, in index_table_from_tensor
    init, default_value, shared_name=shared_name, name=hash_table_scope)
  File "C:\Users\bgaucher\AppData\Local\CONTIN~1\ANACON~1\envs\R-TENS~1\lib\site-packages\tensorflow\python\ops\lookup_ops.py", line 279, in __init__
    super(HashTable, self).__init__(table_ref, default_value, initializer)
  [traceback truncated]

The final error I get is after model %>% fit(). See the following:

model %>% fit(

Detailed traceback:
  File "C:\Users\bgaucher\AppData\Local\CONTIN~1\ANACON~1\envs\R-TENS~1\lib\site-packages\tensorflow\python\data\ops\dataset_ops.py", line 168, in make_one_shot_iterator
    "(Original error: %s)" % err)
  [traceback truncated]

skeydan commented 6 years ago

Hello,

this is pretty hard to read because of the formatting. Can you provide the code in a correctly formatted way so it can easily be executed? Speaking of which, it's always best to have short, executable examples so we can reproduce (or try to reproduce) the error.

skeydan commented 5 years ago

Hi, sorry, I didn't see that you'd edited the text, or else I'd have responded earlier.

Unfortunately, this error

ValueError: Failed to create a one-shot iterator for a dataset. `Dataset.make_one_shot_iterator()` does not support datasets that capture stateful objects, such as a `Variable` or `LookupTable`. In these cases, use `Dataset.make_initializable_iterator()`. (Original error: Cannot capture a stateful node (name:string_to_index/hash_table, type:HashTableV2) by value.)

does indeed occur, but on the R side we don't have any control over it (if we use Keras to fit the model).
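For anyone who wants to iterate over such a dataset outside of fit(): the message points at the initializable variant, which can be driven by hand in graph mode. A minimal sketch, assuming the TF 1.x graph-mode API and the tfdatasets iterator helpers (not code from this thread):

library(tensorflow)
library(tfdatasets)

# `ds` is a tfdatasets dataset that captures a stateful lookup table
it <- make_iterator_initializable(ds)   # rather than make_iterator_one_shot()
batch <- iterator_get_next(it)

sess <- tf$Session()
sess$run(tf$tables_initializer())       # initialize the captured hash table first
sess$run(iterator_initializer(it))      # then initialize the iterator itself
str(sess$run(batch))                    # pull one batch to inspect shapes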

However, I think you should be able to use the very similar code from the blog:

https://blogs.rstudio.com/tensorflow/posts/2018-06-06-simple-audio-classification-keras/

bmitre commented 5 years ago

Hi Sigrid,

Thank you for pointing that out. I'm still troubleshooting, but now I know what not to concentrate on. I'll let you know where I go with this, and when I get to where I'm going.

Thank you, Beverly (-:


bmitre commented 5 years ago

Hi Sigrid,

Here’s a new reprex with Falbel’s code. The error is at the very bottom. Any thoughts?

library(stringr)
library(dplyr)
> Attaching package: 'dplyr'
> The following objects are masked from 'package:stats':
>     filter, lag
> The following objects are masked from 'package:base':
>     intersect, setdiff, setequal, union

library(fs)
library(keras)

files <- fs::dir_ls(
  path = "C:/Users/bgaucher/Documents/data/speech_commands_v0.01/",
  recursive = TRUE,
  glob = "*.wav"
)

files <- files[!str_detect(files, "background_noise")]

df <- data_frame(
  fname = files,
  class = fname %>% str_extract("1/.*/") %>%
    str_replace_all("1/", "") %>%
    str_replace_all("/", ""),
  class_id = class %>% as.factor() %>% as.integer() - 1L
)

library(tfdatasets)
ds <- tensor_slices_dataset(df)

window_size_ms <- 30
window_stride_ms <- 10

window_size <- as.integer(16000*window_size_ms/1000)
stride <- as.integer(16000*window_stride_ms/1000)

fft_size <- as.integer(2^trunc(log(window_size, 2)) + 1)
n_chunks <- length(seq(window_size/2, 16000 - window_size/2, stride))
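To make the windowing arithmetic above concrete, here are the values it produces for the defaults used in this thread (30 ms windows, 10 ms stride, 1-second clips at 16 kHz); the numbers in the comments are worked out by hand, as a sketch:

window_size <- as.integer(16000 * 30 / 1000)                 # 480 samples per window
stride      <- as.integer(16000 * 10 / 1000)                 # 160 samples between window starts
fft_size    <- as.integer(2^trunc(log(window_size, 2)) + 1)  # 2^8 + 1 = 257 frequency bins
n_chunks    <- length(seq(window_size/2, 16000 - window_size/2, stride))
# seq(240, 15760, by = 160) has 98 elements, so each clip becomes
# a 98 x 257 x 1 spectrogram "image"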

# shortcuts to the TensorFlow modules used below

audio_ops <- tf$contrib$framework$python$ops$audio_ops

ds <- ds %>% dataset_map(function(obs) {

# a good way to debug when building tfdatasets pipelines is to use a print
# statement like this:
# print(str(obs))

# decoding wav files
audio_binary <- tf$read_file(tf$reshape(obs$fname, shape = list()))
wav <- audio_ops$decode_wav(audio_binary, desired_channels = 1)

# create the spectrogram
spectrogram <- audio_ops$audio_spectrogram(
  wav$audio,
  window_size = window_size,
  stride = stride,
  magnitude_squared = TRUE
)

# normalization
spectrogram <- tf$log(tf$abs(spectrogram) + 0.01)

# moving channels to last dim
spectrogram <- tf$transpose(spectrogram, perm = c(1L, 2L, 0L))

# transform the class_id into a one-hot encoded vector
response <- tf$one_hot(obs$class_id, 30L)

list(spectrogram, response)

})

ds <- ds %>%
  dataset_shuffle(buffer_size = 100) %>%
  dataset_repeat() %>%
  dataset_padded_batch(
    batch_size = 32,
    padded_shapes = list(
      shape(n_chunks, fft_size, NULL),
      shape(NULL)
    )
  )

data_generator <- function(df, batch_size, shuffle = TRUE, window_size_ms = 30, window_stride_ms = 10) {

window_size <- as.integer(16000*window_size_ms/1000)
stride <- as.integer(16000*window_stride_ms/1000)
fft_size <- as.integer(2^trunc(log(window_size, 2)) + 1)
n_chunks <- length(seq(window_size/2, 16000 - window_size/2, stride))

ds <- tensor_slices_dataset(df)

if (shuffle) ds <- ds %>% dataset_shuffle(buffer_size = 100)

ds <- ds %>% dataset_map(function(obs) {

  # decoding wav files
  audio_binary <- tf$read_file(tf$reshape(obs$fname, shape = list()))
  wav <- audio_ops$decode_wav(audio_binary, desired_channels = 1)

  # create the spectrogram
  spectrogram <- audio_ops$audio_spectrogram(
    wav$audio,
    window_size = window_size,
    stride = stride,
    magnitude_squared = TRUE
  )

  spectrogram <- tf$log(tf$abs(spectrogram) + 0.01)
  spectrogram <- tf$transpose(spectrogram, perm = c(1L, 2L, 0L))

  # transform the class_id into a one-hot encoded vector
  response <- tf$one_hot(obs$class_id, 29L) #29L

  list(spectrogram, response)
}) %>%
dataset_repeat()

ds <- ds %>% dataset_padded_batch(batch_size, list(shape(n_chunks, fft_size, NULL), shape(NULL)))

ds
}

set.seed(6)
id_train <- sample(nrow(df), size = 0.7*nrow(df))

ds_train <- data_generator(
  df[id_train,],
  batch_size = 32,
  window_size_ms = 30,
  window_stride_ms = 10
)
ds_validation <- data_generator(
  df[-id_train,],
  batch_size = 32,
  shuffle = FALSE,
  window_size_ms = 30,
  window_stride_ms = 10
)

sess <- tf$Session()

batch <- next_batch(ds_train)

str(sess$run(batch))

window_size <- as.integer(16000*window_size_ms/1000)
stride <- as.integer(16000*window_stride_ms/1000)
fft_size <- as.integer(2^trunc(log(window_size, 2)) + 1)
n_chunks <- length(seq(window_size/2, 16000 - window_size/2, stride))

model <- keras_model_sequential() %>%
  layer_conv_2d(input_shape = c(n_chunks, fft_size, 1),
                filters = 32, kernel_size = c(3,3), activation = 'relu') %>%
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_conv_2d(filters = 64, kernel_size = c(3,3), activation = 'relu') %>%
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_conv_2d(filters = 128, kernel_size = c(3,3), activation = 'relu') %>%
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_conv_2d(filters = 256, kernel_size = c(3,3), activation = 'relu') %>%
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_dropout(rate = 0.25) %>%
  layer_flatten() %>%
  layer_dense(units = 128, activation = 'relu') %>%
  layer_dropout(rate = 0.5) %>%
  layer_dense(units = 30, activation = 'softmax')

model %>% compile(
  loss = loss_categorical_crossentropy,
  optimizer = optimizer_adadelta(),
  metrics = c('accuracy')
)

model %>% fit_generator(
  generator = ds_train,
  steps_per_epoch = 0.7*nrow(df)/32,
  epochs = 10,
  validation_data = ds_validation,
  validation_steps = 0.3*nrow(df)/32
)

> Warning in normalizePath(path.expand(path), winslash, mustWork):
>   path[1]="": The filename, directory name, or volume label syntax is incorrect
> Warning in normalizePath(path.expand(path), winslash, mustWork):
>   path[1]="": The filename, directory name, or volume label syntax is incorrect
> Error in py_call_impl(callable, dots$args, dots$keywords): InvalidArgumentError: Incompatible shapes: [32,29] vs. [32,30]
>   [[Node: training/Adadelta/gradients/loss/dense_2_loss/mul_grad/BroadcastGradientArgs = BroadcastGradientArgs[T=DT_INT32, _class=["loc:@training/Adadelta/gradients/loss/dense_2_loss/mul_grad/Reshape_1"], _device="/job:localhost/replica:0/task:0/device:CPU:0"](training/Adadelta/gradients/loss/dense_2_loss/mul_grad/Shape, training/Adadelta/gradients/loss/dense_2_loss/mul_grad/Shape_1)]]
>
> Detailed traceback:
>   File "C:\Users\bgaucher\AppData\Local\CONTIN~1\ANACON~1\envs\R-TENS~1\lib\site-packages\keras\legacy\interfaces.py", line 91, in wrapper
>     return func(*args, **kwargs)
>   File "C:\Users\bgaucher\AppData\Local\CONTIN~1\ANACON~1\envs\R-TENS~1\lib\site-packages\keras\engine\training.py", line 1418, in fit_generator
>     initial_epoch=initial_epoch)
>   File "C:\Users\bgaucher\AppData\Local\CONTIN~1\ANACON~1\envs\R-TENS~1\lib\site-packages\keras\engine\training_generator.py", line 217, in fit_generator
>     class_weight=class_weight)
>   File "C:\Users\bgaucher\AppData\Local\CONTIN~1\ANACON~1\envs\R-TENS~1\lib\site-packages\keras\engine\training.py", line 1217, in train_on_batch
>     outputs = self.train_function(ins)
>   File "C:\Users\bgaucher\AppData\Local\CONTIN~1\ANACON~1\envs\R-TENS~1\lib\site-packages\keras\backend\tensorflow_backend.py", line 2715, in __call__
>     return self._call(inputs)
>   File "C:\Users\bgaucher\AppData\Local\CONTIN~1\ANACON~1\envs\R-TENS~1\lib\site-packages\keras\backend\tensorflow_backend.py", line 2675, in _call
>     fetched = self._callable_fn(*array_vals)
>   File "C:\Users\bgaucher\AppData\Local\CONTIN~1\ANACON~1\envs\R-TENS~1\lib\site-packages\tensorflow\python\client\session.py", line 1382, in __call__
>     run_metadata_ptr)
>   File "C:\Users\bgaucher\AppData\Local\CONTIN~1\ANACON~1\envs\R-TENS~1\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 519, in __exit__
>     c_api.TF_GetCode(self.status.status))

Thank you, Beverly


skeydan commented 5 years ago

can you please indicate the versions you're using?

bmitre commented 5 years ago

tensorflow::tf_config()
TensorFlow v1.10.0 (C:\Users\bgaucher\AppData\Local\CONTIN~1\ANACON~1\envs\R-TENS~1\lib\site-packages\keras\__init__.p)
Python v3.6 (C:\Users\bgaucher\AppData\Local\Continuum\anaconda3\envs\r-tensorflow\python.exe)

keras:::keras_version()
[1] ‘2.2.4’


skeydan commented 5 years ago

thanks, and versions of R packages?

bmitre commented 5 years ago

sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] reticulate_1.10   reprex_0.2.1      tfdatasets_1.9    fs_1.2.6
[5] dplyr_0.7.6       stringr_1.3.1     keras_2.2.0       RPostgreSQL_0.6-2
[9] DBI_1.0.0

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.17     compiler_3.4.4   pillar_1.3.0     bindr_0.1.1
 [5] base64enc_0.1-3  tools_3.4.4      digest_0.6.15    zeallot_0.1.0
 [9] MatchIt_3.0.2    debugme_1.1.0    evaluate_0.11    jsonlite_1.5
[13] tibble_1.4.2     lattice_0.20-35  pkgconfig_2.0.1  rlang_0.2.1
[17] Matrix_1.2-12    rstudioapi_0.7   yaml_2.1.19      bindrcpp_0.2.2
[21] knitr_1.20       rprojroot_1.3-2  grid_3.4.4       tidyselect_0.2.4
[25] glue_1.3.0       R6_2.3.0         processx_3.2.0   rmarkdown_1.10
[29] callr_2.0.4      purrr_0.2.5      clipr_0.4.1      magrittr_1.5
[33] whisker_0.3-2    ps_1.2.0         backports_1.1.2  tfruns_1.4
[37] htmltools_0.3.6  MASS_7.3-49      assertthat_0.2.0 tensorflow_1.9
[41] stringi_1.1.7    crayon_1.3.4


skeydan commented 5 years ago

thanks, I'll get back to you!

bmitre commented 5 years ago

I changed the following parts of the code from 30 to 29, and the code started moving a bit:

###################
data_generator <- function(df, batch_size, shuffle = TRUE, window_size_ms = 30, window_stride_ms = 10) {

window_size <- as.integer(16000*window_size_ms/1000)
stride <- as.integer(16000*window_stride_ms/1000)
fft_size <- as.integer(2^trunc(log(window_size, 2)) + 1)
n_chunks <- length(seq(window_size/2, 16000 - window_size/2, stride))

ds <- tensor_slices_dataset(df)

if (shuffle) ds <- ds %>% dataset_shuffle(buffer_size = 100)

ds <- ds %>% dataset_map(function(obs) {

  # decoding wav files
  audio_binary <- tf$read_file(tf$reshape(obs$fname, shape = list()))
  wav <- audio_ops$decode_wav(audio_binary, desired_channels = 1)

  # create the spectrogram
  spectrogram <- audio_ops$audio_spectrogram(
    wav$audio,
    window_size = window_size,
    stride = stride,
    magnitude_squared = TRUE
  )

  spectrogram <- tf$log(tf$abs(spectrogram) + 0.01)
  spectrogram <- tf$transpose(spectrogram, perm = c(1L, 2L, 0L))

  # transform the class_id into a one-hot encoded vector
  response <- tf$one_hot(obs$class_id, 29L) #29L

  list(spectrogram, response)
}) %>%
dataset_repeat()

ds <- ds %>% dataset_padded_batch(batch_size, list(shape(n_chunks, fft_size, NULL), shape(NULL)))

ds
}

. . .

model <- keras_model_sequential() %>%
  layer_conv_2d(input_shape = c(n_chunks, fft_size, 1),
                filters = 32, kernel_size = c(3,3), activation = 'relu') %>%
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_conv_2d(filters = 64, kernel_size = c(3,3), activation = 'relu') %>%
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_conv_2d(filters = 128, kernel_size = c(3,3), activation = 'relu') %>%
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_conv_2d(filters = 256, kernel_size = c(3,3), activation = 'relu') %>%
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_dropout(rate = 0.25) %>%
  layer_flatten() %>%
  layer_dense(units = 128, activation = 'relu') %>%
  layer_dropout(rate = 0.5) %>%
  layer_dense(units = 29, activation = 'softmax')

Beverly


bmitre commented 5 years ago

After making the changes from 30 to 29 on the lines above, I got the following output, including warnings:

Epoch 1/10
1415/1415 [==============================] - 2737s 2s/step - loss: 2.1113 - acc: 0.3595 - val_loss: 0.9420 - val_acc: 0.6909
Epoch 2/10
1415/1415 [==============================] - 2708s 2s/step - loss: 1.1739 - acc: 0.6244 - val_loss: 0.5909 - val_acc: 0.7939
Epoch 3/10
1415/1415 [==============================] - 2645s 2s/step - loss: 0.9674 - acc: 0.6883 - val_loss: 0.5832 - val_acc: 0.7949
Epoch 4/10
1415/1415 [==============================] - 2903s 2s/step - loss: 0.8968 - acc: 0.7144 - val_loss: 0.5151 - val_acc: 0.8194
Epoch 5/10
1415/1415 [==============================] - 2793s 2s/step - loss: 0.8815 - acc: 0.7243 - val_loss: 0.5381 - val_acc: 0.8109
Epoch 6/10
1415/1415 [==============================] - 3049s 2s/step - loss: 0.8513 - acc: 0.7353 - val_loss: 0.4639 - val_acc: 0.8338
Epoch 7/10
1415/1415 [==============================] - 19655s 14s/step - loss: 0.8655 - acc: 0.7357 - val_loss: 0.4930 - val_acc: 0.8374
Epoch 8/10
1415/1415 [==============================] - 2908s 2s/step - loss: 0.9355 - acc: 0.7280 - val_loss: 0.5676 - val_acc: 0.8157
Epoch 9/10
1415/1415 [==============================] - 2637s 2s/step - loss: 1.0010 - acc: 0.7189 - val_loss: 0.5819 - val_acc: 0.8128
Epoch 10/10
1415/1415 [==============================] - 2636s 2s/step - loss: 1.0958 - acc: 0.7060 - val_loss: 0.6453 - val_acc: 0.8086
Warning messages:
1: In normalizePath(path.expand(path), winslash, mustWork) :
  path[1]="": The filename, directory name, or volume label syntax is incorrect
2: In normalizePath(path.expand(path), winslash, mustWork) :
  path[1]="": The filename, directory name, or volume label syntax is incorrect

I also got the following plots, which don't look right. I look forward to all input/comments:

[plot image not shown]

bmitre commented 5 years ago

And upon running the last of the code, I got this plot:

df_validation <- df[-id_train,]
n_steps <- nrow(df_validation)/32 + 1

predictions <- predict_generator(
  model,
  ds_validation,
  steps = n_steps
)
str(predictions)

classes <- apply(predictions, 1, which.max) - 1

library(dplyr)
library(alluvial)
x <- df_validation %>%
  mutate(pred_class_id = head(classes, nrow(df_validation))) %>%
  left_join(
    df_validation %>% distinct(class_id, class) %>% rename(pred_class = class),
    by = c("pred_class_id" = "class_id")
  ) %>%
  mutate(correct = pred_class == class) %>%
  count(pred_class, class, correct)

alluvial(
  x %>% select(class, pred_class),
  freq = x$n,
  col = ifelse(x$correct, "lightblue", "red"),
  border = ifelse(x$correct, "lightblue", "red"),
  alpha = 0.6,
  hide = x$n < 20
)

[alluvial plot image not shown]

skeydan commented 5 years ago

Hi, I've corrected the 29 to 30 in the blog (one piece of code already had 30, but the generator function didn't).

But otherwise, this runs as intended for me.
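One way to keep those two numbers from drifting apart, by the way, is to derive the class count from the data and use that single value in both places. A minimal sketch (not part of the original code):

# Compute the number of classes once from `df`, so the one-hot depth
# and the softmax width always agree.
n_classes <- dplyr::n_distinct(df$class)

# in the dataset_map() step:
#   response <- tf$one_hot(obs$class_id, as.integer(n_classes))
# and in the model:
#   layer_dense(units = n_classes, activation = 'softmax')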

For reference, this is the exact code I'm running (it seems your code is partly different; for example, it uses a different optimizer):

library(stringr)
library(dplyr)
library(keras)
library(tfdatasets)

files <- fs::dir_ls(
  path = "data/speech_commands_v0.01/", 
  recursive = TRUE, 
  glob = "*.wav"
)

files <- files[!str_detect(files, "background_noise")]

df <- data_frame(
  fname = files, 
  class = fname %>% str_extract("1/.*/") %>% 
    str_replace_all("1/", "") %>%
    str_replace_all("/", ""),
  class_id = class %>% as.factor() %>% as.integer() %>% `-`(1L) 
)

ds <- tensor_slices_dataset(df)

window_size_ms <- 30
window_stride_ms <- 10

window_size <- as.integer(16000*window_size_ms/1000)
stride <- as.integer(16000*window_stride_ms/1000)

fft_size <- as.integer(2^trunc(log(window_size, 2)) + 1)
n_chunks <- length(seq(window_size/2, 16000 - window_size/2, stride))

audio_ops <- tf$contrib$framework$python$ops$audio_ops

ds <- ds %>%
  dataset_map(function(obs) {

    # a good way to debug when building tfdatasets pipelines is to use a print
    # statement like this:
    # print(str(obs))

    # decoding wav files
    audio_binary <- tf$read_file(tf$reshape(obs$fname, shape = list()))
    wav <- audio_ops$decode_wav(audio_binary, desired_channels = 1)

    # create the spectrogram
    spectrogram <- audio_ops$audio_spectrogram(
      wav$audio, 
      window_size = window_size, 
      stride = stride,
      magnitude_squared = TRUE
    )

    # normalization
    spectrogram <- tf$log(tf$abs(spectrogram) + 0.01)

    # moving channels to last dim
    spectrogram <- tf$transpose(spectrogram, perm = c(1L, 2L, 0L))

    # transform the class_id into a one-hot encoded vector
    response <- tf$one_hot(obs$class_id, 30L)

    list(spectrogram, response)
  })

ds <- ds %>% 
  dataset_shuffle(buffer_size = 100) %>%
  dataset_repeat() %>%
  dataset_padded_batch(
    batch_size = 32, 
    padded_shapes = list(
      shape(n_chunks, fft_size, NULL), 
      shape(NULL)
    )
  )
data_generator <- function(df, batch_size, shuffle = TRUE, 
                           window_size_ms = 30, window_stride_ms = 10) {

  window_size <- as.integer(16000*window_size_ms/1000)
  stride <- as.integer(16000*window_stride_ms/1000)
  fft_size <- as.integer(2^trunc(log(window_size, 2)) + 1)
  n_chunks <- length(seq(window_size/2, 16000 - window_size/2, stride))

  ds <- tensor_slices_dataset(df)

  if (shuffle) 
    ds <- ds %>% dataset_shuffle(buffer_size = 100)  

  ds <- ds %>%
    dataset_map(function(obs) {

      # decoding wav files
      audio_binary <- tf$read_file(tf$reshape(obs$fname, shape = list()))
      wav <- audio_ops$decode_wav(audio_binary, desired_channels = 1)

      # create the spectrogram
      spectrogram <- audio_ops$audio_spectrogram(
        wav$audio, 
        window_size = window_size, 
        stride = stride,
        magnitude_squared = TRUE
      )

      spectrogram <- tf$log(tf$abs(spectrogram) + 0.01)
      spectrogram <- tf$transpose(spectrogram, perm = c(1L, 2L, 0L))

      # transform the class_id into a one-hot encoded vector
      response <- tf$one_hot(obs$class_id, 30L)

      list(spectrogram, response)
    }) %>%
    dataset_repeat()

  ds <- ds %>% 
    dataset_padded_batch(batch_size, list(shape(n_chunks, fft_size, NULL), shape(NULL)))

  ds
}

set.seed(6)
id_train <- sample(nrow(df), size = 0.7*nrow(df))

ds_train <- data_generator(
  df[id_train,], 
  batch_size = 32, 
  window_size_ms = 30, 
  window_stride_ms = 10
)
ds_validation <- data_generator(
  df[-id_train,], 
  batch_size = 32, 
  shuffle = FALSE, 
  window_size_ms = 30, 
  window_stride_ms = 10
)

window_size <- as.integer(16000*window_size_ms/1000)
stride <- as.integer(16000*window_stride_ms/1000)
fft_size <- as.integer(2^trunc(log(window_size, 2)) + 1)
n_chunks <- length(seq(window_size/2, 16000 - window_size/2, stride))

model <- keras_model_sequential()
model %>%  
  layer_conv_2d(input_shape = c(n_chunks, fft_size, 1), 
                filters = 32, kernel_size = c(3,3), activation = 'relu') %>% 
  layer_max_pooling_2d(pool_size = c(2, 2)) %>% 
  layer_conv_2d(filters = 64, kernel_size = c(3,3), activation = 'relu') %>% 
  layer_max_pooling_2d(pool_size = c(2, 2)) %>% 
  layer_conv_2d(filters = 128, kernel_size = c(3,3), activation = 'relu') %>% 
  layer_max_pooling_2d(pool_size = c(2, 2)) %>% 
  layer_conv_2d(filters = 256, kernel_size = c(3,3), activation = 'relu') %>% 
  layer_max_pooling_2d(pool_size = c(2, 2)) %>% 
  layer_dropout(rate = 0.25) %>% 
  layer_flatten() %>% 
  layer_dense(units = 128, activation = 'relu') %>% 
  layer_dropout(rate = 0.5) %>% 
  layer_dense(units = 30, activation = 'softmax')

model %>% compile(
  loss = loss_categorical_crossentropy,
  optimizer = optimizer_adadelta(),
  metrics = c('accuracy')
)

model %>% fit_generator(
  generator = ds_train,
  steps_per_epoch = 0.7*nrow(df)/32,
  epochs = 10, 
  validation_data = ds_validation, 
  validation_steps = 0.3*nrow(df)/32,
  callbacks = list(callback_model_checkpoint("weights.{epoch:02d}-{val_loss:.2f}.hdf5"))
)

df_validation <- df[-id_train,]
n_steps <- nrow(df_validation)/32 + 1

predictions <- predict_generator(
  model, 
  ds_validation, 
  steps = n_steps
)
str(predictions)

classes <- apply(predictions, 1, which.max) - 1

library(dplyr)
library(alluvial)
x <- df_validation %>%
  mutate(pred_class_id = head(classes, nrow(df_validation))) %>%
  left_join(
    df_validation %>% distinct(class_id, class) %>% rename(pred_class = class),
    by = c("pred_class_id" = "class_id")
  ) %>%
  mutate(correct = pred_class == class) %>%
  count(pred_class, class, correct)

alluvial(
  x %>% select(class, pred_class),
  freq = x$n,
  col = ifelse(x$correct, "lightblue", "red"),
  border = ifelse(x$correct, "lightblue", "red"),
  alpha = 0.6,
  hide = x$n < 20
)

bmitre commented 5 years ago

Hi Sigrid,

You rock! Thank you so much! Do the plots look right to you? What are they telling us? That we're not overfitting? Don't the losses start out high, at over 1 and over 2? I apologize if my questions are naïve.

Thank you, Beverly


bmitre commented 5 years ago

Hello,

Did anybody else get the following warning messages, and does anybody have any ideas?

model %>% fit_generator(
  generator = ds_train,
  steps_per_epoch = 0.7*nrow(df)/32,
  epochs = 10,
  validation_data = ds_validation,
  validation_steps = 0.3*nrow(df)/32,
  callbacks = list(callback_model_checkpoint("weights.{epoch:02d}-{val_loss:.2f}.hdf5"))
)
Epoch 1/10
1415/1415 [==============================] - 9372s 7s/step - loss: 2.1226 - acc: 0.3937 - val_loss: 0.9519 - val_acc: 0.7343
Epoch 2/10
1415/1415 [==============================] - 2692s 2s/step - loss: 1.2086 - acc: 0.6556 - val_loss: 0.6414 - val_acc: 0.8240
Epoch 3/10
1415/1415 [==============================] - 2636s 2s/step - loss: 1.0060 - acc: 0.7188 - val_loss: 0.6007 - val_acc: 0.8296
Epoch 4/10
1415/1415 [==============================] - 2928s 2s/step - loss: 0.9280 - acc: 0.7445 - val_loss: 0.7207 - val_acc: 0.8006
Epoch 5/10
1415/1415 [==============================] - 3326s 2s/step - loss: 0.9370 - acc: 0.7517 - val_loss: 0.5714 - val_acc: 0.8378
Epoch 6/10
1415/1415 [==============================] - 2728s 2s/step - loss: 0.9663 - acc: 0.7557 - val_loss: 0.7223 - val_acc: 0.8113
Epoch 7/10
1415/1415 [==============================] - 2633s 2s/step - loss: 1.0237 - acc: 0.7475 - val_loss: 0.6032 - val_acc: 0.8545
Epoch 8/10
1415/1415 [==============================] - 2520s 2s/step - loss: 1.0544 - acc: 0.7475 - val_loss: 0.5706 - val_acc: 0.8574
Epoch 9/10
1415/1415 [==============================] - 2404s 2s/step - loss: 1.0760 - acc: 0.7456 - val_loss: 0.6722 - val_acc: 0.8314
Epoch 10/10
1415/1415 [==============================] - 2453s 2s/step - loss: 1.1110 - acc: 0.7408 - val_loss: 0.6334 - val_acc: 0.8477
Warning messages:
1: In normalizePath(path.expand(path), winslash, mustWork) :
  path[1]="": The filename, directory name, or volume label syntax is incorrect
2: In normalizePath(path.expand(path), winslash, mustWork) :
  path[1]="": The filename, directory name, or volume label syntax is incorrect

df_validation <- df[-id_train,]
n_steps <- nrow(df_validation)/32 + 1

predictions <- predict_generator(
  model,
  ds_validation,
  steps = n_steps
)
Warning message:
In normalizePath(path.expand(path), winslash, mustWork) :
  path[1]="": The filename, directory name, or volume label syntax is incorrect

skeydan commented 5 years ago

Yes, I do get the warnings too, but they don't keep things from working.
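If the repeated warning is distracting, plain base R can muffle just that one message; a generic sketch (not a keras feature, and the pattern string is an assumption matching the message above):

# Wrapper that silences only the known-harmless normalizePath() warning.
quiet_fit <- function(expr) {
  withCallingHandlers(expr, warning = function(w) {
    if (grepl("volume label syntax", conditionMessage(w)))
      invokeRestart("muffleWarning")
  })
}

# hypothetical usage: quiet_fit(model %>% fit_generator(...))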

For the questions above: the alluvial plot shows which classes were most often confused with each other. And regarding the accuracy values, accuracy on the validation set is higher than on the training set, so we're not overfitting. Given the number of classes involved, accuracy is pretty high, too.
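If a numeric summary is easier to read than the plot, the same `x` tibble can be collapsed into per-class accuracy; a minimal sketch using only objects already defined above:

library(dplyr)
x %>%
  group_by(class) %>%
  summarise(accuracy = sum(n[correct]) / sum(n)) %>%  # share of correct predictions per true class
  arrange(accuracy)                                   # hardest classes first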