vineeths96 / Spoken-Keyword-Spotting

In this repository, we explore using a hybrid system consisting of a Convolutional Neural Network and a Support Vector Machine for the keyword spotting task.
MIT License

Cannot batch tensors with different shapes in component 0. First element had shape [1445,40] and element 1 had shape [1330,40]. #15

Open himrlawrrence opened 3 years ago

himrlawrrence commented 3 years ago


Hi vineeths96,

I tried to use my own hotword, "get", instead of "marvin". I added some "get" WAV files to the train folder and created the model, but when I run

# Obtain the feature embeddings
X_train = feature_extractor.predict(get_data, use_multiprocessing=True)

I get this error: Cannot batch tensors with different shapes in component 0. First element had shape [1445,40] and element 1 had shape [1330,40].

Could you please help me out?

Thanks a lot.

JFU

vineeths96 commented 3 years ago

Hi JFU,

Can you post the entire error stack so I can see which statement triggers this error? It's hard to tell from the short snippet you provided.

himrlawrrence commented 3 years ago

Thanks for the quick reply. I have another quick question: must the training WAV files all be the same size (e.g. 32K), or must they be shorter than 1 second?
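The error message says that every element in a batch must have the same shape, so one possible approach is to bring each clip to a common number of samples before extracting features. The sketch below is not taken from the repository: `pad_or_truncate`, `SAMPLE_RATE`, and `CLIP_SECONDS` are hypothetical names and values chosen for illustration, and the right targets depend on how the training data was prepared.

```python
from typing import List, Sequence

# Hypothetical values for illustration; match them to the training data.
SAMPLE_RATE = 16000   # samples per second (assumption)
CLIP_SECONDS = 1.0    # target clip duration (assumption)


def pad_or_truncate(samples: Sequence[float], target_len: int) -> List[float]:
    """Return exactly target_len samples: cut the tail or append zeros."""
    clipped = list(samples[:target_len])
    return clipped + [0.0] * (target_len - len(clipped))


target = int(SAMPLE_RATE * CLIP_SECONDS)
short_clip = [0.1] * 13300   # shorter than the target
long_clip = [0.1] * 17000    # longer than the target

fixed = [pad_or_truncate(clip, target) for clip in (short_clip, long_clip)]
print([len(clip) for clip in fixed])  # both lengths now equal `target`
```

Once all clips have the same sample count, a fixed framing scheme yields the same number of feature frames per clip, which is what batching requires.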

himrlawrrence commented 3 years ago

The error stack is shown below:

C:\Users\JFU\anaconda3\envs\env38\python.exe C:/WORKSPACE/Spoken-Keyword-Spotting/src/main.py
2021-11-02 13:58:01.704671: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2021-11-02 13:58:01.704822: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Training model
Dataset statistics
Train files: 51410
Validation files: 6640
Dev test files: 6675
Test files: 2567
2021-11-02 13:58:07.875339: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'nvcuda.dll'; dlerror: nvcuda.dll not found
2021-11-02 13:58:07.875475: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: UNKNOWN ERROR (303)
2021-11-02 13:58:07.881674: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: JFU-LAPTOP
2021-11-02 13:58:07.881881: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: JFU-LAPTOP
2021-11-02 13:58:07.882268: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2021-11-02 13:58:07.891825: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x19452a35900 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-11-02 13:58:07.891975: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
Model: "sequential"
... ... ..
Total params: 930,403
Trainable params: 927,873
Non-trainable params: 2,530


Epoch 1/25
401/401 [==============================] - 221s 550ms/step - loss: 2.2773 - sparse_categorical_accuracy: 0.3448 - val_loss: 1.2852 - val_sparse_categorical_accuracy: 0.6045 - lr: 0.0010
Epoch 2/25
401/401 [==============================] - 269s 672ms/step - loss: 0.8611 - sparse_categorical_accuracy: 0.7339 - val_loss: 0.5980 - val_sparse_categorical_accuracy: 0.8114 - lr: 0.0010
Epoch 3/25
401/401 [==============================] - 290s 724ms/step - loss: 0.5668 - sparse_categorical_accuracy: 0.8260 - val_loss: 0.3616 - val_sparse_categorical_accuracy: 0.8905 - lr: 0.0010
.........
Epoch 23/25
401/401 [==============================] - 404s 1s/step - loss: 0.1639 - sparse_categorical_accuracy: 0.9494 - val_loss: 0.1798 - val_sparse_categorical_accuracy: 0.9487 - lr: 0.0010
Epoch 24/25
401/401 [==============================] - 408s 1s/step - loss: 0.1644 - sparse_categorical_accuracy: 0.9489 - val_loss: 0.1863 - val_sparse_categorical_accuracy: 0.9490 - lr: 0.0010
Epoch 25/25
401/401 [==============================] - 420s 1s/step - loss: 0.1572 - sparse_categorical_accuracy: 0.9519 - val_loss: 0.1741 - val_sparse_categorical_accuracy: 0.9487 - lr: 0.0010
Saving model
Saving training history
Traceback (most recent call last):
  File "C:/WORKSPACE/Spoken-Keyword-Spotting/src/main.py", line 25, in <module>
    main()
  File "C:/WORKSPACE/Spoken-Keyword-Spotting/src/main.py", line 21, in main
    get_kws_model()
  File "C:\WORKSPACE\Spoken-Keyword-Spotting\src\model_train.py", line 141, in get_kws_model
    X_train = feature_extractor.predict(get_data, use_multiprocessing=True)
  File "C:\Users\JFU\anaconda3\envs\env38\lib\site-packages\tensorflow\python\keras\engine\training.py", line 88, in _method_wrapper
    return method(self, *args, **kwargs)
  File "C:\Users\JFU\anaconda3\envs\env38\lib\site-packages\tensorflow\python\keras\engine\training.py", line 1268, in predict
    tmp_batch_outputs = predict_function(iterator)
  File "C:\Users\JFU\anaconda3\envs\env38\lib\site-packages\tensorflow\python\eager\def_function.py", line 580, in __call__
    result = self._call(*args, **kwds)
  File "C:\Users\JFU\anaconda3\envs\env38\lib\site-packages\tensorflow\python\eager\def_function.py", line 650, in _call
    return self._concrete_stateful_fn._filtered_call(canon_args, canon_kwds)  # pylint: disable=protected-access
  File "C:\Users\JFU\anaconda3\envs\env38\lib\site-packages\tensorflow\python\eager\function.py", line 1661, in _filtered_call
    return self._call_flat(
  File "C:\Users\JFU\anaconda3\envs\env38\lib\site-packages\tensorflow\python\eager\function.py", line 1745, in _call_flat
    return self._build_call_outputs(self._inference_function.call(
  File "C:\Users\JFU\anaconda3\envs\env38\lib\site-packages\tensorflow\python\eager\function.py", line 593, in call
    outputs = execute.execute(
  File "C:\Users\JFU\anaconda3\envs\env38\lib\site-packages\tensorflow\python\eager\execute.py", line 59, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot batch tensors with different shapes in component 0. First element had shape [1445,40] and element 1 had shape [1330,40].
  [[node IteratorGetNext (defined at \WORKSPACE\Spoken-Keyword-Spotting\src\model_train.py:141) ]] [Op:__inference_predict_function_29742]

Function call stack: predict_function

Process finished with exit code 1

vineeths96 commented 3 years ago

I believe the sampling rate should be the same as the one used for training.
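One way to check this is to read the header of each WAV file and list the ones whose sampling rate differs from the expected value. The following is a minimal sketch using only Python's standard `wave` module, which reads uncompressed PCM WAV files. The helper names `wav_info` and `find_mismatched` are hypothetical and are not part of this repository, and the expected rate must be whatever the training data actually uses.

```python
import wave
from typing import Iterable, List, Tuple


def wav_info(path: str) -> Tuple[int, int]:
    """Return (sample_rate_hz, n_frames) read from a PCM WAV header."""
    with wave.open(path, "rb") as wf:
        return wf.getframerate(), wf.getnframes()


def find_mismatched(
    paths: Iterable[str], expected_rate: int
) -> List[Tuple[str, int, int]]:
    """List (path, rate, n_frames) for files whose sampling rate differs."""
    mismatched = []
    for path in paths:
        rate, n_frames = wav_info(path)
        if rate != expected_rate:
            mismatched.append((path, rate, n_frames))
    return mismatched
```

For example, `find_mismatched(glob.glob("train/get/*.wav"), 16000)` would report the offending files, where both the folder path and the 16000 Hz rate are assumptions to replace with the real values. The returned frame counts also show whether the clips differ in duration.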

himrlawrrence commented 3 years ago

Hi vineeth, how do I train the model on a different keyword? Thanks, JFU