DemisEom / SpecAugment

An implementation of SpecAugment with TensorFlow & PyTorch, introduced by Google Brain
Apache License 2.0

ValueError: slice index 2 of dimension 0 out of bounds #19

Open KiAlexander opened 5 years ago

KiAlexander commented 5 years ago

When I try to apply SpecAugment to an audio file, something goes wrong. Does anyone have an idea how to solve the following problem?

Python 3.6.9 |Anaconda, Inc.| (default, Jul 30 2019, 19:07:31)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

import librosa
from specAugment import spec_augment_tensorflow
audio, sampling_rate = librosa.load("./data/61-70968-0002.wav")
mel_spectrogram = librosa.feature.melspectrogram(y=audio, sr=sampling_rate, n_mels=256, hop_length=128, fmax=8000)
warped_masked_spectrogram = spec_augment_tensorflow.spec_augment(mel_spectrogram=mel_spectrogram)

warped_masked_spectrogram = spec_augment_tensorflow.spec_augment(mel_spectrogram=mel_spectrogram)
Traceback (most recent call last):
  File "/home/yjm/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1864, in _create_c_op
    c_op = c_api.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: slice index 2 of dimension 0 out of bounds. for 'strided_slice_1' (op: 'StridedSlice') with input shapes: [2], [1], [1], [1] and with computed input tensors: input[1] = <2>, input[2] = <3>, input[3] = <1>.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/yjm/SpecAugment/SpecAugment/spec_augment_tensorflow.py", line 165, in spec_augment
    warped_mel_spectrogram = sparse_warp(mel_spectrogram)
  File "/home/yjm/SpecAugment/SpecAugment/spec_augment_tensorflow.py", line 63, in sparse_warp
    n, v = fbank_size[1], fbank_size[2]
  File "/home/yjm/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 680, in _slice_helper
    name=name)
  File "/home/yjm/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 846, in strided_slice
    shrink_axis_mask=shrink_axis_mask)
  File "/home/yjm/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 9989, in strided_slice
    shrink_axis_mask=shrink_axis_mask, name=name)
  File "/home/yjm/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/home/yjm/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/home/yjm/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3616, in create_op
    op_def=op_def)
  File "/home/yjm/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2027, in __init__
    control_input_ops)
  File "/home/yjm/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1867, in _create_c_op
    raise ValueError(str(e))
ValueError: slice index 2 of dimension 0 out of bounds. for 'strided_slice_1' (op: 'StridedSlice') with input shapes: [2], [1], [1], [1] and with computed input tensors: input[1] = <2>, input[2] = <3>, input[3] = <1>.

AASHISHAG commented 5 years ago

@kimchi88 Did you run into this issue? Please help with this error.

@KiAlexander : Have you resolved the issue?

AASHISHAG commented 5 years ago

@jybaek @edwardyoon @DemisEom: Hello, please help with the issue below.

I am trying to augment wav files using the code below, and I get this error when spec_augment_tensorflow.spec_augment(mel_spectrogram=mel_spectrogram) is called.

I have tried numerous TensorFlow versions, such as 2.0.0, 1.15.0, 1.14.0, and 1.13.1. Please advise.

import os
import glob
import scipy.io.wavfile
import librosa
import numpy as np
from specAugment import spec_augment_tensorflow

mozilla_augmented = '/mozilla_augmented/clips/*.wav'
entries = []
for audio_path in glob.iglob(mozilla_augmented):
    entries.append(os.path.basename(audio_path))

mozilla = '/mozilla/clips/*.wav'
for audio_path in glob.iglob(mozilla):
    print(audio_path)
    if os.path.basename(audio_path) not in entries:
        audio, sampling_rate = librosa.load(audio_path)
        mel_spectrogram = librosa.feature.melspectrogram(
            y=audio,
            sr=sampling_rate,
            n_mels=256,
            hop_length=128,
            fmax=8000)
        warped_masked_spectrogram = spec_augment_tensorflow.spec_augment(mel_spectrogram=mel_spectrogram)
        wav = librosa.feature.inverse.mel_to_audio(M=warped_masked_spectrogram, hop_length=128, sr=sampling_rate)
        wav *= 32767 / max(0.01, np.max(np.abs(wav)))
        scipy.io.wavfile.write(audio_path, 16000, wav.astype(np.int16))
/home/LTLab.lan/agarwal/german-speech-corpus/mozilla/clips/de33d2bc4b691b6bc066024f2918c4f3270cdbcb98cc8f49404534378617760f88518f80a4c3a88ac3290642bffc2e2db4194adf928a1c7da61354c2dff1545a.wav
2019-11-29 14:18:41.240183: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2019-11-29 14:18:41.305812: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: Quadro RTX 6000 major: 7 minor: 5 memoryClockRate(GHz): 1.77
pciBusID: 0000:18:00.0
2019-11-29 14:18:41.306187: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2019-11-29 14:18:41.307756: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2019-11-29 14:18:41.309258: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2019-11-29 14:18:41.309662: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2019-11-29 14:18:41.311416: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2019-11-29 14:18:41.312657: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2019-11-29 14:18:41.316551: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2019-11-29 14:18:41.319866: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-11-29 14:18:41.320310: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2019-11-29 14:18:41.338433: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2199975000 Hz
2019-11-29 14:18:41.341381: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x774fbd0 executing computations on platform Host. Devices:
2019-11-29 14:18:41.341407: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): Host, Default Version
2019-11-29 14:18:41.479945: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x77b2db0 executing computations on platform CUDA. Devices:
2019-11-29 14:18:41.480011: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): Quadro RTX 6000, Compute Capability 7.5
2019-11-29 14:18:41.482012: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: Quadro RTX 6000 major: 7 minor: 5 memoryClockRate(GHz): 1.77
pciBusID: 0000:18:00.0
2019-11-29 14:18:41.482089: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2019-11-29 14:18:41.482109: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2019-11-29 14:18:41.482126: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2019-11-29 14:18:41.482144: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2019-11-29 14:18:41.482161: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2019-11-29 14:18:41.482178: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2019-11-29 14:18:41.482196: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2019-11-29 14:18:41.485245: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-11-29 14:18:41.485289: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2019-11-29 14:18:41.487635: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-11-29 14:18:41.487649: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0
2019-11-29 14:18:41.487658: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N
2019-11-29 14:18:41.490847: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 22817 MB memory) -> physical GPU (device: 0, name: Quadro RTX 6000, pci bus id: 0000:18:00.0, compute capability: 7.5)
2019-11-29 14:18:42.512464: W tensorflow/core/framework/op_kernel.cc:1622] OP_REQUIRES failed at strided_slice_op.cc:108 : Invalid argument: slice index 2 of dimension 0 out of bounds.
Traceback (most recent call last):
  File "<stdin>", line 10, in <module>
  File "/media/data/LTLab.lan/agarwal/speech-data-augmentation/spec_augment_tensorflow/specAugment/spec_augment_tensorflow.py", line 166, in spec_augment
    warped_mel_spectrogram = sparse_warp(mel_spectrogram)
  File "/media/data/LTLab.lan/agarwal/speech-data-augmentation/spec_augment_tensorflow/specAugment/spec_augment_tensorflow.py", line 64, in sparse_warp
    n, v = fbank_size[1], fbank_size[2]
  File "/media/data/LTLab.lan/agarwal/python-environments/letsee2/lib/python3.5/site-packages/tensorflow_core/python/ops/array_ops.py", line 813, in _slice_helper
    name=name)
  File "/media/data/LTLab.lan/agarwal/python-environments/letsee2/lib/python3.5/site-packages/tensorflow_core/python/ops/array_ops.py", line 979, in strided_slice
    shrink_axis_mask=shrink_axis_mask)
  File "/media/data/LTLab.lan/agarwal/python-environments/letsee2/lib/python3.5/site-packages/tensorflow_core/python/ops/gen_array_ops.py", line 10372, in strided_slice
    _six.raise_from(_core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InvalidArgumentError: slice index 2 of dimension 0 out of bounds. [Op:StridedSlice] name: strided_slice/

AASHISHAG commented 5 years ago

@jybaek @edwardyoon @DemisEom : Please advise on the above issue.

krupalraj commented 4 years ago

Reshape the spectrogram to [batch_size, time, frequency, 1] before calling spec_augment:

shape = mel_spectrogram.shape
mel_spectrogram = np.reshape(mel_spectrogram, (-1, shape[0], shape[1], 1))
warped_masked_spectrogram = spec_augment_tensorflow.spec_augment(mel_spectrogram=mel_spectrogram)
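
A minimal sketch of why this reshape probably helps: the traceback in the original report fails at `n, v = fbank_size[1], fbank_size[2]` inside `sparse_warp`, which suggests the function expects a 4-D input. A 2-D array, as returned by `librosa.feature.melspectrogram`, has a shape with only two entries, so index 2 is out of bounds. The example uses a zero-filled NumPy array as a stand-in for a real spectrogram (the frame count of 100 is arbitrary), and the helper name `add_batch_and_channel_axes` is hypothetical, not part of the repository.

```python
import numpy as np


def add_batch_and_channel_axes(mel_spectrogram):
    """Reshape a 2-D spectrogram to 4-D: leading batch axis, trailing channel axis.

    Hypothetical helper that mirrors the np.reshape call suggested in this thread.
    """
    if mel_spectrogram.ndim != 2:
        raise ValueError("expected a 2-D array, got shape {}".format(mel_spectrogram.shape))
    rows, cols = mel_spectrogram.shape
    return np.reshape(mel_spectrogram, (-1, rows, cols, 1))


# Stand-in for a mel spectrogram with n_mels=256 and an assumed 100 frames.
mel_2d = np.zeros((256, 100), dtype=np.float32)
print(len(mel_2d.shape))  # 2: there is no shape[2], hence "slice index 2 ... out of bounds"

mel_4d = add_batch_and_channel_axes(mel_2d)
print(mel_4d.shape)       # (1, 256, 100, 1): shape[1] and shape[2] are now valid
```

Note that `np.reshape` only adds axes here and does not swap them, so the 4-D array keeps the row/column order of the 2-D input.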

huiMM commented 4 years ago

@krupalraj, how can I convert the warped_masked_spectrogram to a .wav file if I reshape before calling spec_augment?

@AASHISHAG, have you managed to regenerate the wav file successfully?

Aishaj commented 2 years ago

Hi,

I wonder if anyone has managed to regenerate the wav file from the warped_masked_spectrogram. I tried the code below:

warped_masked_spectrogram = warped_masked_spectrogram.numpy()
warped_masked_spectrogram = warped_masked_spectrogram.reshape(
    (warped_masked_spectrogram.shape[1], warped_masked_spectrogram.shape[2]))
warped_masked_spectrogram = tf.transpose(warped_masked_spectrogram)
warped_masked_spectrogram = warped_masked_spectrogram.numpy().reshape(1,-1)

for spec in warped_masked_spectrogram:
    wav = librosa.feature.inverse.mel_to_audio(spec)
    for feature in model(wav)[1]:
        aug_samples.append(feature)

....

However, I'm getting the error below:

0%| | 0/91 [02:32<?, ?it/s]
Traceback (most recent call last):
  File "optuna.py", line 638, in <module>
    main(sys.argv[1:])
  File "optuna.py", line 553, in main
    X_train, y_train , X_test , y_test = create_dataset(path)
  File "optuna.py", line 76, in create_dataset
    extract_features(wav, cls, model, samples , labels , aug_samples , aug_labels )
  File "optuna.py", line 512, in extract_features
    wav=librosa.feature.inverse.mel_to_audio(spec)
  File "C:\Users\ash_j\anaconda3\envs\yamnet\lib\site-packages\librosa\feature\inverse.py", line 183, in mel_to_audio
    pad_mode=pad_mode,
  File "C:\Users\ash_j\anaconda3\envs\yamnet\lib\site-packages\librosa\core\spectrum.py", line 2404, in griffinlim
    length=length,
  File "C:\Users\ash_j\anaconda3\envs\yamnet\lib\site-packages\librosa\core\spectrum.py", line 381, in istft
    n_frames = stft_matrix.shape[1]
IndexError: tuple index out of range
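
One possible cause, sketched under assumptions: `reshape(1, -1)` flattens the spectrogram into a single row, so each `spec` in the loop is 1-D, and `istft` then fails on `stft_matrix.shape[1]`. `librosa.feature.inverse.mel_to_audio` expects a 2-D array of shape `(n_mels, frames)`. The sketch below assumes the augmented output keeps the 4-D `(1, n_mels, frames, 1)` layout produced by the reshape suggested earlier in this thread and has already been converted with `.numpy()`. The helper `to_mel_2d` is hypothetical, and a zero-filled array stands in for the real output.

```python
import numpy as np


def to_mel_2d(augmented):
    """Drop the batch and channel axes of a (1, n_mels, frames, 1) array.

    Hypothetical helper; the 4-D layout is an assumption about the augmented output.
    """
    augmented = np.asarray(augmented)
    if augmented.ndim != 4 or augmented.shape[0] != 1 or augmented.shape[-1] != 1:
        raise ValueError("expected shape (1, n_mels, frames, 1), got {}".format(augmented.shape))
    return augmented[0, :, :, 0]


# Stand-in for the augmented spectrogram (256 mel bands, an assumed 100 frames).
augmented = np.zeros((1, 256, 100, 1), dtype=np.float32)

flattened = augmented.reshape(1, -1)  # what the loop in the question iterates over
print(flattened[0].ndim)              # 1: a 1-D array has no shape[1]

mel_2d = to_mel_2d(augmented)
print(mel_2d.shape)                   # (256, 100)
```

The resulting `mel_2d` could then be passed as `librosa.feature.inverse.mel_to_audio(M=mel_2d, sr=sampling_rate, hop_length=128)`, with `sr` and `hop_length` matching the values used to compute the spectrogram.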