awni / speech

A PyTorch Implementation of End-to-End Models for Speech-to-Text
Apache License 2.0

TypeError: 'float' object cannot be interpreted as an index #32

Closed: arattari closed this issue 6 years ago

arattari commented 6 years ago

I'm trying to run the Seq2Seq model on the LibriSpeech corpus. I copied the config file for the TIMIT data and pointed it at LibriSpeech. As soon as training starts, it fails with:

(py27) [10:54 user@host:speech$] python train.py examples/librispeech/seq2seq_best.config
Traceback (most recent call last):
  File "train.py", line 145, in <module>
    run(config)
  File "train.py", line 80, in run
    start_and_end=data_cfg["start_and_end"])
  File "/people/user/speech/speech/loader.py", line 35, in __init__
    self.mean, self.std = compute_mean_std(audio_files[:max_samples])
  File "/people/user/speech/speech/loader.py", line 81, in compute_mean_std
    for af in audio_files]
  File "/people/user/speech/speech/loader.py", line 154, in log_specgram_from_file
    return log_specgram(audio, sr)
  File "/people/user/speech/speech/loader.py", line 165, in log_specgram
    detrend=False)
  File "/people/user/.conda/envs/py27/lib/python2.7/site-packages/scipy/signal/spectral.py", line 691, in spectrogram
    input_length=x.shape[axis])
  File "/people/user/.conda/envs/py27/lib/python2.7/site-packages/scipy/signal/spectral.py", line 1775, in _triage_segments
    win = get_window(window, nperseg)
  File "/people/user/.conda/envs/py27/lib/python2.7/site-packages/scipy/signal/windows/windows.py", line 2106, in get_window
    return winfunc(*params)
  File "/people/user/.conda/envs/py27/lib/python2.7/site-packages/scipy/signal/windows/windows.py", line 786, in hann
    return general_hamming(M, 0.5, sym)
  File "/people/user/.conda/envs/py27/lib/python2.7/site-packages/scipy/signal/windows/windows.py", line 1016, in general_hamming
    return general_cosine(M, [alpha, 1. - alpha], sym)
  File "/people/user/.conda/envs/py27/lib/python2.7/site-packages/scipy/signal/windows/windows.py", line 116, in general_cosine
    w = np.zeros(M)
TypeError: 'float' object cannot be interpreted as an index
(py27) [10:56 user@host:speech$] 

Any ideas, @awni?

arattari commented 6 years ago

Found it: two bugs in log_specgram in loader.py. nperseg and noverlap are passed to scipy.signal.spectrogram as floats, and both should be cast directly to int.
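
For anyone hitting the same error, here is a rough sketch of what the fix might look like. The traceback ends in np.zeros(M), which needs an integer length, so a float nperseg fails there. The parameter names and defaults (window_size, step_size, eps) and the millisecond-to-sample arithmetic are my assumptions for illustration, not necessarily what loader.py contains. Only the 'hann' window, detrend=False, and the nperseg/noverlap arguments come from the traceback above.

```python
import numpy as np
import scipy.signal


def log_specgram(audio, sample_rate, window_size=20, step_size=10, eps=1e-10):
    """Log spectrogram with integer segment sizes.

    window_size and step_size are hypothetical parameters in milliseconds.
    """
    # The division yields a float, so cast explicitly: scipy builds the
    # window with np.zeros(M), which requires an integer length.
    nperseg = int(window_size * sample_rate / 1e3)
    noverlap = int(step_size * sample_rate / 1e3)
    _, _, spec = scipy.signal.spectrogram(audio,
                                          fs=sample_rate,
                                          window='hann',
                                          nperseg=nperseg,
                                          noverlap=noverlap,
                                          detrend=False)
    # Transpose to (time, frequency); eps keeps log() finite on silent frames.
    return np.log(spec.T.astype(np.float32) + eps)
```

With 16 kHz audio these assumed defaults give nperseg = 320 and noverlap = 160. The explicit int() cast behaves the same on Python 2 and Python 3, so it is a safer fix than relying on integer division.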