srvk / eesen

The official repository of the Eesen project
http://arxiv.org/abs/1507.08240
Apache License 2.0

ERROR compute-fbank-feats:Read() #120

Closed star633669 closed 7 years ago

star633669 commented 7 years ago

I want to include LibriSpeech data in training. I copied wav.scp from the Kaldi toolkit after running prepare_data. How can I fix the following error?

ERROR (compute-fbank-feats:Read():wave-reader.cc:142) WaveData: can read only PCM data, audio_format is not 1: 65534

WARNING (compute-fbank-feats:Read():feat/wave-reader.h:149) Exception caught in WaveHolder object (reading).

WARNING (compute-fbank-feats:LoadCurrent():util/kaldi-table-inl.h:232) TableReader: failed to load object from 'flac -c -d -s /nfs/GPU/home/speech/kaldi-data/librispeech/LibriSpeech/train-clean-100/27/124992/27-124992-0010.flac |'

WARNING (compute-fbank-feats:Close():kaldi-io.cc:446) Pipe flac -c -d -s /nfs/GPU/home/speech/kaldi-data/librispeech/LibriSpeech/train-clean-100/27/124992/27-124992-0010.flac | had nonzero return status 13

I would appreciate any help. Thanks.

star633669 commented 7 years ago

This problem is resolved: I converted all the ".flac" files to ".wav" with ffmpeg. But now I get another error.

ERROR (compute-fbank-feats:Read():wave-reader.cc:198) Expected 168160 bytes in RIFF chunk, but after first data block there will be 38 + 162738 bytes (we do not support reading multiple data chunks).

WARNING (compute-fbank-feats:Read():feat/wave-reader.h:149) Exception caught in WaveHolder object (reading).

WARNING (compute-fbank-feats:LoadCurrent():util/kaldi-table-inl.h:232) TableReader: failed to load object from edu/1050630-1805/1050630-1805-011.wav

This is my wav.scp file:

1050114-1805-001 edu/1050114-1805/1050114-1805-001.wav
1050114-1805-002 edu/1050114-1805/1050114-1805-002.wav
1050114-1805-003 edu/1050114-1805/1050114-1805-003.wav
...

I have already run the utils/fix_data_dir.sh script:

fix_data_dir.sh: kept all 100 utterances.
fix_data_dir.sh: old files are kept in data/test/.backup

@riebling @fmetze

I would appreciate any help. Thanks.
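Both errors quoted in this thread come from the WAV header: the first complains about the audio_format field (1 is plain PCM, 65534 is the WAVE_FORMAT_EXTENSIBLE tag), the second about the chunk sizes not adding up to the RIFF size. A minimal sketch for inspecting a file is shown here, assuming a little-endian RIFF/WAVE layout; the function names `list_riff_chunks` and `describe` are hypothetical and not part of Kaldi or Eesen.

```python
import struct


def list_riff_chunks(data: bytes):
    """Return (format_tag, [(chunk_id, size), ...]) for RIFF/WAVE bytes.

    format_tag is the audio_format field of the 'fmt ' chunk,
    or None if no 'fmt ' chunk was found.
    """
    if len(data) < 12 or data[0:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    format_tag = None
    chunks = []
    pos = 12  # first chunk header follows 'RIFF' + size + 'WAVE'
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        chunks.append((chunk_id.decode("ascii", "replace"), size))
        if chunk_id == b"fmt " and pos + 10 <= len(data):
            # The first two bytes of the 'fmt ' body hold the format tag.
            (format_tag,) = struct.unpack_from("<H", data, pos + 8)
        # Chunk bodies are padded to an even number of bytes.
        pos += 8 + size + (size & 1)
    return format_tag, chunks


def describe(path: str) -> str:
    """Summarize one file on disk (illustrative helper)."""
    with open(path, "rb") as f:
        tag, chunks = list_riff_chunks(f.read())
    return f"{path}: audio_format={tag} chunks={chunks}"
```

Calling `describe()` on one of the failing files shows whether the format tag is something other than 1 and whether chunks other than `fmt ` and `data` are present.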

fmetze commented 7 years ago

I think that under some circumstances ffmpeg produces WAV files with an extra chunk that is essentially just a comment. Most software handles this fine, but Kaldi does not. The solution is to pass these files through sox, i.e. use sox to convert the incompatible WAV files to compatible WAV files, without any other changes.
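The sox pass suggested above could be scripted roughly as follows. This is only a sketch under stated assumptions: the helper names (`sox_rewrite_command`, `plan_conversions`, `run_plan`) and the output directory are hypothetical, sox is assumed to be on the PATH, and 16-bit signed PCM is assumed to be the desired output encoding. Entries in wav.scp that are piped commands (ending in `|`) are skipped.

```python
import posixpath
import subprocess


def sox_rewrite_command(src: str, dst: str) -> list:
    """Build a sox command that rewrites src as 16-bit signed PCM WAV.

    Assumption: 16-bit PCM is the target; the sample rate is left as-is.
    """
    return ["sox", src, "-t", "wav", "-e", "signed-integer", "-b", "16", dst]


def plan_conversions(scp_text: str, out_dir: str) -> list:
    """Map each plain-file wav.scp entry to (utt_id, sox argv)."""
    plan = []
    for line in scp_text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2 or parts[1].rstrip().endswith("|"):
            continue  # skip blank lines and piped-command entries
        utt_id, src = parts[0], parts[1].strip()
        dst = posixpath.join(out_dir, utt_id + ".wav")
        plan.append((utt_id, sox_rewrite_command(src, dst)))
    return plan


def run_plan(plan: list) -> None:
    """Run each sox command; raises CalledProcessError on failure."""
    for _utt_id, argv in plan:
        subprocess.run(argv, check=True)
```

After running the plan, wav.scp would need to point at the rewritten files in the chosen output directory (which must already exist) before feature extraction is rerun.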

star633669 commented 7 years ago

Thank you for your reply and good suggestions. This problem is resolved.