Open-Speech-EkStep / ULCA-asr-dataset-corpus

Creative Commons Attribution 4.0 International

File format issue in Malayalam dataset #1

Open kavyamanohar opened 2 years ago

kavyamanohar commented 2 years ago

The Malayalam audio files in the dd_malayalam, joshtalks and The_Cue categories cannot be played or processed because of a file format problem.

Running the soxi command on one of these files gives the following error:

soxi FAIL formats: can't open input file `175_2166file-idj5GxKHFgrNs.wav': WAVE: RIFF header not found
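
To see how many files are affected, one option is to check the first bytes of each file directly. The following is only a minimal sketch: it assumes a valid file starts with the standard 12-byte `RIFF....WAVE` header, and the function names and the dataset directory are hypothetical, not part of this repository. It does not say what format the failing files actually are.

```python
from pathlib import Path


def has_riff_wave_header(path):
    """Return True if the file begins with a RIFF container header tagged WAVE."""
    with open(path, "rb") as f:
        header = f.read(12)
    # Bytes 0-3 are the chunk ID "RIFF", bytes 4-7 the chunk size,
    # and bytes 8-11 the form type "WAVE".
    return len(header) == 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE"


def find_bad_wavs(root):
    """Return the .wav files under root that lack a RIFF/WAVE header, sorted by path."""
    return sorted(p for p in Path(root).rglob("*.wav") if not has_riff_wave_header(p))
```

For example, `find_bad_wavs("dd_malayalam")` (directory name assumed) would list the files that soxi is likely to reject with the same "RIFF header not found" message.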