dpoljak closed this issue 3 years ago
Try deleting the whole dataset (or maybe just the year where you get the error) and starting a fresh download. That worked in my case (#25).
I have deleted the whole repo, redownloaded the repo, installed deps in a fresh conda environment. Redownloaded all the files today and am still hitting the same error messages :confused:
formats: can't open output file '../audios/voxpopuli_asr/transcribed_data/hr/2019/20190214-0900-PLENARY-hr_20190214-15:48:48_0.ogg': Invalid argument
From what I'm reading, it might be due to my SoX installation. I'll look into it and circle back here if and when I solve it. However, any and all pointers and suggestions are welcome :pray:
Okay, I solved this. After testing basic sox output for files, I found out that it breaks on the file name because it contains `:`, which isn't supported on NTFS partitions. Moving the data to an ext4 partition and running the code with that root solved the error.
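For anyone who can't move the data off NTFS, the colon problem described above can be checked for before running the pipeline. What follows is only a rough sketch, not part of the VoxPopuli code: `find_unsafe_chars` and `sanitize_filename` are hypothetical helper names, and the character set is the usual list of characters reserved in Windows file names. Renaming files this way is also an assumption on my part, since the download scripts may expect the original names.

```python
import re
from typing import List

# Characters reserved in Windows-style file names, plus ASCII control
# characters. Apply this to a bare file name, not a full path, because
# path separators are in the set too.
_UNSAFE = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def find_unsafe_chars(filename: str) -> List[str]:
    """Return the distinct reserved characters found in a file name."""
    return sorted(set(_UNSAFE.findall(filename)))


def sanitize_filename(filename: str, replacement: str = "-") -> str:
    """Replace every reserved character with `replacement` (hypothetical workaround)."""
    return _UNSAFE.sub(replacement, filename)


# File name taken from the error message in this thread.
name = "20190214-0900-PLENARY-hr_20190214-15:48:48_0.ogg"
print(find_unsafe_chars(name))
print(sanitize_filename(name))
```

If `find_unsafe_chars` returns anything for the output names, the target partition is the likely culprit rather than the SoX installation itself.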
Hello, I'm trying to access the Croatian transcribed ASR dataset, but I'm having trouble. Similarly to #25, I get a broken pipe error, but it is preceded by formats errors when opening the transcribed_data files.
Running on any language target, I receive the same errors; the following excerpt shows the line for English.
My environment is in Python 3.8.10 on Manjaro 21.0.7 with dependencies: