I'm not sure if this is the right call, but I have encountered issues with some samples when working on fma_small.
Reproduction:
import librosa
from tqdm import tqdm

corrupted_indices = []
for i, audio_id in tqdm(enumerate(train)):
    try:
        # Load audio file
        y, sr = librosa.load(get_audio_path(AUDIO_DIR, audio_id))
    except Exception:
        # Record the position of every sample that fails to decode
        print("There was a problem with", audio_id)
        corrupted_indices.append(i)
Here, the train variable holds the IDs of all fma_small samples labelled as "train".
For some samples, librosa.load fails:
y, sr = librosa.load(get_audio_path(AUDIO_DIR, 133297))
Produces:
LibsndfileError Traceback (most recent call last)
[/usr/local/lib/python3.10/dist-packages/librosa/core/audio.py](https://localhost:8080/#) in load(path, sr, mono, offset, duration, dtype, res_type)
174 try:
--> 175 y, sr_native = __soundfile_load(path, offset, duration, dtype)
176
7 frames
LibsndfileError: Error opening 'fma_small/133/133297.mp3': File does not exist or is not a regular file (possibly a pipe?).
During handling of the above exception, another exception occurred:
NoBackendError Traceback (most recent call last)
<decorator-gen-119> in __audioread_load(path, offset, duration, dtype)
[/usr/local/lib/python3.10/dist-packages/audioread/__init__.py](https://localhost:8080/#) in audio_open(path, backends)
130
131 # All backends failed!
--> 132 raise NoBackendError()
NoBackendError:
When I check my Colab session, I can see that the mp3 file is actually present at the specified location.
The downloaded file is surprisingly small, and trying to play it crashes my audio player.
The problem does not occur for most of the files.
The test and validation subsets are clean.
Problematic IDs that I have spotted:
133297, 108925, 99134
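Since the affected files are unusually small, a size check might flag candidates without decoding every track. The following is only a minimal sketch: the helper name find_undersized_files and the 1024-byte default threshold are my own assumptions, not part of the dataset tooling, and the threshold would need tuning against the real file sizes.

```python
import os


def find_undersized_files(paths, min_bytes=1024):
    """Return (path, size) pairs for files that are missing or smaller than min_bytes.

    The size is None when the file cannot be stat-ed at all.
    min_bytes is an assumed threshold, not a value taken from the dataset.
    """
    suspicious = []
    for path in paths:
        try:
            size = os.path.getsize(path)
        except OSError:
            # Missing or unreadable file: report it with no size
            suspicious.append((path, None))
            continue
        if size < min_bytes:
            suspicious.append((path, size))
    return suspicious
```

With the names from the reproduction above, this could be called as find_undersized_files(get_audio_path(AUDIO_DIR, audio_id) for audio_id in train). A small file is only a hint, so anything flagged would still need to be confirmed with librosa.load.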