mdeff / fma

FMA: A Dataset For Music Analysis
https://arxiv.org/abs/1612.01840
MIT License

Fix baselines.py #43

Closed JustinKavalan closed 3 years ago

JustinKavalan commented 4 years ago

This pull request addresses issue #15 (which was related to corrupt files / time data mismatch) by discarding corrupt files as they are loaded.

Any help testing master, as well as code revisions or suggestions, is appreciated.

JustinKavalan commented 4 years ago

From the old pr: https://github.com/mdeff/fma/pull/42

Thanks for the PR! It was a poor decision of mine to mix up the code for the creation of a hypothetical next release and the maintenance of the usage code and doc for the latest released version. I've now corrected this mistake and made two branches: master works with the latest released data (i.e., rc1) and next contains code to prepare a next release. More at #41.

I've tried to rebase your modifications on top of master but made a wrong manipulation that closed the PR and removed my right to edit. :/

Can you reopen it, and base your changes on the current master? It should then work with the publicly available rc1 data. :)

I went ahead and rebased the commits, but I won't be able to get around to testing it or cleaning up unnecessary changes in baselines.py until this weekend. In the meantime, let me know if there are any comments that need to be addressed.

mdeff commented 4 years ago

BTW, are the corrupted files part of the list of files shorter than 30s?

JustinKavalan commented 4 years ago

Yeah, once utils.py is working, baselines.py is redundant (although I do suggest enabling autoreload, which lets you run the import cell again if something goes wrong on an import). I reverted baselines.py to the one in mdeff:master and am running it, but it seems to be erroring on too many files (see output below). And yes, most if not all of the files I saw earlier were the ones shorter than 30s, though there might have been a couple of others too. Most of the ones below aren't those files, though.

Also thanks for handling cleaning the code in utils.py, I've been meaning to work on that for the past few weekends but haven't found the time.

Output
```
Ignoring /Users/justin/Downloads/fma_small/090/090693.mp3 (error: Command '['ffmpeg', '-i', '/Users/justin/Downloads/fma_small/090/090693.mp3', '-f', 's16le', '-acodec', 'pcm_s16le', '-ac', '1', '-ar', '22050', '-']' returned non-zero exit status 1.).
Ignoring /Users/justin/Downloads/fma_small/107/107745.mp3 (error: Command '['ffmpeg', '-i', '/Users/justin/Downloads/fma_small/107/107745.mp3', '-f', 's16le', '-acodec', 'pcm_s16le', '-ac', '1', '-ar', '22050', '-']' returned non-zero exit status 1.).
Ignoring /Users/justin/Downloads/fma_small/031/031160.mp3 (error: Command '['ffmpeg', '-i', '/Users/justin/Downloads/fma_small/031/031160.mp3', '-f', 's16le', '-acodec', 'pcm_s16le', '-ac', '1', '-ar', '22050', '-']' returned non-zero exit status 1.).
Ignoring /Users/justin/Downloads/fma_small/029/029680.mp3 (error: Command '['ffmpeg', '-i', '/Users/justin/Downloads/fma_small/029/029680.mp3', '-f', 's16le', '-acodec', 'pcm_s16le', '-ac', '1', '-ar', '22050', '-']' returned non-zero exit status 1.).
Ignoring /Users/justin/Downloads/fma_small/079/079571.mp3 (error: Command '['ffmpeg', '-i', '/Users/justin/Downloads/fma_small/079/079571.mp3', '-f', 's16le', '-acodec', 'pcm_s16le', '-ac', '1', '-ar', '22050', '-']' returned non-zero exit status 1.).
Ignoring /Users/justin/Downloads/fma_small/085/085402.mp3 (error: Command '['ffmpeg', '-i', '/Users/justin/Downloads/fma_small/085/085402.mp3', '-f', 's16le', '-acodec', 'pcm_s16le', '-ac', '1', '-ar', '22050', '-']' returned non-zero exit status 1.).
Epoch 1/3
Ignoring /Users/justin/Downloads/fma_small/118/118733.mp3 (error: Command '['ffmpeg', '-i', '/Users/justin/Downloads/fma_small/118/118733.mp3', '-f', 's16le', '-acodec', 'pcm_s16le', '-ac', '1', '-ar', '22050', '-']' returned non-zero exit status 1.).
Ignoring /Users/justin/Downloads/fma_small/063/063468.mp3 (error: Command '['ffmpeg', '-i', '/Users/justin/Downloads/fma_small/063/063468.mp3', '-f', 's16le', '-acodec', 'pcm_s16le', '-ac', '1', '-ar', '22050', '-']' returned non-zero exit status 1.).
Ignoring /Users/justin/Downloads/fma_small/090/090800.mp3 (error: Command '['ffmpeg', '-i', '/Users/justin/Downloads/fma_small/090/090800.mp3', '-f', 's16le', '-acodec', 'pcm_s16le', '-ac', '1', '-ar', '22050', '-']' returned non-zero exit status 1.).
Ignoring /Users/justin/Downloads/fma_small/021/021131.mp3 (error: Command '['ffmpeg', '-i', '/Users/justin/Downloads/fma_small/021/021131.mp3', '-f', 's16le', '-acodec', 'pcm_s16le', '-ac', '1', '-ar', '22050', '-']' returned non-zero exit status 1.).
Ignoring /Users/justin/Downloads/fma_small/087/087969.mp3 (error: Command '['ffmpeg', '-i', '/Users/justin/Downloads/fma_small/087/087969.mp3', '-f', 's16le', '-acodec', 'pcm_s16le', '-ac', '1', '-ar', '22050', '-']' returned non-zero exit status 1.).
Ignoring /Users/justin/Downloads/fma_small/114/114657.mp3 (error: Command '['ffmpeg', '-i', '/Users/justin/Downloads/fma_small/114/114657.mp3', '-f', 's16le', '-acodec', 'pcm_s16le', '-ac', '1', '-ar', '22050', '-']' returned non-zero exit status 1.).
Ignoring /Users/justin/Downloads/fma_small/056/056842.mp3 (error: Command '['ffmpeg', '-i', '/Users/justin/Downloads/fma_small/056/056842.mp3', '-f', 's16le', '-acodec', 'pcm_s16le', '-ac', '1', '-ar', '22050', '-']' returned non-zero exit status 1.).
Ignoring /Users/justin/Downloads/fma_small/107/107854.mp3 (error: Command '['ffmpeg', '-i', '/Users/justin/Downloads/fma_small/107/107854.mp3', '-f', 's16le', '-acodec', 'pcm_s16le', '-ac', '1', '-ar', '22050', '-']' returned non-zero exit status 1.).
Ignoring /Users/justin/Downloads/fma_small/098/098968.mp3 (error: Command '['ffmpeg', '-i', '/Users/justin/Downloads/fma_small/098/098968.mp3', '-f', 's16le', '-acodec', 'pcm_s16le', '-ac', '1', '-ar', '22050', '-']' returned non-zero exit status 1.).
Ignoring /Users/justin/Downloads/fma_small/117/117462.mp3 (error: Command '['ffmpeg', '-i', '/Users/justin/Downloads/fma_small/117/117462.mp3', '-f', 's16le', '-acodec', 'pcm_s16le', '-ac', '1', '-ar', '22050', '-']' returned non-zero exit status 1.).
Ignoring /Users/justin/Downloads/fma_small/139/139486.mp3 (error: Command '['ffmpeg', '-i', '/Users/justin/Downloads/fma_small/139/139486.mp3', '-f', 's16le', '-acodec', 'pcm_s16le', '-ac', '1', '-ar', '22050', '-']' returned non-zero exit status 1.).
```

JustinKavalan commented 4 years ago

Just figured out my environment. It appears to be working to some extent, but the behavior is different than before. It also does not seem to be loading samples during the epochs after an initial batch of epochs. I'm expecting the epoch dialog to give more information regarding progress, which it currently does not.

I don't think the code is loading the samples properly; it also errors for many files that aren't in the list of short files. Is it working on your machine? This all might be because my machine doesn't have a GPU, so I had to use tensorflow==1.1.0.

mdeff commented 4 years ago

Thanks for testing. I didn't test myself (I was only editing on my laptop). The change I made in 09481c3226110754f95d7620819e232e3b3c10dc should not impact which files are skipped (exceptions are caught as you did). I wanted to avoid reallocating self.X and self.Y with np.delete. Memory is allocated in __init__, and __next__ fills the arrays (overwriting past data).
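
The buffer reuse described above might look roughly like the following sketch. The class name `BatchLoader`, the `load` callable, and the toy shapes are assumptions for illustration; this is not the code from the commit.

```python
import numpy as np


class BatchLoader:
    """Hypothetical iterator that fills preallocated batch buffers."""

    def __init__(self, track_ids, labels, load, shape, batch_size=4):
        self.track_ids = list(track_ids)
        self.labels = labels          # dict: track ID -> label vector
        self.load = load              # callable: track ID -> array of `shape`
        self.batch_size = batch_size
        self.position = 0
        n_labels = len(next(iter(labels.values())))
        # Allocate once; __next__ overwrites these buffers on every call.
        self.X = np.empty((batch_size, *shape))
        self.Y = np.empty((batch_size, n_labels))

    def __iter__(self):
        return self

    def __next__(self):
        filled = 0
        while filled < self.batch_size and self.position < len(self.track_ids):
            tid = self.track_ids[self.position]
            self.position += 1
            try:
                # Raises if loading fails or the shape does not fit the buffer.
                self.X[filled] = self.load(tid)
                self.Y[filled] = self.labels[tid]
                filled += 1
            except Exception as e:
                print('Ignoring {} (error: {}).'.format(tid, e))
        if filled == 0:
            raise StopIteration
        # Slices are views of the valid rows; nothing is reallocated.
        return self.X[:filled], self.Y[:filled]
```

A failed track simply leaves its slot to be overwritten by the next successful one, so no row ever has to be removed with np.delete.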

It also does not seem to be loading samples during the epochs after an initial batch of epochs.

What do you mean?

Looking at FfmpegLoader, I think it could error for files with a sampling rate different from 44.1 kHz.

JustinKavalan commented 4 years ago

Hm yeah I definitely agree with your changes, and I also am not sure where some of this behavior is coming from.

Basically, there'll be a batch of errors at the beginning and then that would be the entirety of the output. For example, I let baselines.py run overnight and what I attached above is ALL of the output from the night.

Although I'm also running this on my laptop (I have since shut down my gcp instance) so it might be related to the environment. I'll run my old code that I know works and compare the behavior tonight to see if it's something related to that.

mdeff commented 4 years ago

Weird. Thanks. Maybe there's something we miss from your old code.

JustinKavalan commented 4 years ago

Okay there were some idiosyncrasies stemming from my python version (tensorflow==1.0.1 is not available in python 3.6) and so I downgraded to python 3.5 and it seems to be working properly now. I had to change the print statement to support python3.5.

Also, the issue with the FFmpeg errors was caused because I forgot to set this line subset = tracks.index[tracks['set', 'subset'] <= 'medium'] to <= 'small'. Perhaps this should be documented a little bit more?
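
To illustrate why that one comparison matters, here is a minimal sketch. It assumes the subset column is an ordered categorical (small < medium < large); the toy table and track IDs are made up for the example.

```python
import pandas as pd

SUBSETS = ('small', 'medium', 'large')

# Toy stand-in for the tracks table; the real one has many more columns.
subset = pd.Categorical(['small', 'medium', 'large', 'small'],
                        categories=SUBSETS, ordered=True)
tracks = pd.DataFrame({('set', 'subset'): subset},
                      index=pd.Index([2, 5, 10, 140], name='track_id'))

# With an ordered categorical, <= selects that subset and all smaller ones.
small = tracks.index[tracks['set', 'subset'] <= 'small']
medium = tracks.index[tracks['set', 'subset'] <= 'medium']

print(small.tolist())   # [2, 140]
print(medium.tolist())  # [2, 5, 140]
```

With only fma_small downloaded, every ID that is in `medium` but not in `small` (track 5 in this toy table) would point at a file that does not exist.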

I also remembered that we toyed around with the architecture a bit in baselines.py and got the accuracy up to 35%. Should I commit those changes as well?

mdeff commented 4 years ago

TensorFlow 1.0.1 has a manylinux1 wheel for python 3.6. And it should work, as I developed this code with python 3.6. Or maybe you're not using linux? Anyway I'm fine with a 3.5-compatible print.

Also, the issue with the FFmpeg errors was caused because I forgot to set this line subset = tracks.index[tracks['set', 'subset'] <= 'medium'] to <= 'small'. Perhaps this should be documented a little bit more?

Do you mean that there are no errors with small? Are the errors with medium due to you having only downloaded fma_small.zip, hence missing many tracks from medium?

I also remembered that we toyed around with the architecture a bit in baselines.py and got the accuracy up to 35%. Should I commit those changes as well?

Please do as long as it's a simple architecture (baselines.ipynb is supposed to be a demo). But in a separate PR. ;)

JustinKavalan commented 4 years ago

And it should work, as I developed this code with python 3.6. Or maybe you're not using linux? Anyway I'm fine with a 3.5-compatible print.

Ah yeah I'm not on Linux

Do you mean that there are no errors with small? Are the errors with medium due to you having only downloaded fma_small.zip, hence missing many tracks from medium?

There are errors with small, but most of them are just the ones from the "list of files shorter than 30s". But also yes, the majority of the errors I was getting when this was set to medium are the tracks that are in medium but not in small.
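
One way to confirm that the medium-only errors come from files that were never downloaded is to check which expected paths are absent on disk. A minimal sketch, assuming the directory layout visible in the log paths above (six-digit zero-padded track ID, grouped by its first three digits); `audio_path` and `missing_tracks` are hypothetical helper names:

```python
import os


def audio_path(audio_dir, track_id):
    """Build the expected mp3 path, e.g. <audio_dir>/090/090693.mp3."""
    tid = '{:06d}'.format(track_id)
    return os.path.join(audio_dir, tid[:3], tid + '.mp3')


def missing_tracks(audio_dir, track_ids):
    """Return the track IDs whose mp3 file is not present on disk."""
    return [t for t in track_ids
            if not os.path.isfile(audio_path(audio_dir, t))]
```

Running `missing_tracks` on the selected subset before training separates "file not downloaded" from "file present but undecodable".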

For reference, here is the output when I run it for two epochs on fma_small:

fma_small output
```
Epoch 1/2
304/6400 [>.............................] - ETA: 819s - loss: 14.3097 - acc: 0.0987
Ignoring /Users/justin/Downloads/fma_small/098/098567.mp3 (error: could not broadcast input array from shape (13,44) into shape (13,2582)).
415/6400 [>.............................] - ETA: 796s - loss: 14.4439 - acc: 0.0940
Ignoring /Users/justin/Downloads/fma_small/017/017631.mp3 (error: could not broadcast input array from shape (13,2581) into shape (13,2582)).
702/6400 [==>...........................] - ETA: 750s - loss: 14.1640 - acc: 0.1154
Ignoring /Users/justin/Downloads/fma_small/017/017637.mp3 (error: could not broadcast input array from shape (13,2581) into shape (13,2582)).
1821/6400 [=======>......................] - ETA: 609s - loss: 14.2230 - acc: 0.1153
Ignoring /Users/justin/Downloads/fma_small/017/017635.mp3 (error: could not broadcast input array from shape (13,2581) into shape (13,2582)).
2524/6400 [==========>...................] - ETA: 521s - loss: 14.2272 - acc: 0.1157
Ignoring /Users/justin/Downloads/fma_small/017/017636.mp3 (error: could not broadcast input array from shape (13,2581) into shape (13,2582)).
2587/6400 [===========>..................] - ETA: 513s - loss: 14.2172 - acc: 0.1164
Ignoring /Users/justin/Downloads/fma_small/054/054576.mp3 (error: could not broadcast input array from shape (13,2581) into shape (13,2582)).
3082/6400 [=============>................] - ETA: 447s - loss: 14.1930 - acc: 0.1181
Ignoring /Users/justin/Downloads/fma_small/054/054578.mp3 (error: could not broadcast input array from shape (13,2581) into shape (13,2582)).
3849/6400 [=================>............] - ETA: 346s - loss: 14.1453 - acc: 0.1213
Ignoring /Users/justin/Downloads/fma_small/099/099134.mp3 (error: Command '['ffmpeg', '-i', '/Users/justin/Downloads/fma_small/099/099134.mp3', '-f', 's16le', '-acodec', 'pcm_s16le', '-ac', '1', '-ar', '22050', '-']' returned non-zero exit status 1).
4376/6400 [===================>..........] - ETA: 274s - loss: 14.1435 - acc: 0.1216
Ignoring /Users/justin/Downloads/fma_small/098/098569.mp3 (error: could not broadcast input array from shape (13,132) into shape (13,2582)).
4759/6400 [=====================>........] - ETA: 222s - loss: 14.1330 - acc: 0.1223
Ignoring /Users/justin/Downloads/fma_small/055/055783.mp3 (error: could not broadcast input array from shape (13,2581) into shape (13,2582)).
4982/6400 [======================>.......] - ETA: 192s - loss: 14.1058 - acc: 0.1240
Ignoring /Users/justin/Downloads/fma_small/133/133297.mp3 (error: Command '['ffmpeg', '-i', '/Users/justin/Downloads/fma_small/133/133297.mp3', '-f', 's16le', '-acodec', 'pcm_s16le', '-ac', '1', '-ar', '22050', '-']' returned non-zero exit status 1).
5253/6400 [=======================>......] - ETA: 155s - loss: 14.1299 - acc: 0.1226
Ignoring /Users/justin/Downloads/fma_small/017/017634.mp3 (error: could not broadcast input array from shape (13,2581) into shape (13,2582)).
5620/6400 [=========================>....] - ETA: 105s - loss: 14.1450 - acc: 0.1217
Ignoring /Users/justin/Downloads/fma_small/098/098565.mp3 (error: could not broadcast input array from shape (13,139) into shape (13,2582)).
6051/6400 [===========================>..] - ETA: 47s - loss: 14.1017 - acc: 0.1244
Ignoring /Users/justin/Downloads/fma_small/108/108925.mp3 (error: Command '['ffmpeg', '-i', '/Users/justin/Downloads/fma_small/108/108925.mp3', '-f', 's16le', '-acodec', 'pcm_s16le', '-ac', '1', '-ar', '22050', '-']' returned non-zero exit status 1).
6402/6400 [==============================] - 868s - loss: 14.0917 - acc: 0.1250
/Users/justin/.pyenv/versions/3.5.9/lib/python3.5/site-packages/keras/engine/training.py:1573: UserWarning: Epoch comprised more than `samples_per_epoch` samples, which might affect learning results. Set `samples_per_epoch` correctly to avoid this warning.
  warnings.warn('Epoch comprised more than '
Epoch 2/2
80/6400 [..............................] - ETA: 808s - loss: 13.7004 - acc: 0.1500
Ignoring /Users/justin/Downloads/fma_small/108/108925.mp3 (error: Command '['ffmpeg', '-i', '/Users/justin/Downloads/fma_small/108/108925.mp3', '-f', 's16le', '-acodec', 'pcm_s16le', '-ac', '1', '-ar', '22050', '-']' returned non-zero exit status 1).
639/6400 [=>............................] - ETA: 927s - loss: 13.9268 - acc: 0.1362
Ignoring /Users/justin/Downloads/fma_small/017/017636.mp3 (error: could not broadcast input array from shape (13,2581) into shape (13,2582)).
654/6400 [==>...........................] - ETA: 926s - loss: 13.9277 - acc: 0.1361
Ignoring /Users/justin/Downloads/fma_small/055/055783.mp3 (error: could not broadcast input array from shape (13,2581) into shape (13,2582)).
1101/6400 [====>.........................] - ETA: 850s - loss: 14.1143 - acc: 0.1244
Ignoring /Users/justin/Downloads/fma_small/098/098565.mp3 (error: could not broadcast input array from shape (13,139) into shape (13,2582)).
1276/6400 [====>.........................] - ETA: 811s - loss: 14.0481 - acc: 0.1285
Ignoring /Users/justin/Downloads/fma_small/054/054578.mp3 (error: could not broadcast input array from shape (13,2581) into shape (13,2582)).
1291/6400 [=====>........................] - ETA: 809s - loss: 14.0222 - acc: 0.1301
Ignoring /Users/justin/Downloads/fma_small/098/098567.mp3 (error: could not broadcast input array from shape (13,44) into shape (13,2582)).
2362/6400 [==========>...................] - ETA: 611s - loss: 14.1059 - acc: 0.1249
Ignoring /Users/justin/Downloads/fma_small/098/098569.mp3 (error: could not broadcast input array from shape (13,132) into shape (13,2582)).
2809/6400 [============>.................] - ETA: 536s - loss: 14.1392 - acc: 0.1228
Ignoring /Users/justin/Downloads/fma_small/054/054576.mp3 (error: could not broadcast input array from shape (13,2581) into shape (13,2582)).
2888/6400 [============>.................] - ETA: 523s - loss: 14.1096 - acc: 0.1247
Ignoring /Users/justin/Downloads/fma_small/099/099134.mp3 (error: Command '['ffmpeg', '-i', '/Users/justin/Downloads/fma_small/099/099134.mp3', '-f', 's16le', '-acodec', 'pcm_s16le', '-ac', '1', '-ar', '22050', '-']' returned non-zero exit status 1).
2999/6400 [=============>................] - ETA: 504s - loss: 14.1087 - acc: 0.1247
Ignoring /Users/justin/Downloads/fma_small/017/017635.mp3 (error: could not broadcast input array from shape (13,2581) into shape (13,2582)).
4886/6400 [=====================>........] - ETA: 218s - loss: 14.1099 - acc: 0.1244
Ignoring /Users/justin/Downloads/fma_small/017/017631.mp3 (error: could not broadcast input array from shape (13,2581) into shape (13,2582)).
4933/6400 [======================>.......] - ETA: 212s - loss: 14.1127 - acc: 0.1243
Ignoring /Users/justin/Downloads/fma_small/133/133297.mp3 (error: Command '['ffmpeg', '-i', '/Users/justin/Downloads/fma_small/133/133297.mp3', '-f', 's16le', '-acodec', 'pcm_s16le', '-ac', '1', '-ar', '22050', '-']' returned non-zero exit status 1).
5124/6400 [=======================>......] - ETA: 184s - loss: 14.0805 - acc: 0.1263
Ignoring /Users/justin/Downloads/fma_small/017/017637.mp3 (error: could not broadcast input array from shape (13,2581) into shape (13,2582)).
5267/6400 [=======================>......] - ETA: 163s - loss: 14.0869 - acc: 0.1259
Ignoring /Users/justin/Downloads/fma_small/017/017634.mp3 (error: could not broadcast input array from shape (13,2581) into shape (13,2582)).
6370/6400 [============================>.] - ETA: 4s - loss: 14.1021 - acc: 0.1250
Ignoring /Users/justin/Downloads/fma_small/099/099134.mp3 (error: Command '['ffmpeg', '-i', '/Users/justin/Downloads/fma_small/099/099134.mp3', '-f', 's16le', '-acodec', 'pcm_s16le', '-ac', '1', '-ar', '22050', '-']' returned non-zero exit status 1).
6401/6400 [==============================] - 914s - loss: 14.0967 - acc: 0.1253
```

Please do as long as it's a simple architecture (baselines.ipynb is supposed to be a demo). But in a separate PR. ;)

Yeah it was mostly just a couple of fully connected layers added on the end, wasn't too bad at all.

As far as this goes, I think this is ready for a merge but I'll leave that with you as the repository owner.

mdeff commented 4 years ago

Thanks for providing the reported errors.

  1. Errors due to tracks known to be shorter than 30s:
    fma_small/099/099134.mp3 => Command '['ffmpeg', '-i', '/Users/justin/Downloads/fma_small/099/099134.mp3', '-f', 's16le', '-acodec', 'pcm_s16le', '-ac', '1', '-ar', '22050', '-']' returned non-zero exit status 1 => track of length 0s
    fma_small/108/108925.mp3 => Command '['ffmpeg', '-i', '/Users/justin/Downloads/fma_small/108/108925.mp3', '-f', 's16le', '-acodec', 'pcm_s16le', '-ac', '1', '-ar', '22050', '-']' returned non-zero exit status 1 => track of length 0s
    fma_small/133/133297.mp3 => Command '['ffmpeg', '-i', '/Users/justin/Downloads/fma_small/133/133297.mp3', '-f', 's16le', '-acodec', 'pcm_s16le', '-ac', '1', '-ar', '22050', '-']' returned non-zero exit status 1 => track of length 0s
    fma_small/098/098565.mp3 => could not broadcast input array from shape (13,139) into shape (13,2582) => track of length 1.6s
    fma_small/098/098567.mp3 => could not broadcast input array from shape (13,44) into shape (13,2582) => track of length 0.5s
    fma_small/098/098569.mp3 => could not broadcast input array from shape (13,132) into shape (13,2582) => track of length 1.5s
  2. Errors due to tracks falling one MFCC frame short:
    fma_small/017/017631.mp3 => could not broadcast input array from shape (13,2581) into shape (13,2582)
    fma_small/017/017634.mp3 => could not broadcast input array from shape (13,2581) into shape (13,2582)
    fma_small/017/017635.mp3 => could not broadcast input array from shape (13,2581) into shape (13,2582)
    fma_small/017/017636.mp3 => could not broadcast input array from shape (13,2581) into shape (13,2582)
    fma_small/017/017637.mp3 => could not broadcast input array from shape (13,2581) into shape (13,2582)
    fma_small/054/054576.mp3 => could not broadcast input array from shape (13,2581) into shape (13,2582)
    fma_small/054/054578.mp3 => could not broadcast input array from shape (13,2581) into shape (13,2582)
    fma_small/055/055783.mp3 => could not broadcast input array from shape (13,2581) into shape (13,2582)

The second kind of error could be eliminated by truncating slightly (i.e., allowing a safety margin). Do you by any chance have the errors returned by FfmpegLoader(sampling_rate=2000) (section 2.1) and FfmpegLoader(sampling_rate=16000) (section 2.2)? The current ones are for MfccLoader() (section 3.1).
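
The truncation idea could be sketched as follows. The target of 2580 frames and the helper name `crop_frames` are assumptions for illustration, not values from the repository; tracks that are genuinely too short still raise and can be skipped by the caller.

```python
import numpy as np


def crop_frames(features, n_frames):
    """Truncate a (n_coefficients, n_available) array to exactly n_frames columns."""
    available = features.shape[1]
    if available < n_frames:
        raise ValueError(
            'only {} frames available, need {}'.format(available, n_frames))
    return features[:, :n_frames]


# A target slightly below the nominal 2582 frames leaves a safety margin,
# so clips that come out one frame short still fit the buffer.
TARGET_FRAMES = 2580  # hypothetical margin of two frames

full = crop_frames(np.zeros((13, 2582)), TARGET_FRAMES)
one_short = crop_frames(np.zeros((13, 2581)), TARGET_FRAMES)
print(full.shape, one_short.shape)  # (13, 2580) (13, 2580)
```

The cost is dropping a couple of frames from every full-length clip, in exchange for a fixed input shape.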

mdeff commented 4 years ago

Yeah it was mostly just a couple of fully connected layers added on the end, wasn't too bad at all.

Looking forward to a separate PR then. :)