fastaudio / fastai_audio

[DEPRECATED] 🔊️ Audio with fastaiv1
MIT License

AudioList.from_df() filters out all files #66

Closed: maxliuverkada closed 4 years ago

maxliuverkada commented 4 years ago

When running audios = AudioList.from_df(df=df, path=PATH_AUDIO, cols='filename'), I get the following output: "Filtered out 2000 empty files." It seems from_df is filtering out my entire dataset as empty files, and I'm not sure why. I tried audios = AudioList.from_folder(path=PATH_AUDIO), which works fine. That is an OK workaround for me, since my files are labelled in the filename as well as in the csv labels column (even though labelling with a regex is more annoying than using a labels column in a df). I figure this would be a much more annoying issue for datasets whose filenames do not carry the labels.

For reference, I am running this on Google Colab.

mogwai commented 4 years ago

Can you share the head of the df and the folder structure that you are using?

maxliuverkada commented 4 years ago

[Screenshots: Screen Shot 2020-07-15 at 1 35 03 PM, Screen Shot 2020-07-15 at 1 36 58 PM]

For the folder structure, the Colab notebook I am using is in fastai_audio_2 and my wav files are in fastai_audio_2/data/ESC-50/ESC-50-master/audio. I was able to get from_df to work only after I cd into the audio folder where my wav files are. If I am in the fastai_audio_2 folder, the function filters out all of my files as empty files. I'm guessing the path that gets built in from_df and passed to the filtering function isn't the one I am passing in to from_df. I think it would be nice to print out the "incorrect" path rather than just saying the file was empty, so the user knows where it went wrong.

maxliuverkada commented 4 years ago

For reference, PATH_AUDIO = fastai_audio_2/data/ESC-50/ESC-50-master/audio.
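
One way to check a guess like this before calling from_df is to join each filename onto the base path by hand and see which results exist on disk. The sketch below is only an illustration and is not part of fastai_audio: find_missing is a hypothetical helper, example.wav is a placeholder filename, and it assumes the filename column holds names relative to PATH_AUDIO. A relative PATH_AUDIO is resolved against the current working directory, which would explain why cd-ing into a different folder changes the result.

```python
from pathlib import Path


def find_missing(filenames, base):
    """Return the absolute form of every base/filename that does not exist.

    Hypothetical helper for debugging; not a fastai_audio function.
    """
    base = Path(base)
    missing = []
    for name in filenames:
        candidate = base / name
        if not candidate.exists():
            # resolve() turns a relative path into an absolute one using the
            # current working directory, so the printed path shows exactly
            # where the file was looked for.
            missing.append(candidate.resolve())
    return missing


if __name__ == "__main__":
    # PATH_AUDIO as given in this thread; "example.wav" is a placeholder.
    PATH_AUDIO = "fastai_audio_2/data/ESC-50/ESC-50-master/audio"
    for path in find_missing(["example.wav"], PATH_AUDIO):
        print("not found:", path)
```

With a DataFrame, the filenames argument could be something like df['filename'].tolist(). If every file is reported as missing, the printed absolute paths should show where the lookup actually went.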

mogwai commented 4 years ago

That's a good suggestion. I'll try to recreate your issue and implement a warning.

mogwai commented 4 years ago

It will fail now if the path doesn't exist.
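
The actual change made in the library is not shown in this thread. As a rough sketch of the same fail-early idea on the user side, a small guard can be run before building the list. require_dir is a hypothetical name, and the error message format is an assumption chosen to include the path that was tried, as suggested above.

```python
from pathlib import Path


def require_dir(path):
    """Return path as a Path, or raise if it is not an existing directory.

    Hypothetical guard; not the check implemented in fastai_audio.
    """
    p = Path(path)
    if not p.is_dir():
        # Include both the resolved path and the working directory so a
        # wrong relative path is easy to spot.
        raise FileNotFoundError(
            f"Audio directory not found: {p.resolve()} (cwd: {Path.cwd()})"
        )
    return p
```

It could be used as path=require_dir(PATH_AUDIO) in the from_df call, so that a wrong working directory raises immediately instead of producing an empty list.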