tyiannak / pyAudioAnalysis

Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications
Apache License 2.0

Added max_files parameter to `extract_features_and_train` #312

Open KobaKhit opened 3 years ago

KobaKhit commented 3 years ago

Currently, `extract_features_and_train` needs a list of folder paths. It would be useful to be able to cap how many files are read per folder, so I added a `max_files` parameter with a default of 1000. Choosing those files at random could be a further addition.

I tested it in a Kaggle notebook and it worked fine.

The motivation was a Birdcall Kaggle competition with 264 classes (folders) and ~100 files per class (folder). Training a model took longer than 9 hours and the Kaggle notebook timed out, so I decided to train on a smaller number of files per folder, i.e. to undersample the classes.

import os

from pyAudioAnalysis import audioTrainTest as aT

# train classifier 
train_folder = '../input/birdsong-recognition/train_audio/'
mid_term_window_length = 2
mid_term_window_step = 1

# get audio file folders
class_paths = [train_folder + x for x in sorted(os.listdir(train_folder))]

model_name = "../input/bird-call-classification/svmSMtemp"
model_type = 'svm'

if not os.path.exists(model_name):
    model_name = "svmSMtemp"
    # train classifier using folders of audio files
    aT.extract_features_and_train(class_paths, 
                                  mid_term_window_length, 
                                  mid_term_window_step, 
                                  aT.shortTermWindow, 
                                  aT.shortTermStep, 
                                  model_type, 
                                  model_name, 
                                  False,
                                  max_files=5)

Analyzing file 1 of 5: ../input/birdsong-recognition/train_audio/aldfly/XC134874.mp3
Analyzing file 2 of 5: ../input/birdsong-recognition/train_audio/aldfly/XC135454.mp3
Analyzing file 3 of 5: ../input/birdsong-recognition/train_audio/aldfly/XC135455.mp3
Analyzing file 4 of 5: ../input/birdsong-recognition/train_audio/aldfly/XC135456.mp3
Analyzing file 5 of 5: ../input/birdsong-recognition/train_audio/aldfly/XC135457.mp3
Feature extraction complexity ratio: 28.2 x realtime
Analyzing file 1 of 5: ../input/birdsong-recognition/train_audio/ameavo/XC133080.mp3
Analyzing file 2 of 5: ../input/birdsong-recognition/train_audio/ameavo/XC139829.mp3
Analyzing file 3 of 5: ../input/birdsong-recognition/train_audio/ameavo/XC139921.mp3
Analyzing file 4 of 5: ../input/birdsong-recognition/train_audio/ameavo/XC155039.mp3
Analyzing file 5 of 5: ../input/birdsong-recognition/train_audio/ameavo/XC166076.mp3
Feature extraction complexity ratio: 27.5 x realtime

tyiannak commented 3 years ago

Thanks for the PR, @KobaKhit. It would be nice if (a) the default value were -1, indicating that no file limit is applied during feature extraction, and (b) random shuffling were also parametrized (not set to true by default, since in many cases we need feature extraction to follow the file path order).
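
As a rough illustration of the two suggestions above, the per-folder selection could look something like the sketch below. This is not the code from the PR or from pyAudioAnalysis: `select_files`, its `shuffle` and `seed` parameters, and the idea of applying it to each folder's file list before feature extraction are all assumptions made for this example. Only `max_files` and the -1 "no limit" convention come from the discussion.

```python
import random
from typing import List, Optional


def select_files(file_paths: List[str],
                 max_files: int = -1,
                 shuffle: bool = False,
                 seed: Optional[int] = None) -> List[str]:
    """Hypothetical helper: cap the number of files taken from one class folder."""
    # Work on a sorted copy so the default behaviour follows file path order.
    selected = sorted(file_paths)
    if shuffle:
        # A local Random instance leaves the global random state untouched;
        # passing a seed makes the shuffled selection reproducible.
        random.Random(seed).shuffle(selected)
    # Any negative max_files (e.g. the proposed default of -1) means "no limit".
    if max_files >= 0:
        selected = selected[:max_files]
    return selected


files = ["aldfly/XC135455.mp3", "aldfly/XC134874.mp3", "aldfly/XC135454.mp3"]
print(select_files(files))               # all three files, in path order
print(select_files(files, max_files=2))  # the first two files in path order
```

Keeping shuffling off by default preserves the file path order, and a seeded shuffle would let an undersampled training run be repeated on the same subset.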