drscotthawley / panotti

A multi-channel neural network audio classifier using Keras
MIT License

Cross Validation #60

Closed ngragaei closed 4 years ago

ngragaei commented 4 years ago

Excuse me, how do I use cross-validation?

drscotthawley commented 4 years ago

Hi, Panotti expects input data in the form of Training and Testing data directories. While training, it will also do an 80/20 Train/Val split for a validation set.

It currently does not have k-fold cross-validation implemented, but it wouldn't be hard to do with an external loop around the training. Let me work on that and perhaps I can accommodate that.

drscotthawley commented 4 years ago

Just to clarify: Panotti already did simple "holdout" cross-validation, using 20% of the Training set by default. There can also be a totally separate Testing set that is never seen until the end.

But now!

As of the latest commit, I have added what I'll call 'rudimentary k-fold cross-validation', in which the Validation set changes in each iteration. The size of the Train/Val split is still set by the --val command-line argument (which defaults to 0.2), but what's new is that there is also a -k or --kfold argument, which sets the number of times a Training loop will be performed, with a completely different Validation set each time. kfold defaults to 1 so that the original panotti behavior is preserved. The kfold value can be at most 1/val, i.e. typically a maximum of 5; otherwise you start to repeat data. If 1 < kfold < 1/val, then it amounts to doing the Train/Val split kfold times, with the size of the split given by val.
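To make the scheme above concrete, here is a minimal sketch of how the validation slice could rotate from one iteration to the next. It is not panotti's actual code: the function name kfold_splits, the seed parameter, and the index-list representation are assumptions for illustration only, and the real implementation may shuffle or slice differently. The only parts taken from the description are the val fraction, the kfold count, and the 1/val upper bound.

```python
import math
import random


def kfold_splits(n_items, val=0.2, kfold=1, seed=0):
    """Yield (train_idx, val_idx) pairs, one pair per fold.

    Hypothetical helper: each fold takes the next non-overlapping block
    of the shuffled indices as its validation set.
    """
    # Largest number of folds before validation data would start to repeat.
    max_k = math.floor(1.0 / val + 1e-9)
    if not 1 <= kfold <= max_k:
        raise ValueError(f"kfold must be between 1 and {max_k} for val={val}")

    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # shuffle once, reuse for every fold

    n_val = int(n_items * val)  # validation size, rounded down
    for fold in range(kfold):
        start = fold * n_val
        val_idx = indices[start:start + n_val]
        train_idx = indices[:start] + indices[start + n_val:]
        yield train_idx, val_idx


# Example loop: train once per fold, checkpoint only on the last one.
kfold = 5
for fold, (train_idx, val_idx) in enumerate(kfold_splits(100, val=0.2, kfold=kfold)):
    save_checkpoint = (fold == kfold - 1)
    # A real training call would go here; this sketch only reports sizes.
    print(fold, len(train_idx), len(val_idx), save_checkpoint)
```

With val=0.2 and kfold=5, every item lands in the validation set exactly once, which is the standard k-fold case. With a smaller kfold, only the first few blocks are ever used for validation.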

How's that? Good enough for now?

Also, note that checkpointing is disabled until the very last iteration of the k-fold loop, because otherwise...well it just had to be that way.

ngragaei commented 4 years ago

Many Thanks! I appreciate your effort. I will try it.

ngragaei commented 4 years ago

I have another question please. I need to add delta features to melgram. Is this right?

    delta = librosa.feature.delta(melgram1,mode='nearest')
    delta2 = librosa.feature.delta(melgram1,order = 2,mode='nearest')
    melgram1.append(np.c_[melgram1,delta,delta2])
    melgram = np.sum(melgram1[2], delta[2], delta2[2])[:,:,:,:]

drscotthawley commented 4 years ago

Uh.. I don't know, sorry. That's a neat idea, but beyond the scope of the project. Check with librosa folks.
Closing this issue since CV is working.