astorfi / lip-reading-deeplearning

:unlock: Lip Reading - Cross Audio-Visual Recognition using 3D Architectures
Apache License 2.0

A problem of Multiclass Classification #17

Closed by ghost 6 years ago

ghost commented 6 years ago

As far as I understood, your code supports only binary classification. I could not find any information in the paper about the classes (are the "Words" or "Subjects" the classes?). I am trying to use this for a multi-class problem. Since the frame sequences of each video (9 of them) are paired with the corresponding speech spectrogram and MFEC features, I assumed that changing the number of classes would cause no problems. When I change the number of classes from 2 to 6, I get the error below. Can you help me?

Epoch 1, Minibatch 1 of 15 , Minibatch Loss= 1056.706787, EER= 0.50000, AUC= 0.33333, AP= 0.69683, contrib = 8 pairs
Epoch 1, Minibatch 2 of 15 , Minibatch Loss= 1793.572998, EER= 0.50000, AUC= 0.55000, AP= 0.61167, contrib = 9 pairs
Epoch 1, Minibatch 3 of 15 , Minibatch Loss= 1273.130249, EER= 0.50000, AUC= 0.62500, AP= 0.80417, contrib = 6 pairs
Epoch 1, Minibatch 4 of 15 , Minibatch Loss= 1280.513916, EER= 0.25000, AUC= 0.60714, AP= 0.81829, contrib = 11 pairs
Epoch 1, Minibatch 5 of 15 , Minibatch Loss= 1651.882568, EER= 0.40000, AUC= 0.60000, AP= 0.67778, contrib = 9 pairs
Epoch 1, Minibatch 6 of 15 , Minibatch Loss= 1395.890381, EER= 0.40000, AUC= 0.48000, AP= 0.53429, contrib = 10 pairs
Epoch 1, Minibatch 7 of 15 , Minibatch Loss= 1423.493164, EER= 0.27273, AUC= 0.63636, AP= 0.58000, contrib = 16 pairs
Epoch 1, Minibatch 8 of 15 , Minibatch Loss= 1248.631836, EER= 0.50000, AUC= 0.55000, AP= 0.61167, contrib = 9 pairs
Epoch 1, Minibatch 9 of 15 , Minibatch Loss= 1377.684937, EER= 0.50000, AUC= 0.54167, AP= 0.74385, contrib = 10 pairs
Epoch 1, Minibatch 10 of 15 , Minibatch Loss= 1460.154419, EER= 0.33333, AUC= 0.83333, AP= 0.88750, contrib = 7 pairs
Epoch 1, Minibatch 11 of 15 , Minibatch Loss= 1794.762451, EER= 0.40000, AUC= 0.33333, AP= 0.67771, contrib = 17 pairs
Epoch 1, Minibatch 12 of 15 , Minibatch Loss= 1140.301392, EER= 0.50000, AUC= 0.37500, AP= 0.36667, contrib = 6 pairs
Epoch 1, Minibatch 13 of 15 , Minibatch Loss= 1273.781738, EER= 0.66667, AUC= 0.47619, AP= 0.51664, contrib = 16 pairs
Epoch 1, Minibatch 14 of 15 , Minibatch Loss= 989.276489, EER= 0.50000, AUC= 0.58333, AP= 0.36667, contrib = 8 pairs
Epoch 1, Minibatch 15 of 15 , Minibatch Loss= 1625.663696, EER= 0.33333, AUC= 0.83333, AP= 0.95028, contrib = 11 pairs
TESTING: Epoch 1, Minibatch 1 of 3 
TESTING: Epoch 1, Minibatch 2 of 3 
Traceback (most recent call last):
  File "/media/Data/Scripts/lip-reading-deeplearning/code/training_evaluation/train.py", line 667, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "/media/Data/Scripts/lip-reading-deeplearning/code/training_evaluation/train.py", line 659, in main
    score_dissimilarity_vector[i * batch_k_validation:(i + 1) * batch_k_validation])
  File "/media/Data/Scripts/lip-reading-deeplearning/code/training_evaluation/roc_curve/calculate_roc.py", line 16, in calculate_eer_auc_ap
    AUC = metrics.roc_auc_score(label, -distance, average='macro', sample_weight=None)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/metrics/ranking.py", line 277, in roc_auc_score
    sample_weight=sample_weight)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/metrics/base.py", line 72, in _average_binary_score
TESTING: Epoch 1, Minibatch 3 of 3 
    raise ValueError("{0} format is not supported".format(y_type))
ValueError: multiclass format is not supported

Is this caused by the ROC calculation (line 659 of train.py) receiving multi-class labels? Can you tell me which parts need modification? Maybe I missed something.

astorfi commented 6 years ago

The paper addresses verification, so it only deals with binary classification. I do not understand what you mean by class definition. The point is that when you create pairs for comparison, the problem is binary: each pair either matches or does not, no matter who is saying what. So the subject and the spoken words do not matter.
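
To illustrate the point about pairs, here is a minimal sketch, not the repository's code. It assumes each pair carries a class id on both sides (the `ids_a`, `ids_b`, and `distance` arrays and the `pair_labels` helper are hypothetical) and shows that collapsing those ids into match / non-match labels gives `roc_auc_score` the binary target it expects, whereas passing the raw class ids with a 1-D score raises a `ValueError`.

```python
import numpy as np
from sklearn import metrics


def pair_labels(ids_a, ids_b):
    """Return 1 for a matching pair (same id on both sides), else 0."""
    return (np.asarray(ids_a) == np.asarray(ids_b)).astype(int)


# Hypothetical class ids (e.g. 6 words) for the two sides of each pair.
ids_a = np.array([0, 1, 2, 3, 4, 5, 0, 1])
ids_b = np.array([0, 1, 5, 3, 2, 5, 4, 1])

# Hypothetical dissimilarity scores: smaller means "more likely a match".
distance = np.array([0.2, 0.4, 1.5, 0.3, 1.1, 0.5, 0.9, 0.1])

# Binary verification labels: the class ids themselves never reach the metric.
label = pair_labels(ids_a, ids_b)

# Negate the distance so that a higher score means "more likely a match",
# mirroring the call shown in the traceback.
auc = metrics.roc_auc_score(label, -distance)

# Passing the raw multi-class ids with a 1-D score is rejected.
try:
    metrics.roc_auc_score(ids_a, -distance)
    multiclass_rejected = False
except ValueError:
    multiclass_rejected = True

print("pair labels:", label.tolist())
print("AUC:", auc)
print("multi-class labels rejected:", multiclass_rejected)
```

Under this framing the number of words or subjects only affects how pairs are built, not the label format passed to the metric. A genuinely multi-class evaluation would need per-class scores and a different metric setup, which is outside what this sketch covers.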

Please read the associated paper for further details.

ghost commented 6 years ago

@asinatorfi I understand that this is a verification paper and that it deals with binary classification. I have already changed the metrics and my problem is solved. Sorry, I forgot to close the issue. Thanks a lot for your help.

astorfi commented 6 years ago

My pleasure. Glad to see things are solved.