felixbur / nkululeko

Machine learning speaker characteristics
MIT License

BUG: demo module returns NaN for models other than wav2vec2 variants in finetuning #150

Open · opened by bagustris 1 month ago

bagustris commented 1 month ago

Although we can set pretrained_model to models other than the wav2vec2 variants and the training process succeeds, running the demo with the fine-tuned model produces NaN in the logits. I have experienced this in the past and thought it had been solved.
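For context, here is a minimal sketch of the kind of config used (section and option names follow the usual nkululeko layout; the actual test_bagus/exp_emodb_finetune_wavlm_base.ini may differ and is not shown here):

```ini
; hypothetical sketch -- not the actual test_bagus/exp_emodb_finetune_wavlm_base.ini
[EXP]
root = /tmp/results/
name = finetuned_wavlm_base_22
[DATA]
databases = ['emodb']
target = emotion
[MODEL]
type = finetune
; any HuggingFace checkpoint can be filled in here; wavlm-base is one that triggers the NaN
pretrained_model = microsoft/wavlm-base
```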

Example of data

Training stage

$ python3 -m nkululeko.nkululeko --config test_bagus/exp_emodb_finetune_wavlm_base.ini 
DEBUG: nkululeko: running finetuned_wavlm_base_7 from config test_bagus/exp_emodb_finetune_wavlm_base.ini, nkululeko version 0.88.11
DEBUG: experiment: value for type not found, using default: audformat
...
{'loss': 0.5462, 'grad_norm': 1.6567870378494263, 'learning_rate': 4.5454545454545455e-06, 'epoch': 12.0}                                                       
{'eval_loss': 1.0810825824737549, 'eval_UAR': 0.75, 'eval_ACC': 0.8014705882352942, 'eval_runtime': 0.5392, 'eval_samples_per_second': 252.217, 'eval_steps_per_second': 9.273, 'epoch': 12.57}                                                                                                                                 
{'loss': 0.5417, 'grad_norm': 1.396345615386963, 'learning_rate': 0.0, 'epoch': 12.57}                                                                          
{'train_runtime': 50.9932, 'train_samples_per_second': 87.58, 'train_steps_per_second': 0.431, 'train_loss': 0.8906121253967285, 'epoch': 12.57}                
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 22/22 [00:50<00:00,  2.32s/it]
DEBUG: model: saved best model to /tmp/results/finetuned_wavlm_base_22/models/run_0/torch
DEBUG: reporter: value for name is not found, using default: emodb_emotion_finetune                 
DEBUG: reporter: plotted epoch progression to /tmp/results/finetuned_wavlm_base_22/./images/run_0/emodb_emotion_finetune_epoch_progression.png
DEBUG: modelrunner: run: 0 epoch: 22: result: test: 0.769 UAR
DEBUG: modelrunner: plotting confusion matrix to emodb_emotion_finetune_0_022_cnf
DEBUG: reporter: Saved confusion plot to /tmp/results/finetuned_wavlm_base_22/./images/run_0/emodb_emotion_finetune_0_022_cnf.png
DEBUG: reporter: Best score at epoch: 0, UAR: .768, (+-.723/.818), ACC: .816
DEBUG: reporter: labels: ['anger', 'sadness', 'neutral', 'happiness']
DEBUG: reporter: result per class (F1 score): [0.833, 0.294, 0.941, 0.982] from epoch: 22
WARNING: experiment: Save experiment: Can't pickle the trained model so saving without it. (it should be stored anyway)
DEBUG: experiment: Done, used 194.498 seconds
DONE

Inference/demo

$ python3 -m nkululeko.demo --config test_bagus/exp_emodb_finetune_wavlm_base.ini --file data/test/audio/03a
...
Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.
/home/bagus/miniconda3/envs/nkululeko/lib/python3.9/site-packages/torch/nn/functional.py:5076: UserWarning: Support for mismatched key_padding_mask and attn_mask is deprecated. Use same type for both instead.
  warnings.warn(
ERROR: demo: NaN value in pipeline output for file: data/test/audio/03a01Nc.wav
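To narrow down where the NaN appears, one quick check outside nkululeko could be to load the saved checkpoint directly with transformers and inspect the raw logits. This is a hypothetical sketch: it assumes the directory from the training log above is a standard transformers checkpoint; if nkululeko saves a custom classification head, the Auto classes will not load it as-is.

```python
# Hypothetical repro sketch (not part of nkululeko): load the model saved by the
# training run above and check the raw logits for NaN, bypassing nkululeko.demo.
import torch
import torchaudio
from transformers import AutoFeatureExtractor, AutoModelForAudioClassification

model_dir = "/tmp/results/finetuned_wavlm_base_22/models/run_0/torch"  # path from the training log
wav_file = "data/test/audio/03a01Nc.wav"                               # file from the error message

extractor = AutoFeatureExtractor.from_pretrained(model_dir)
model = AutoModelForAudioClassification.from_pretrained(model_dir).eval()

signal, sr = torchaudio.load(wav_file)
signal = torchaudio.functional.resample(signal, sr, 16000).mean(dim=0)  # mono, 16 kHz

inputs = extractor(signal.numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print("any NaN in logits:", torch.isnan(logits).any().item())
```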

It seems that the demo module for fine-tuned models only works with wav2vec2 variants, including the audmodel. Previously it also worked with other models, e.g. the following model: wavlm_finetuned_emodb.

Test of the demo module per finetuned model type:

| model | works? |
|---|---|
| default (wav2vec2 robust) | yes |
| hubert-large-ll60k | no |
| wavlm-base | no |
| wavlm-base-plus | no |
| wavlm-large | no |
| audeering | yes |

So, although there are no errors during training (nkululeko.nkululeko) and the test-set performance score is obtained, perhaps some configuration for HuBERT and WavLM differs from the wav2vec2 variants (or a library update/upgrade changed the behavior).
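One way to probe the "configuration differs" hypothesis (again a diagnostic sketch, not nkululeko code) would be to compare the preprocessing settings that transformers ships with these checkpoints, e.g. whether they expect an attention mask, which may be related to the key_padding_mask warning in the demo log:

```python
# Diagnostic sketch: compare feature-extractor settings of the upstream checkpoints.
# Differences in return_attention_mask / do_normalize between the wav2vec2 and the
# WavLM/HuBERT models could point to why the logits become NaN at inference time.
from transformers import AutoFeatureExtractor

for model_id in [
    "facebook/wav2vec2-large-robust-ft-swbd-300h",  # assumed default; check nkululeko's actual default
    "facebook/hubert-large-ll60k",
    "microsoft/wavlm-base",
    "microsoft/wavlm-base-plus",
    "microsoft/wavlm-large",
]:
    fe = AutoFeatureExtractor.from_pretrained(model_id)
    print(model_id,
          "return_attention_mask =", fe.return_attention_mask,
          "do_normalize =", fe.do_normalize)
```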

felixbur commented 1 month ago

Thanks. I'm on vacation until August 18th and can try to look into it after that.