usc-sail / peft-ser

PEFT-SER: On the Use of Parameter Efficient Transfer Learning Approaches For Speech Emotion Recognition Using Pre-trained Speech Models (Accepted to 2023 ACII)
https://arxiv.org/abs/2306.05350
Apache License 2.0

[Bug] weights defined but not passed to train_epoch and val_epoch in finetune_emotion.py #1

Closed: FangxuY closed this issue 4 months ago

FangxuY commented 9 months ago

Hi team, I ran into a problem when using LoRA to fine-tune on my own dataset. Here is the command I used:

    CUDA_VISIBLE_DEVICES=1, taskset -c 1-60 python3 finetune_emotion.py --pretrain_model whisper_tiny --dataset commsense --learning_rate 0.0005 --num_epochs 30 --finetune_method lora --lora_rank 16

Error log:

Traceback (most recent call last):
  File "/home/peft-ser/experiment/finetune_emotion.py", line 242, in <module>
    train_result = train_epoch(
  File "/home/peft-ser/experiment/finetune_emotion.py", line 92, in train_epoch
    loss = criterion(outputs, y)
  File "/home/anaconda3/envs/efficient-ser/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/anaconda3/envs/efficient-ser/lib/python3.9/site-packages/torch/nn/modules/loss.py", line 1164, in forward
    return F.cross_entropy(input, target, weight=self.weight,
  File "/home/anaconda3/envs/efficient-ser/lib/python3.9/site-packages/torch/nn/functional.py", line 3014, in cross_entropy
    return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
RuntimeError: weight tensor should be defined either for all 4 classes or no classes but got weight tensor of shape: [2]

Found bug:

  1. https://github.com/usc-sail/peft-ser/blob/main/experiment/finetune_emotion.py#L76
  2. https://github.com/usc-sail/peft-ser/blob/main/experiment/finetune_emotion.py#L118
  3. https://github.com/usc-sail/peft-ser/blob/main/experiment/finetune_emotion.py#L170

The weights are defined and created at L170, but they are never passed into the train_epoch and val_epoch functions. I wonder how the weights parameter is supposed to work inside train_epoch and val_epoch.

Thanks for your help! @tiantiaf0627

tiantiaf0627 commented 9 months ago

Hello @FangxuY,

Thanks for pointing this out. I believe the bug comes from Python's scoping rules: functions defined in the script can read module-level variables, including the weights declared under if __name__ == '__main__':, which is why I did not catch this the first time. It is better practice to pass the weights as an explicit input.
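
For reference, here is a minimal sketch of what passing the weights explicitly can look like; the signature and training loop are illustrative, not the exact code in finetune_emotion.py:

    import torch
    import torch.nn as nn

    def train_epoch(model, dataloader, optimizer, device, weights):
        # Build the criterion from the explicit weights argument, so
        # nothing is read implicitly from module-level scope.
        criterion = nn.CrossEntropyLoss(weight=weights).to(device)
        model.train()
        total_loss = 0.0
        for x, y in dataloader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()
            total_loss += loss.item()
        return total_loss / max(len(dataloader), 1)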

To clarify, the weights here balance the loss during model updates with respect to the class distribution, a commonly used technique in deep learning and machine learning to deal with class imbalance.
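
As a small illustration (a sketch assuming a hypothetical label array train_labels, not code from the repo), inverse-frequency class weights can be computed and handed to the loss like this:

    import numpy as np
    import torch
    import torch.nn as nn

    # Hypothetical training labels for a 4-class setup.
    train_labels = np.array([0, 0, 0, 0, 1, 2, 2, 3])
    num_class = 4

    # Inverse-frequency weighting: rarer classes get larger weights.
    counts = np.bincount(train_labels, minlength=num_class)
    weights = torch.tensor(len(train_labels) / (num_class * counts), dtype=torch.float)

    # The weight tensor must have one entry per output class; a mismatch
    # (e.g. 2 weights for a 4-class head) raises exactly the RuntimeError
    # shown in the traceback above.
    criterion = nn.CrossEntropyLoss(weight=weights)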

Thanks again for pointing this out, and I have updated the training code to reflect this.

Thanks,

Tiantian

FangxuY commented 9 months ago

Thanks Tian! It may actually be a bug in my own code, since I added my own dataset for fine-tuning. So the error is not caused by the weights parameter itself; rather, I need to adjust the weight tensor to match my dataset. Anyway, thanks for your quick and helpful answer!

FangxuY commented 4 months ago

Hi Tian, it's me again. Sorry to bother you. I have two more questions about the code.

  1. num_class: In https://github.com/usc-sail/peft-ser/blob/main/experiment/finetune_emotion.py#L212, I assume num_class is supposed to be the number of emotion classes (which differs across datasets). When I run your code on my new dataset (which has only 2 emotion classes), I always hit the error "weight tensor should be defined either for all 4 classes or no classes but got weight tensor of shape: [2]". So I suspect num_class is never passed to the model, even though I have already set if args.dataset in ["my_dataset"]: num_class = 2. Searching for num_class, it seems to be passed only to downstream_models.py. Could you give me some guidance on how to pass num_class to the model?
  2. downstream_model parameter: In https://github.com/usc-sail/peft-ser/blob/main/utils/utils.py#L209 it defaults to rnn, but the if statement in https://github.com/usc-sail/peft-ser/blob/main/experiment/finetune_emotion.py#L210 checks for cnn, so I wonder whether this is a bug?

Thanks a lot!

tiantiaf0627 commented 4 months ago

Hello @FangxuY, for your first question: for each model I have defined, for example model = WavLMWrapper(args).to(device), you can pass the number of classes by setting output_class_num=2. Sorry that I forgot to delete num_class; it is not actually used.
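
For example (a minimal sketch; the exact constructor signature may differ in the current code):

    # Define a 2-class head; the loss weight tensor must then also
    # have 2 entries to avoid the shape mismatch error above.
    model = WavLMWrapper(args, output_class_num=2).to(device)
    criterion = nn.CrossEntropyLoss(weight=weights).to(device)  # weights has shape [2]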

Also, sorry for the confusion about the RNN/CNN part. That argument is not actually used either; as mentioned above, you can pass the number of classes directly into the model definition.

tiantiaf0627 commented 4 months ago

I have also cleaned up the code; sorry for the confusion.

FangxuY commented 4 months ago

Thanks Tian, it works and solves my problem!