Hi, I am trying to reproduce the results presented in the paper, but I don't get anywhere near them (66 F1 vs. 68 F1 on MELD). I would rather not use the domain module, so I am aiming for the numbers reported in the ablation study.
I noticed that there is a lot of commented-out code in main_new.py and train_and_inference_Uni.sh, which leaves some arguments unused. In particular, the emotion_prediction flag in main_new.py branches during training, but both branches produce the same outcome because of the commented-out code (ref).
In the paper, you describe α as a scalar weighting the influence of the emotion prediction task, but I cannot find it anywhere in the code. There is a β of 0.1, but it remains unused.
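For reference, this is how I understood the α weighting from the paper. The function and argument names are mine, and `alpha = 0.1` is only a guess based on the unused β; please correct me if the intended combination is different:

```python
def combined_loss(erc_loss: float, emotion_pred_loss: float,
                  alpha: float = 0.1) -> float:
    # α scales the auxiliary emotion prediction loss before it is
    # added to the main ERC loss; this α is what I cannot find in the code.
    return erc_loss + alpha * emotion_pred_loss
```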
As a result, I'm unsure whether what I'm doing is correct or if I'm overlooking something.
My training routine is the following:
1. Pretrain LLaMA2-7b-chat-hf on the speaker identification task.
2. Normal training, including emotion prediction as an auxiliary task.
I also found that there is no improvement on the test set after about 3 epochs.
This is my shell script, into which I copied only what I needed, to keep a better overview.
I only changed one thing in main_new.py: I train with bfloat16 rather than fp16, since LLaMA2 seems to have problems with fp16, producing NaN for the loss (see issue).
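To illustrate why I made that change (this is a standalone sketch, not the actual code in main_new.py): fp16's maximum representable value is about 65504, so moderately large activations overflow to inf and later turn the loss into NaN, whereas bfloat16 keeps fp32's 8-bit exponent range.

```python
import torch

# A value well within bf16/fp32 range but beyond fp16's ~65504 max.
x = torch.tensor(70000.0)

fp16 = x.to(torch.float16)   # overflows to inf
bf16 = x.to(torch.bfloat16)  # stays finite thanks to the wider exponent

print(torch.isinf(fp16).item(), torch.isfinite(bf16).item())  # True True
```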
GPUs: 2 x RTX A6000 (48 GB)
Did I misunderstand something in my setup? I would greatly appreciate your help!