RaleLee / DialogueGCN

Preprocessing and training code for DialogueGCN on the DailyDialog and Mastodon datasets. Uses BERT base to preprocess the sentences. Based on https://github.com/declare-lab/conv-emotion/tree/master/DialogueGCN

data preprocess #4

Closed ZKayell closed 2 years ago

ZKayell commented 2 years ago

Hello. When I trained on the feature files generated by preprocess_dailydialog2.py, the F1 score was 82.24, which doesn't make sense. Do you know what caused this? Much appreciated. The details are as follows: [epoch 1 train_loss 0.5816 train_acc 81.85 train_fscore 82.24 valid_loss 0.3864 valid_acc 88.09 valid_fscore 88.51 test_loss 0.5778 test_acc 81.67 test_fscore 82.29 time 473.1s]

RaleLee commented 2 years ago

DailyDialog is an unbalanced dataset: the 'no_emotion' label accounts for about 83% of all emotion labels. You can run preprocess_dailydialog2.py to check all the encoded emotion labels. Notice that the emotion-label dict can change between runs! In preprocess_dailydialog2.py, at line 78, I added some print statements:

    for i, label in enumerate(all_emotion_labels):
        emotion_label_encoder[label] = i
        print(str(i) + " " + str(label))
        emotion_label_decoder[i] = label
        print(str(emotion_label_encoder[label]) + " " + str(emotion_label_decoder[i]))

So the answer is: during preprocessing, the 'no_emotion' label gets mapped to some specific encoded label x. When calculating the F1 score, the original paper masks this label out to get a more meaningful result.
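To see why masking matters, here is a toy sketch (assuming scikit-learn is installed; the encoded index 3 for 'no_emotion' is just an example, and the label lists are made up for illustration):

```python
from sklearn.metrics import f1_score

# Toy predictions where the majority class (encoded 3 = 'no_emotion'
# in this example) dominates: 8 of 10 utterances are no_emotion.
labels = [3, 3, 3, 3, 3, 3, 3, 3, 0, 1]
preds  = [3, 3, 3, 3, 3, 3, 3, 3, 0, 2]

# Unmasked micro-F1 is inflated by the easy majority class...
print(f1_score(labels, preds, average='micro'))  # 0.9
# ...while excluding it reveals performance on the real emotion classes.
print(f1_score(labels, preds, average='micro', labels=[0, 1, 2, 4, 5, 6]))  # 0.5
```

With the majority class counted, micro-F1 is just accuracy (0.9 here); restricted to the six real emotions it drops to 0.5, which is why a score like 82 on the full label set looks too good.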

You can try train_daily_feature3.py; at line 193:

    avg_fscore_w = round(f1_score(labels, preds, average='micro', labels=[0, 1, 2, 4, 5, 6]) * 100, 2)
    # Add precision and recall
    precision_w = round(precision_score(labels, preds, average='micro', labels=[0, 1, 2, 4, 5, 6]) * 100, 2)
    recall_w = round(recall_score(labels, preds, average='micro', labels=[0, 1, 2, 4, 5, 6]) * 100, 2)
    print('fscore: {}, precision: {}, recall: {}'.format(avg_fscore_w, precision_w, recall_w))

In my case, the 'no_emotion' label was mapped to encoded label 3. And don't forget to change the weights at line 356:

    loss_weights = torch.FloatTensor([1 / 0.0017,
                                      1 / 0.0034,
                                      1 / 0.1251,
                                      1 / 0.831,
                                      1 / 0.0099,
                                      1 / 0.0177,
                                      1 / 0.0112])
ZKayell commented 2 years ago

Thank you for your answer! So should rarer labels have larger loss weights? How did you get these weights?

RaleLee commented 2 years ago

Actually the settings in train_daily_feature3.py follow DialogRNN, which uses micro-F1 and masks the 'no_emotion' label. The loss weights are also copied from DialogRNN. The weights are at line 343 of train_daily_feature3.py:

    # 0-3-fear-0.0017
    # 1-2-disgust-0.0034
    # 2-4-happiness-0.1251
    # 3-0-no emotion-0.831
    # 4-1-anger-0.0099
    # 5-6-surprise-0.0177
    # 6-5-sadness-0.0112

The first number denotes the encoded label after preprocessing and the second number denotes the original label. Change loss_weights according to your own preprocessing results.
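For illustration, the weights above are simply inverse class frequencies, so they can be rebuilt for any encoding order. A minimal sketch, using the DailyDialog proportions quoted in the comments above (your encoded order may differ):

```python
# Inverse-frequency loss weights: rarer emotions get larger weights so
# the loss isn't dominated by 'no_emotion'. Frequencies are the
# DailyDialog class proportions from the comments above.
freqs = {'fear': 0.0017, 'disgust': 0.0034, 'happiness': 0.1251,
         'no_emotion': 0.831, 'anger': 0.0099, 'surprise': 0.0177,
         'sadness': 0.0112}
weights = {label: 1.0 / f for label, f in freqs.items()}

# The majority class gets the smallest weight, the rarest the largest.
assert min(weights, key=weights.get) == 'no_emotion'
assert max(weights, key=weights.get) == 'fear'
```

Listing these values in your own encoded-label order gives exactly the loss_weights tensor shown earlier, which is then passed to the loss function (e.g. the weight argument of torch.nn.NLLLoss).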

ZKayell commented 2 years ago

I get it. Thanks again.