IndoNLP / indonlu

The first-ever vast natural language processing benchmark for Indonesian Language. We provide multiple downstream tasks, pre-trained IndoBERT models, and a starter code! (AACL-IJCNLP 2020)
https://indobenchmark.com
Apache License 2.0

Loss Function for Fine-Tuning #24

Closed celine-setyawan closed 3 years ago

celine-setyawan commented 3 years ago

Hi, IndoNLU team,

Thanks for your amazing work! I'm currently working on my bachelor thesis using IndoBERT for a sequence classification task. If I want to change the loss function for fine-tuning, where or how can I do it?

From your tutorials here https://indobenchmark.github.io/tutorials/pytorch/deep%20learning/nlp/2020/10/18/basic-pytorch-en.html#training-phase, I found out that you use CrossEntropy as the loss function for the multi-class classification task (sentiment analysis in that case).

But when I want to dig more into the code, I can't find it. I can only find the CrossEntropyLoss() in:

- https://github.com/indobenchmark/indonlu/blob/master/modules/multi_label_classification.py

- https://github.com/indobenchmark/indonlu/blob/master/modules/word_classification.py

None of them are for multi-class classification.

The tutorials also mentioned:

"Cross entropy loss is calculated by comparing how well the probability distribution output by Softmax matches the one-hot-encoded ground truth label of the data."

But the SmSA fine-tuning examples https://github.com/indobenchmark/indonlu/blob/master/examples/finetune_smsa.ipynb don't show anything about the ground truth being one-hot encoded; the labels are label-encoded instead. I also tried to print out the list_hyp and list_label, in case they were being one-hot encoded somewhere outside the code that I can see, but the outputs are just the way they are (a mapping from LABEL2INDEX). Meanwhile, I suppose the SmSA labels don't have an order or rank, right? Neither does my thesis task.

Thank you in advance! Regards, Celine.

SamuelCahyawijaya commented 3 years ago

Hi Chrysant Celine Setyawan, sorry for the late reply.

But when I want to dig more into the code, I can't find it. I can just find the CrossEntropyLoss() in:

- https://github.com/indobenchmark/indonlu/blob/master/modules/multi_label_classification.py

- https://github.com/indobenchmark/indonlu/blob/master/modules/word_classification.py

none are for BertForSequenceClassification.

BertForSequenceClassification is a model class defined in the transformers package; you can check the package requirement in the requirements.txt file. You can find the source code for BertForSequenceClassification at https://huggingface.co/transformers/_modules/transformers/models/bert/modeling_bert.html#BertForSequenceClassification or on their GitHub page at https://github.com/huggingface/transformers.
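As a minimal sketch (not taken from the IndoNLU code; the checkpoint name, num_labels, and custom_criterion below are assumptions for illustration), one way to change the loss is to call the model without labels, so it skips its built-in CrossEntropyLoss, and apply your own criterion to the raw logits:

```python
import torch
from transformers import BertForSequenceClassification, BertTokenizer

# Hypothetical setup: any loss with the CrossEntropyLoss interface can be used.
model = BertForSequenceClassification.from_pretrained(
    "indobenchmark/indobert-base-p1", num_labels=3)
tokenizer = BertTokenizer.from_pretrained("indobenchmark/indobert-base-p1")
custom_criterion = torch.nn.CrossEntropyLoss()  # swap in your own loss here

batch = tokenizer(["contoh kalimat"], return_tensors="pt")
labels = torch.tensor([2])  # class indices, shape (N,)

outputs = model(**batch)    # no labels passed, so no internal loss is computed
logits = outputs[0]         # shape (N, num_labels)
loss = custom_criterion(logits, labels)
loss.backward()
```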

But, the SmSA fine-tuning examples https://github.com/indobenchmark/indonlu/blob/master/examples/finetune_smsa.ipynb don't show anything about the ground truth being one-hot encoded; the labels are label-encoded instead. I also tried to print out the list_hyp and list_label, in case they were being one-hot encoded somewhere outside the code that I can see, but the outputs are just the way they are (a mapping from LABEL2INDEX).

For cross entropy loss, we do not need to perform one-hot encoding ourselves, as it requires the input logits to be a FloatTensor of size (N, C) and the label to be a LongTensor of size (N). In this case you just need to pass the index of the label instead of the one-hot encoded representation of the label (you can check the PyTorch documentation for CrossEntropyLoss here: https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html).
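For example, a quick shape check (assuming 3 classes, as in SmSA):

```python
import torch

criterion = torch.nn.CrossEntropyLoss()

logits = torch.randn(4, 3)           # FloatTensor of size (N, C): 4 examples, 3 classes
labels = torch.tensor([0, 2, 1, 2])  # LongTensor of size (N): class indices, not one-hot

loss = criterion(logits, labels)     # softmax + negative log-likelihood handled internally
```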

Meanwhile I suppose the SmSA label doesn't have an order or rank, right?

I suppose what you mean is ordinal. Yes, it is not ordinal; the SmSA labels are nominal data, used for classification over N distinct unordered classes.

Btw, if you work on a custom loss function to replace Cross Entropy, you can also check this paper: https://arxiv.org/pdf/2101.03841.pdf. I hope all the answers are clear, and I wish you the best for your thesis. Thank you!
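For instance, a focal-loss-style criterion (just a sketch; it is not taken from that paper or from the IndoNLU code) can be dropped in wherever nn.CrossEntropyLoss is used, because it consumes the same (N, C) logits and (N) label indices:

```python
import torch
import torch.nn.functional as F

class FocalLoss(torch.nn.Module):
    """Hypothetical drop-in replacement for nn.CrossEntropyLoss that
    down-weights easy examples by a factor (1 - p_t) ** gamma."""
    def __init__(self, gamma=2.0):
        super().__init__()
        self.gamma = gamma

    def forward(self, logits, target):
        log_prob = F.log_softmax(logits, dim=-1)                     # (N, C)
        log_pt = log_prob.gather(1, target.unsqueeze(1)).squeeze(1)  # (N,)
        pt = log_pt.exp()                                            # probability of the true class
        return (-(1.0 - pt) ** self.gamma * log_pt).mean()
```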

celine-setyawan commented 3 years ago

Hi Kak Samuel, it's fine, that was fast enough for me.

Sorry, what I meant before was 'multi-class classification', not 'BertForSequenceClassification'.

Ah, I see, it's clear now. Thank you so much for the answers and pointers! And yes, right, ordinal.

Thank you for your time Kak!