bhavitvyamalik / DialogTag

A Python library to classify dialogue tags.
https://pypi.org/project/DialogTag/
MIT License

Details of pretrained model #6

Open angoodkind opened 2 years ago

angoodkind commented 2 years ago

Can you provide further details about the pretrained model? Is it using any context, etc. for the utterances? It would be really helpful if there was a paper I could point to.

bhavitvyamalik commented 2 years ago

I tried bert-base-uncased, distilbert-base-uncased and bert-large-uncased. The difference between these models was around 1-1.5 F1 points each, with bert-large-uncased performing best. However, I feel it was a perfect case of overfitting; bert-base-uncased should be sufficient for this problem. I framed it as a multi-class classification problem, classifying sentences into around 38 intents.

If you are planning to work on it, you can look at existing solutions here (https://nlpprogress.com/english/dialogue.html)
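The multi-class framing described above can be sketched in a few lines: an encoder produces logits over the intent set, and the predicted tag is the argmax of the softmax. The tag names and logits below are illustrative placeholders, not the library's actual label set or model output:

```python
import math

# Illustrative subset of dialogue-act tags (hypothetical; the actual
# library classifies sentences into around 38 intents).
TAGS = ["Statement-non-opinion", "Yes-No-Question", "Acknowledge", "Wh-Question"]

def softmax(logits):
    """Convert raw classifier logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_tag(logits, tags=TAGS):
    """Pick the highest-probability intent, as in multi-class classification."""
    probs = softmax(logits)
    best = max(range(len(tags)), key=lambda i: probs[i])
    return tags[best], probs[best]

# Example: logits a fine-tuned encoder might produce for one utterance.
tag, prob = predict_tag([0.2, 3.1, -0.5, 0.7])
```

In the real library the logits would come from a fine-tuned BERT-family encoder rather than being hand-written.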

angoodkind commented 2 years ago

So are you just classifying utterances based on the semantics of the utterance itself, in isolation? Or is any prior context taken into account?


angoodkind commented 2 years ago

Further, what kind of model did you use when training? I understand it was a multi-class classification problem, but what was the training process? Thanks!

angoodkind commented 2 years ago

This is similar to a lot of the questions raised in #2

angoodkind commented 2 years ago

Just following up on this. I would like to cite this library in a paper I am publishing. Can you please provide more details, at least the type of model you used to train the classifier?

bhavitvyamalik commented 2 years ago

Hi @angoodkind, apologies for the delayed response. As mentioned in my earlier comment:

> I tried bert-base-uncased, distilbert-base-uncased and bert-large-uncased. The difference between these models was around 1-1.5 F1 points each, with bert-large-uncased performing best. However, I feel it was a perfect case of overfitting; bert-base-uncased should be sufficient for this problem. I framed it as a multi-class classification problem, classifying sentences into around 38 intents.

The model used depends on how you called the API: `model = DialogTag('distilbert-base-uncased')` loads the fine-tuned weights for the model name you provided. Since it was a multi-class classification problem, I used CrossEntropyLoss as my loss function between the ground-truth intent and the predicted intent.
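For reference, the CrossEntropyLoss mentioned here reduces, for a single example, to the negative log-probability the model assigns to the ground-truth intent. A minimal pure-Python sketch (PyTorch's `torch.nn.CrossEntropyLoss` applies the same softmax-plus-negative-log-likelihood to raw logits internally; this is an illustration, not the library's training code):

```python
import math

def cross_entropy(logits, target_index):
    """Negative log-probability of the ground-truth class under a softmax.

    Equivalent, for one example, to what torch.nn.CrossEntropyLoss
    computes from raw logits and an integer class label.
    """
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target_index]

# A confident, correct prediction yields a small loss...
low = cross_entropy([5.0, 0.1, -1.2], target_index=0)
# ...while a confident, wrong prediction yields a large one.
high = cross_entropy([5.0, 0.1, -1.2], target_index=2)
```

During training, minimizing this loss pushes the logit of the ground-truth intent above the other ~37 intents' logits.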