timoschick / pet

This repository contains the code for "Exploiting Cloze Questions for Few-Shot Text Classification and Natural Language Inference"
https://arxiv.org/abs/2001.07676
Apache License 2.0

PET for multi label text classification #13

Closed: ankushkundaliya closed this issue 3 years ago

ankushkundaliya commented 3 years ago

Hi, congratulations on this great work. I have a question about how to use PET for multi-label classification tasks. Do I need to change the loss function used in the current training code?

timoschick commented 3 years ago

Hi @ankushkundaliya, thanks! Currently, PET does not support multi-label classification. You could try to rephrase the task as multiple single-label classifications. For example, let's say you want to know which topics from a given list (say, ["politics", "sports", "economics"]) a text x covers. You could reformulate this as three binary classification tasks, each with its own PVP whose pattern asks whether the text is about the given topic, and for all tasks you could use a verbalizer that maps a match (i.e., the text is about the given topic) to Yes (or True) and a no-match to No (or False).
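The reformulation above can be sketched in plain Python. Note that the helper name and the exact pattern wording below are illustrative assumptions, not PET's actual API:

```python
MASK = "[MASK]"  # placeholder for the MLM's mask token

def make_binary_pvp(topic):
    """Build one pattern-verbalizer pair (PVP) for a single topic.

    The pattern turns the input text into a cloze question about that
    topic; the verbalizer maps the two binary labels to tokens the MLM
    can predict at the mask position.
    """
    def pattern(text):
        return f"{text} Question: Is this text about {topic}? Answer: {MASK}."

    verbalizer = {True: "Yes", False: "No"}  # match -> Yes, no-match -> No
    return pattern, verbalizer

# One binary PVP per topic in the label list:
topics = ["politics", "sports", "economics"]
pvps = {t: make_binary_pvp(t) for t in topics}

pattern, verbalizer = pvps["sports"]
print(pattern("The match ended 2-1."))
# The match ended 2-1. Question: Is this text about sports? Answer: [MASK].
```

Each of the three resulting tasks can then be trained with PET's existing single-label machinery, since every task is just binary classification.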

NielsRogge commented 3 years ago

Hi @timoschick,

I am considering using the approach you explained above to turn a multi-label classification problem (with N labels) into N binary classification tasks. However, I wonder whether there's a big difference between fine-tuning a single MLM versus a separate MLM for each pattern.

For example, suppose you have a very large number of labels (such as 500): fine-tuning 500 MLMs is simply not feasible. Is using a single MLM appropriate?

Kind regards,

Niels

timoschick commented 3 years ago

Hi @NielsRogge, we've run some experiments to answer that very question (I can't go into details as the corresponding paper is currently under review), and the short answer is: yes, you can use a single MLM. However, you'll have to modify the source code of this library accordingly, as it does not support using a single MLM out of the box.
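One way to share a single MLM across all N binary tasks is to fold the topic into the pattern and expand each multi-label example into N yes/no cloze instances, so one model sees every task. A minimal sketch of that data expansion (the helper name and pattern wording are illustrative assumptions, not part of this repository):

```python
def expand_multilabel(text, gold_topics, all_topics, mask="[MASK]"):
    """Expand one multi-label example into one cloze instance per topic,
    so a single MLM can be fine-tuned on all N binary tasks at once."""
    instances = []
    for topic in all_topics:
        cloze = f"{text} Question: Is this text about {topic}? Answer: {mask}."
        target = "Yes" if topic in gold_topics else "No"
        instances.append((cloze, target))
    return instances

topics = ["politics", "sports", "economics"]
examples = expand_multilabel("The election results are in.", {"politics"}, topics)
for cloze, target in examples:
    print(target, "|", cloze)
```

At inference time, the same single MLM is queried once per topic, and the probability assigned to "Yes" versus "No" at the mask position decides whether that label applies.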