ohmeow / blurr

A library that integrates huggingface transformers with the world of fastai, giving fastai devs everything they need to train, evaluate, and deploy transformer specific models.
Apache License 2.0
289 stars 34 forks

Multi-Label Classification Problem #83

Open satish860 opened 1 year ago

satish860 commented 1 year ago

Hello, I'm new to DL and have just begun the Fastai course for 2022.

I'm working on a multi-label classification problem and downloaded the Emotions dataset from Hugging Face. As shown below, this dataset uses an array to represent the labels for each example. [image]

I've looked at the example in the repository. There, you converted the data to one-hot encoded values. [image]

You then used ColReader to get the y variable.

So, my question is: do I have to use the same one-hot structure for my problem as well, or can I use a different one, such as the array format above?

ohmeow commented 1 year ago

Yeah, if it's multi-label, you need to:

  1. One-hot encode (OHE) the targets as 1 or 0
  2. When you create the HF tokenizer, tell it the number of classes that need to be predicted
  3. Use binary cross-entropy as your loss function (e.g. `BCEWithLogitsLossFlat` in fastai)
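
For step 1, a minimal sketch of turning array-style labels into one-hot columns might look like the following. It uses plain Python only; the label count, the `label_0`-style column names, and the sample rows are hypothetical and not taken from the Emotions dataset or from blurr. The same label count is the number you would pass along in step 2 (in transformers it is normally set as `num_labels` on the model config).

```python
# Hypothetical label count; replace with the real number of classes.
NUM_LABELS = 6

def one_hot(label_ids, num_labels):
    """Turn a list of label ids, e.g. [1, 5], into a 0/1 vector of length num_labels."""
    vec = [0] * num_labels
    for i in label_ids:
        if not 0 <= i < num_labels:
            raise ValueError(f"label id {i} out of range for {num_labels} labels")
        vec[i] = 1
    return vec

# Made-up rows in the "array of label ids" layout described in the question.
rows = [
    {"text": "so happy and surprised", "labels": [1, 5]},
    {"text": "nothing to report", "labels": []},
]

# One column per label: a layout that a column-based reader can pick up as y.
label_cols = [f"label_{i}" for i in range(NUM_LABELS)]
encoded = [
    {"text": r["text"], **dict(zip(label_cols, one_hot(r["labels"], NUM_LABELS)))}
    for r in rows
]

print(encoded[0])
```

With real data you would more likely do this on a pandas DataFrame and then pass the list of label columns to ColReader, as in the linked example.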

I have an example here: https://ohmeow.github.io/blurr/text-examples-multilabel.html
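
To illustrate why step 3 calls for binary cross-entropy, here is a small pure-Python sketch of the loss computed from raw logits. It is only meant to show the arithmetic; the function name and inputs are hypothetical, and in practice you would use a library implementation such as `torch.nn.BCEWithLogitsLoss` or fastai's `BCEWithLogitsLossFlat`.

```python
import math

def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over labels, computed from raw logits."""
    total = 0.0
    for x, y in zip(logits, targets):
        # Numerically stable form of
        # -[y * log(sigmoid(x)) + (1 - y) * log(1 - sigmoid(x))]
        total += max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
    return total / len(logits)

# One example with three labels; labels 0 and 2 are both active.
targets = [1, 0, 1]
good_logits = [4.0, -4.0, 4.0]   # agrees with the targets
bad_logits = [-4.0, 4.0, -4.0]   # disagrees with the targets

print(bce_with_logits(good_logits, targets))
print(bce_with_logits(bad_logits, targets))
```

Each label is scored independently with its own sigmoid, so any number of labels can be active for one example. That is what makes this loss fit one-hot (strictly, multi-hot) targets, where a softmax-based cross-entropy would force the labels to compete.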