ohmeow / blurr

A library that integrates huggingface transformers with the world of fastai, giving fastai devs everything they need to train, evaluate, and deploy transformer specific models.
https://ohmeow.github.io/blurr
Apache License 2.0

Multi-Label Classification Problem #83

Open satish860 opened 1 year ago

satish860 commented 1 year ago

Hello, I'm new to DL and have just begun the Fastai course for 2022.

I'm working on a multi-label classification problem and downloaded the Emotions dataset from Hugging Face. As shown below, this dataset stores each example's labels as an array. [image]

I've looked at the example in the repository. There, you converted the data to one-hot encoded values: [image]

and then used `ColReader` to get the y variable.

So, my question is: Do I have to use the same structure for my problem as well, or is there a different one I can use?

ohmeow commented 1 year ago

Yeah, if it's multi-label, you need to:

  1. One-hot encode (OHE) the targets as 1 or 0
  2. When you create the HF objects, tell the model config the number of classes that need to be predicted (`num_labels`)
  3. Use binary cross-entropy as your loss function (in fastai, `BCEWithLogitsLossFlat`)

I have an example here: https://ohmeow.github.io/blurr/text-examples-multilabel.html
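
For anyone who wants to see what steps 1 and 3 amount to without any libraries, here is a rough, dependency-free sketch. It is not blurr's code. The label names, the sample rows, and the helper names `multi_hot` and `bce_with_logits` are made up for illustration, and it assumes the dataset stores labels as lists of class indices.

```python
import math

# Hypothetical label names; the real dataset's classes will differ.
LABELS = ["anger", "joy", "sadness"]


def multi_hot(label_ids, n_classes):
    """Turn a list of class indices, e.g. [0, 2], into a 0/1 vector."""
    vec = [0.0] * n_classes
    for i in label_ids:
        vec[i] = 1.0
    return vec


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy, treating each class as its own yes/no problem."""
    total = 0.0
    for z, t in zip(logits, targets):
        # Numerically stable form of -[t*log(sigmoid(z)) + (1-t)*log(1-sigmoid(z))]
        total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


# Hypothetical rows: each row lists the indices of the labels that apply.
rows = [[0, 2], [1], []]
targets = [multi_hot(r, len(LABELS)) for r in rows]

# Made-up raw model outputs (one logit per class) for the first row.
loss = bce_with_logits([2.0, -1.0, 0.5], targets[0])
print(targets)
print(loss)
```

In practice you would write each 0/1 value to its own DataFrame column so the column reader can pick them up as the targets, and let the framework's loss do the work rather than a hand-rolled one; the sketch is only meant to show why the targets must be 0/1 vectors with one slot per class.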