huggingface / setfit

Efficient few-shot learning with Sentence Transformers
https://hf.co/docs/setfit
Apache License 2.0

SetFit for Multilabel Text Classification fails to run #101

Closed: hussainnawab closed this issue 2 years ago

hussainnawab commented 2 years ago

SetFit for multilabel text classification fails to run: calling trainer.train() raises an IndexError.

Error thrown:

IndexError                                Traceback (most recent call last)
Cell In [22], line 1
----> 1 trainer.train()

File .../site-packages/setfit/trainer.py:161, in SetFitTrainer.train(self)
    158 train_examples = []
    160 for _ in range(self.num_iterations):
--> 161     train_examples = sentence_pairs_generation(np.array(x_train), np.array(y_train), train_examples)
    163 train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=self.batch_size)
    164 train_loss = self.loss_class(self.model.model_body)

File .../site-packages/setfit/modeling.py:214, in sentence_pairs_generation(sentences, labels, pairs)
    212 current_sentence = sentences[first_idx]
    213 label = labels[first_idx]
--> 214 second_idx = np.random.choice(idx[np.where(num_classes == label)[0][0]])
    215 positive_sentence = sentences[second_idx]
    216 # Prepare a positive pair and update the sentences and labels
    217 # lists, respectively

IndexError: index 0 is out of bounds for axis 0 with size 0

vijin-freelancing commented 2 years ago

I was getting the same error while running a hyperparameter search for multilabel classification. @hussainnawab, how did you solve it? Thanks in advance.

hussainnawab commented 2 years ago

@vijin-freelancing Check your setfit version. In my case, I was using an older version of the library than the one used in the tutorial.
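
One way to check which version is installed, as a minimal sketch using only the standard library. The helper name installed_version is hypothetical and is not part of setfit; the thread does not say which version the tutorial expects, so compare the printed value against the tutorial yourself.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed setfit version, or None when setfit is not installed.
print("setfit:", installed_version("setfit"))
```

If the printed version is older than the one the tutorial was written for, upgrading the package is the fix described in the comment above.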

micycle1 commented 1 year ago

I get this error on 0.6.0 when I DON'T specify a multi_target_strategy.
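
The traceback above is consistent with multilabel targets reaching pair-generation code written for single-label data. The sketch below reproduces the failing expression from modeling.py line 214 using NumPy alone. It rests on two assumptions that the traceback does not show: that num_classes is built with np.unique(labels), and that the labels are multi-hot vectors. The label data is made up for illustration.

```python
import numpy as np

# Hypothetical multilabel targets: one row per sentence, one column per label.
labels = np.array([[1, 0], [0, 1], [1, 1]])

# Assumption: the class list is derived from the unique label values.
num_classes = np.unique(labels)  # array([0, 1])

# With multilabel data, each label is a vector rather than a single class id.
label = labels[0]  # array([1, 0])

# Elementwise comparison: [0, 1] == [1, 0] is [False, False], so no index matches.
matches = np.where(num_classes == label)[0]

try:
    matches[0]  # same indexing pattern as np.where(num_classes == label)[0][0]
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(error_message)
```

If this is the cause, passing multi_target_strategy (for example "one-vs-rest") to SetFitModel.from_pretrained should avoid the single-label path, which matches the observation that the error appears only when the argument is omitted.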