huggingface / setfit

Efficient few-shot learning with Sentence Transformers
https://hf.co/docs/setfit
Apache License 2.0

How can I train a SetFit model on my Apple M1 Mac GPU? #325

Closed karndeepsingh closed 1 year ago

karndeepsingh commented 1 year ago

Hi, I have been trying to train a SetFit model on an Apple M1 Mac, but I think it is using the CPU for training. I have installed the PyTorch dependencies for the M1 GPU, but SetFit does not seem to use it.

Can you please help me train the SetFit model on the Apple M1 Mac GPU?

Also, if I train on an NVIDIA GPU, do I need to set a device parameter to "cuda" before training the SetFit model, or is that handled automatically?

tomaarsen commented 1 year ago

Hello! It is my understanding that a model is automatically placed on GPU if that is available. However, you can also use model.to("cuda") to be sure that it's correct.

from setfit import SetFitModel, SetFitTrainer

model = SetFitModel.from_pretrained("...")
model.to("cuda")
trainer = SetFitTrainer(
    model=model,
    ...
)

One way to verify whether the model is indeed on GPU is to print the device of the model body:

print(model.model_body.device)

And you can verify that torch is indeed compiled with CUDA through:

import torch
print(torch.cuda.is_available())

Hope this helps a bit,

Tom Aarsen

karndeepsingh commented 1 year ago

Thanks @tomaarsen
This would work on an NVIDIA GPU with CUDA enabled. How can I move the model to the M1 GPU on my Mac?

tomaarsen commented 1 year ago

Oops, my apologies, I believe you would need to use model.to("mps").
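A minimal sketch of picking the device automatically instead of hard-coding "cuda" or "mps". It assumes a PyTorch build recent enough to expose torch.backends.mps; the helper names choose_device and detect_device are hypothetical and not part of SetFit or PyTorch.

```python
def choose_device(cuda_available: bool, mps_available: bool) -> str:
    """Prefer CUDA, then Apple's Metal backend (MPS), then the CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Ask PyTorch which backends are usable on this machine."""
    import torch  # imported lazily so choose_device works without PyTorch

    # torch.backends.mps is missing on older PyTorch builds, hence getattr.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_available = mps_backend is not None and mps_backend.is_available()
    return choose_device(torch.cuda.is_available(), mps_available)


# Usage (needs PyTorch and a loaded SetFit model):
# model.to(detect_device())
```

On an M1 Mac with a working MPS build this should resolve to "mps", and on a machine with neither backend it falls back to "cpu".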

karndeepsingh commented 1 year ago

Thanks @tomaarsen

I have the following questions.

  1. How can we plot a confusion matrix or classification report and get scores for F1, recall and precision?
  2. How can I get confidence scores for the predictions?

Please help me with this.

Thanks

tomaarsen commented 1 year ago

You can evaluate a SetFitModel using SetFitTrainer.evaluate(). This uses the metric provided to the SetFitTrainer, which is either the name of a metric from the evaluate library, like "f1":

# Create trainer
trainer = SetFitTrainer(
    model=model,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    metric="f1",
)

or a function called with two arguments (y_pred, y_test). You can write any function that you'd like here, e.g. one that uses sklearn.metrics.confusion_matrix.

Example script

```python
from datasets import load_dataset
from sentence_transformers.losses import CosineSimilarityLoss
from setfit import SetFitModel, SetFitTrainer, sample_dataset
from sklearn.metrics import confusion_matrix

# Load a dataset from the Hugging Face Hub
dataset = load_dataset("sst2")

# Simulate the few-shot regime by sampling 8 examples per class
train_dataset = sample_dataset(dataset["train"], label_column="label", num_samples=8)
eval_dataset = dataset["validation"]

# Load a SetFit model from Hub
model = SetFitModel.from_pretrained("sentence-transformers/paraphrase-mpnet-base-v2")

def confusion_matrix_metric(y_pred, y_test) -> None:
    matrix = confusion_matrix(y_test, y_pred)
    print(matrix)

# Create trainer
trainer = SetFitTrainer(
    model=model,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    metric=confusion_matrix_metric,
    column_mapping={"sentence": "text", "label": "label"},  # Map dataset columns to text/label expected by trainer
)

# Train and evaluate
trainer.train()
metrics = trainer.evaluate()
```

Which produces

```
Applying column mapping to training dataset
Generating Training Pairs: 100%|████████████████████| 20/20 [00:00<00:00, 9210.15it/s]
***** Running training *****
  Num examples = 640
  Num epochs = 1
  Total optimization steps = 40
  Total train batch size = 16
Iteration: 100%|████████████████████| 40/40 [00:04<00:00, 9.07it/s]
Epoch: 100%|████████████████████| 1/1 [00:04<00:00, 4.41s/it]
Applying column mapping to evaluation dataset
***** Running evaluation *****
[[368  60]
 [ 48 396]]
```
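For the F1, recall and precision part of the question, the same callable-metric idea can return scores instead of printing a matrix. The sketch below is plain Python with no dependencies; the name binary_scores is hypothetical, it assumes binary labels 0 and 1, and it assumes the trainer simply hands back whatever the callable returns. The label lists are rebuilt from the confusion matrix printed above (rows are true labels, columns are predictions) purely for illustration.

```python
def binary_scores(y_pred, y_test) -> dict:
    """Precision, recall, F1 and accuracy for the positive class (label 1)."""
    pairs = list(zip(y_pred, y_test))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)  # true positives
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)  # false positives
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)  # false negatives
    tn = sum(1 for p, t in pairs if p == 0 and t == 0)  # true negatives

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Label lists that reproduce the matrix [[368, 60], [48, 396]]:
# 428 true negatives-or-false-positives, then 444 positives.
y_test = [0] * 428 + [1] * 444
y_pred = [0] * 368 + [1] * 60 + [0] * 48 + [1] * 396

scores = binary_scores(y_pred, y_test)
print({name: round(value, 4) for name, value in scores.items()})
```

For that matrix the F1 score works out to 2 * 396 / (2 * 396 + 60 + 48) = 0.88. Passing metric=binary_scores to the trainer in place of confusion_matrix_metric is the intended use, under the assumptions stated above.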

As for question 2, you can use model.predict_proba to get the probabilities for each of the classes:

>>> model.predict_proba(["This movie was amazing", "That film was bad!"])
tensor([[0.0414, 0.9586],
        [0.9369, 0.0631]], dtype=torch.float64)
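To turn those probabilities into a single confidence score per prediction, one option is to take the highest probability in each row together with its class index. This is a rough sketch in plain Python: top_predictions is a hypothetical helper, and the input values are copied from the output above rather than computed by a model.

```python
def top_predictions(probabilities):
    """Return a (class index, confidence) pair for each row of probabilities."""
    results = []
    for row in probabilities:
        # Index of the largest probability in this row.
        best = max(range(len(row)), key=lambda i: row[i])
        results.append((best, row[best]))
    return results


# Values copied from the predict_proba output above. With a real model,
# calling .tolist() on the returned tensor should give this nested-list shape.
probs = [[0.0414, 0.9586], [0.9369, 0.0631]]

for label, confidence in top_predictions(probs):
    print(f"predicted class {label} with confidence {confidence:.4f}")
```

Here the first sentence is assigned class 1 and the second class 0, each with the larger of its two probabilities as the confidence.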
karndeepsingh commented 1 year ago

Thank you so much @tomaarsen

Just a quick question, @tomaarsen:

Is there a way to retrain the model from a previous checkpoint? For example, I initially trained a classification model with 5 categories for 10 epochs and saved it. In the future I want to add 2 new categories to the previous 5 (7 categories in total). How can I update my previous model with these new categories? Do I need to train the model from scratch on all 7 categories, using all the previous data for the 5 categories together with the new data?

tomaarsen commented 1 year ago

Glad to help! I'll close this for now, as I think this should be solved!