huggingface / setfit

Efficient few-shot learning with Sentence Transformers
https://hf.co/docs/setfit
Apache License 2.0
2.24k stars, 221 forks

How to use a custom Sentence Transformer pretrained model #192

Open theainerd opened 2 years ago

theainerd commented 2 years ago

Hello team,

At present we are using models that are available on Hugging Face. I have a custom-trained Sentence Transformer. How can I use a custom-trained Hugging Face model in the current pipeline?

tomaarsen commented 2 years ago

It is as simple as switching out the model name from the following script with the model name on Hugging Face, assuming that the model was pushed to Hugging Face. If the SentenceTransformer is in a local folder, then you can give the path to that folder too. If instead the model is on the Hugging Face hub, but private, then you may provide use_auth_token to authenticate yourself. See the from_pretrained docs from the huggingface_hub documentation for some more information on this from_pretrained call.

from setfit import SetFitModel

model_id = "my_organisation/my_model"
model = SetFitModel.from_pretrained(model_id)

Beyond that, the pipeline is the same as shown in the README.md and the example notebooks.

Furthermore, the text-classification.ipynb notebook may help explain further. The following cell can be modified so that model_id is the name of your custom Hugging Face model on the Hub or the path to your local Hugging Face model.

dataset_id = "sst2"
model_id = "sentence-transformers/paraphrase-mpnet-base-v2"

theainerd commented 2 years ago

Thanks for the quick reply. You mentioned that if the SentenceTransformer is in a local folder, I can give the path to that folder. I am trying to load it from a local folder.

from sentence_transformers import SentenceTransformer
model = SentenceTransformer('path/to/the/folder')

This is the error I am getting when running the trainer.train() command.

AttributeError: 'SentenceTransformer' object has no attribute 'model_head'

I can understand this one: something must be wrong because I am mixing the sentence-transformers model with the SetFit trainer.

If I instead try to load it using SetFitModel.from_pretrained:

from setfit import SetFitModel
model = SetFitModel.from_pretrained("/path/to/the/folder")

This is the error I am getting:

HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': ''/content/drive/MyDrive/Ashwani/CapOne FT/finetuned_models_capone_golden_dataset/bft_capone_fulldataset_golden_dataset_model''. Use `repo_type` argument if needed.

Am I missing a specific parameter here?

tomaarsen commented 2 years ago

I believe the latter code block with SetFitModel.from_pretrained is the correct approach. I see now that others have experienced similar issues:

There was an attempted fix for this, but the tests for that PR (#114) only cover some of the test cases, i.e. not your scenario where your model is likely saved with model.save_pretrained where model is a SentenceTransformer rather than a SetFitModel. Perhaps this bug still exists.
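One way to tell these situations apart before loading is to inspect the folder first. This is a minimal sketch using only the standard library; `describe_model_folder` is a hypothetical helper, and the file name `model_head.pkl` is an assumption about how SetFit stores a fitted head, so check it against your installed version.

```python
import os

# Assumption: SetFit saves its fitted classification head under this file name.
ASSUMED_HEAD_FILE = "model_head.pkl"


def describe_model_folder(path: str) -> str:
    """Classify a local path before handing it to a loader (hypothetical helper)."""
    if not os.path.isdir(path):
        # A path that is not an existing directory may be interpreted
        # as a Hub repo id instead of a local folder.
        return "missing"
    if ASSUMED_HEAD_FILE in os.listdir(path):
        # Looks like a folder written by a SetFitModel (body plus head).
        return "setfit"
    # Looks like a plain SentenceTransformer folder (body only).
    return "body_only"


print(describe_model_folder("/path/to/the/folder"))
```

If the result is "body_only", loading the body with SentenceTransformer and attaching a head yourself, as discussed further down the thread, may be the more reliable route.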

kgourgou commented 1 year ago

@theainerd

You just need another model to play the role of the head. Going directly through the trainer won't work because it expects a SetFitModel instance, not a SentenceTransformer one.

The way I do it is

st = SentenceTransformer('some_model') # this is your local model or generic sentence transformer
setfit_model = SetFitModel(model_body = st, model_head = LogisticRegression()) # Logistic regression from sklearn

You can attach other model heads as long as they implement standard sklearn methods.

Then you can instantiate a trainer and use .train() as usual.
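To illustrate what "standard sklearn methods" could mean for a head, here is a minimal sketch of a toy head written with NumPy only. `NearestCentroidHead` is a made-up class for illustration, and the assumption is that SetFit calls `fit` and `predict` on the head (and `predict_proba` if you ask for probabilities); verify this against the SetFit version you use. The embeddings are random stand-ins, not real sentence embeddings.

```python
import numpy as np


class NearestCentroidHead:
    """Toy head: predicts the class whose mean embedding is closest."""

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.classes_ = np.unique(y)
        # One centroid per class, shape (n_classes, n_features)
        self.centroids_ = np.stack([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def _distances(self, X):
        X = np.asarray(X, dtype=float)
        # Pairwise Euclidean distances, shape (n_samples, n_classes)
        return np.linalg.norm(X[:, None, :] - self.centroids_[None, :, :], axis=2)

    def predict(self, X):
        return self.classes_[self._distances(X).argmin(axis=1)]

    def predict_proba(self, X):
        # Softmax over negative distances: a heuristic score, not a calibrated probability
        logits = -self._distances(X)
        logits -= logits.max(axis=1, keepdims=True)
        exp = np.exp(logits)
        return exp / exp.sum(axis=1, keepdims=True)


# Stand-in for sentence embeddings: two well-separated clusters
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(0.0, 0.1, (8, 4)), rng.normal(1.0, 0.1, (8, 4))])
y_train = np.array([0] * 8 + [1] * 8)

head = NearestCentroidHead().fit(X_train, y_train)
queries = np.array([[0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]])
preds = head.predict(queries)
probs = head.predict_proba(queries)
print(preds, probs.sum(axis=1))
```

An object like this would be passed as model_head in place of LogisticRegression(); in practice an existing scikit-learn estimator is usually the simpler choice.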

theainerd commented 1 year ago

Thanks @kgourgou,

So if my problem is a multiclass classification problem, the only change needed is to use a different head instead of logistic regression.
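For what it's worth, scikit-learn's LogisticRegression accepts more than two classes directly, so a different head may not be strictly required for multiclass problems. A minimal sketch with random stand-in embeddings (the data, shapes, and `max_iter` value are made up for illustration):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Stand-in embeddings: 3 classes x 16 samples, each class shifted along its own axis
rng = np.random.default_rng(0)
y = np.repeat(np.arange(3), 16)
X = rng.normal(scale=0.1, size=(48, 8)) + np.eye(3, 8)[y]

# The same estimator class used for the binary case, fitted on three labels
head = LogisticRegression(max_iter=1000).fit(X, y)

n_classes = len(head.classes_)
train_acc = head.score(X, y)
print(n_classes, train_acc)
```

Swapping in another head, as in the following comments, remains an option when a different decision function is wanted.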

theainerd commented 1 year ago

@kgourgou @tomaarsen

I was able to resolve the problem and use the pretrained transformer as you mentioned. But it feels like we are unable to train the head for long enough.

model = SentenceTransformer('/path/to/the/folder') # this is your local model or generic sentence transformer
setfit_model = SetFitModel(model_body = model, model_head = OneVsRestClassifier(LinearSVC(random_state=0)))
trainer.train()

I get the following warning, and the accuracy is very poor.

/usr/local/lib/python3.7/dist-packages/sklearn/svm/_base.py:1208: ConvergenceWarning: Liblinear failed to converge, increase the number of iterations.
  ConvergenceWarning,

I tried to freeze the model body and train the head for more epochs, but I guess that is not supported with a scikit-learn head.

tomaarsen commented 1 year ago

@theainerd

Perhaps you'll find more luck with increasing the number of iterations for the LinearSVC?

model = SentenceTransformer(
    "/path/to/the/folder"
)
setfit_model = SetFitModel(
    model_body=model,
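    # max_iter is set explicitly here. Note: scikit-learn's LinearSVC default is 1000
    # (100 is the LogisticRegression default), so use a value above 1000 to train longer.
    model_head=OneVsRestClassifier(LinearSVC(random_state=0, max_iter=500)),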
)
trainer.train()

kgourgou commented 1 year ago

> I tried to freeze the model and train the head for longer epochs but i guess that not supported with scikit-learn head.

How did you pick the epochs for the model head? 🤔

I may be wrong, but I think all options re: training that you pass to the trainer, e.g., num_epochs, num_iterations, etc., are only about the body and not the head. The head runs .fit() with whatever options it has as default. You can override the defaults the way Tom describes in https://github.com/huggingface/setfit/issues/192#issuecomment-1321748737
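To make the point about the head's own options concrete, here is a minimal sketch that uses scikit-learn only, with no SetFit involved. It reads the default `max_iter` from the installed LinearSVC rather than relying on memory, then fits the same head with different `max_iter` values and records whether a ConvergenceWarning was raised. The random "embeddings" (20 classes with 16 samples each, as in the thread) and the helper `fit_head` are assumptions for illustration; real embeddings may behave differently.

```python
import warnings

import numpy as np
from sklearn.exceptions import ConvergenceWarning
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

# Check the default on the installed version instead of guessing
default_max_iter = LinearSVC().max_iter
print("LinearSVC default max_iter:", default_max_iter)


def fit_head(X, y, max_iter):
    """Fit the head and report whether liblinear raised a ConvergenceWarning."""
    head = OneVsRestClassifier(LinearSVC(random_state=0, max_iter=max_iter))
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        head.fit(X, y)
    converged = not any(issubclass(w.category, ConvergenceWarning) for w in caught)
    return head, converged


# Random stand-in for embeddings: 20 classes x 16 samples
rng = np.random.default_rng(0)
y = np.repeat(np.arange(20), 16)
X = rng.normal(size=(320, 64)) + 3.0 * np.eye(20, 64)[y]

results = {}
for max_iter in (5, 500, 5000):
    head, converged = fit_head(X, y, max_iter)
    results[max_iter] = (converged, head.score(X, y))
    print(max_iter, results[max_iter])
```

The iteration budget here is a constructor argument of the head, which fits the description above that trainer options such as num_epochs do not reach it.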

theainerd commented 1 year ago

@kgourgou

Thanks for your input. I saw an improvement when I set max_iter = 500, and I also tried other values. Results improved, but the ConvergenceWarning remains and the accuracy varies a lot across different max_iter values.

I have 20 classes with 16 samples each.

Is it because the SVM head is unable to learn, or fails to converge, with a custom Sentence Transformer? I am experimenting with multi-label classification as well and will share some findings for that too.

kgourgou commented 1 year ago

@Bourhano sounds like you want to override the model body, not the model head? The model head is the logistic regression and the body is the sentence transformer (ST).

I could imagine that you could push any model_body as long as it implements the methods required by SetFitModel and SetFitTrainer.

Bourhano commented 1 year ago

@kgourgou Well I actually needed to modify the head, but in order to do so I have to load the SetFitModel using the SetFitModel() constructor and not using the function SetFitModel.from_pretrained(). That is why I am obliged to first load the model_body then the model_head.

Anyway problem solved now, it turns out I already have access to the library sentence_transformers. Thanks anyway.