theainerd opened this issue 2 years ago
It is as simple as switching out the model name in the following script with the model name on Hugging Face, assuming the model was pushed to the Hub. If the SentenceTransformer is in a local folder, you can give the path to that folder instead. If the model is on the Hugging Face Hub but private, you may provide use_auth_token to authenticate yourself. See the from_pretrained docs in the huggingface_hub documentation for more information on this from_pretrained call.
from setfit import SetFitModel
model_id = "my_organisation/my_model"
model = SetFitModel.from_pretrained(model_id)
Beyond that, the pipeline is the same as shown in the README.md and the example notebooks. The text-classification.ipynb notebook may also help. The following cell can be modified so that model_id is the name of your custom model on the Hugging Face Hub, or the path to your local Hugging Face model.
dataset_id = "sst2"
model_id = "sentence-transformers/paraphrase-mpnet-base-v2"
Thanks for the quick reply.
> If the SentenceTransformer is in a local folder, then you can give the path to that folder too.
I am trying to load it from a local folder:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('path/to/the/folder')
This is the error I am getting when running the trainer.train() command:
AttributeError: 'SentenceTransformer' object has no attribute 'model_head'
I can understand this one: something must be wrong, since I am mixing a plain SentenceTransformer with the SetFit trainer. If I instead try to load it using the SetFit method:
from setfit import SetFitModel
model = SetFitModel.from_pretrained("/path/to/the/folder")
This is the error I am getting:
HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': ''/content/drive/MyDrive/Ashwani/CapOne FT/finetuned_models_capone_golden_dataset/bft_capone_fulldataset_golden_dataset_model''. Use `repo_type` argument if needed.
Am I missing a specific parameter here?
I believe the latter code block with SetFitModel.from_pretrained is the correct approach. I see now that others have experienced similar issues: there was an attempted fix for this, but the tests for that PR (#114) only cover some of the test cases, i.e. not your scenario, where your model was likely saved with model.save_pretrained where model is a SentenceTransformer rather than a SetFitModel. Perhaps this bug still exists.
@theainerd You just need another model to play the role of the head. Going directly through the trainer won't work, because it expects a SetFitModel instance, not a SentenceTransformer one.
The way I do it is:
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression
from setfit import SetFitModel

st = SentenceTransformer('some_model')  # this is your local model or a generic sentence transformer
setfit_model = SetFitModel(model_body=st, model_head=LogisticRegression())  # logistic regression from sklearn
You can attach other model heads as long as they implement the standard sklearn methods. Then you can instantiate a trainer and use .train() as usual.
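To make "standard sklearn methods" concrete, any object exposing a fit/predict interface over embedding arrays should be accepted as a head. The class below is purely my own illustration (it is not part of setfit or sklearn), sketching the minimal duck-typed shape such a head needs:

```python
import numpy as np

class NearestCentroidHead:
    """Minimal duck-typed head: implements fit/predict like an sklearn estimator."""

    def fit(self, embeddings, labels):
        labels = np.asarray(labels)
        self.classes_ = np.unique(labels)
        # one centroid per class, averaged over that class's sentence embeddings
        self.centroids_ = np.stack(
            [embeddings[labels == c].mean(axis=0) for c in self.classes_]
        )
        return self

    def predict(self, embeddings):
        # assign each embedding to the class with the nearest centroid
        dists = np.linalg.norm(
            embeddings[:, None, :] - self.centroids_[None, :, :], axis=-1
        )
        return self.classes_[dists.argmin(axis=1)]
```

An instance could then be plugged in as model_head=NearestCentroidHead() when constructing the SetFitModel, in the same way as the LogisticRegression head above.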
Thanks @kgourgou,
So if my problem is a multiclass classification problem, the only change is to plug in a different head instead of logistic regression?
@kgourgou @tomaarsen I was able to resolve the problem and use the pretrained transformer as you mentioned, but it feels like we are unable to train the head for long enough.
from sentence_transformers import SentenceTransformer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC
from setfit import SetFitModel

model = SentenceTransformer('/path/to/the/folder')  # this is your local model or a generic sentence transformer
setfit_model = SetFitModel(model_body=model, model_head=OneVsRestClassifier(LinearSVC(random_state=0)))
trainer.train()
I get the following warning, and the accuracy is very bad.
/usr/local/lib/python3.7/dist-packages/sklearn/svm/_base.py:1208: ConvergenceWarning: Liblinear failed to converge, increase the number of iterations.
ConvergenceWarning,
I tried to freeze the model body and train the head for more epochs, but I guess that is not supported with a scikit-learn head.
@theainerd
Perhaps you'll find more luck with increasing the number of iterations for the LinearSVC?
model = SentenceTransformer(
"/path/to/the/folder"
)
setfit_model = SetFitModel(
model_body=model,
model_head=OneVsRestClassifier(LinearSVC(random_state=0, max_iter=5000)), # <-- raised from LinearSVC's default max_iter of 1000
)
trainer.train()
> I tried to freeze the model body and train the head for more epochs, but I guess that is not supported with a scikit-learn head.
How did you pick the epochs for the model head? 🤔
I may be wrong, but I think all the training options you pass to the trainer, e.g. num_epochs, num_iterations, etc., only apply to the body, not the head. The head runs .fit() with whatever default options it has. You can override the defaults the way Tom describes in https://github.com/huggingface/setfit/issues/192#issuecomment-1321748737
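In other words, any head hyperparameter tuning happens on the sklearn estimator itself before it is handed to SetFitModel. A sketch of what that can look like (the hyperparameter values here are illustrative, not recommendations):

```python
from sklearn.linear_model import LogisticRegression

# Configure the head estimator up front; SetFit simply calls .fit() on
# whatever estimator you hand it, so these settings take effect then.
head = LogisticRegression(max_iter=2000, C=0.5)

# then, as above: setfit_model = SetFitModel(model_body=st, model_head=head)
```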
@kgourgou @tomaarsen
Thanks for your input. I was able to get an improvement when I set max_iter = 500, and I also tried other values. Results improved, but the ConvergenceWarning remains, and the accuracy varies a lot for different max_iter values.
I have 20 classes with 16 samples each.
Is it that the SVM head is unable to learn, or unable to converge, with a custom sentence transformer? I am experimenting with multi-label classification as well and will share some findings for that too.
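For the multi-label case, one option (an assumption on my part, not something confirmed in this thread) is to keep a OneVsRestClassifier head but fit it on multi-hot label vectors, since sklearn's OneVsRestClassifier accepts a label indicator matrix. A self-contained sketch with toy stand-ins for the sentence embeddings:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# toy stand-ins for sentence embeddings and multi-hot label vectors
embeddings = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]])
labels = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])  # two labels, multi-hot

# one binary classifier per label column
head = OneVsRestClassifier(LogisticRegression())
head.fit(embeddings, labels)
preds = head.predict(embeddings)  # shape (n_samples, n_labels), entries 0/1
```

Such a head would be attached via model_head=OneVsRestClassifier(...) exactly as in the multiclass snippets above.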
@Bourhano sounds like you want to override the model body, not the model head? The model head is the logistic regression and the body is the sentence transformer (ST).
I could imagine that you could pass any model_body, as long as it implements the methods required by SetFitModel and SetFitTrainer.
@kgourgou Well, I actually needed to modify the head, but in order to do so I have to build the model with the SetFitModel() constructor rather than with SetFitModel.from_pretrained(). That is why I am obliged to first load the model_body and then the model_head.
Anyway, the problem is solved now; it turns out I already had access to the sentence_transformers library. Thanks anyway.
Hello team,
Presently we are using models that are hosted on Hugging Face. I have a custom-trained sentence transformer. How can I use a custom-trained Hugging Face model in the present pipeline?