huggingface / setfit

Efficient few-shot learning with Sentence Transformers
https://hf.co/docs/setfit
Apache License 2.0

Loading a trained SetFit model without setfit? #282

Closed · ZQ-Dev8 closed 9 months ago

ZQ-Dev8 commented 1 year ago

SetFit team, first off, thanks for the awesome library!

I'm running into trouble trying to load and run inference on a trained SetFit model without using SetFitModel.from_pretrained(). Instead, I'd like to load the model using torch, transformers, sentence_transformers, or some combination thereof. Is there a clear-cut example anywhere of how to do this?

Here's my current code, which does not return clean predictions. Thank you in advance for the help. For reference, this was trained as a multiclass classification model with 18 potential classes:


```python
from transformers import AutoTokenizer, AutoModel
import torch
import torch.nn.functional as F

tokenizer = AutoTokenizer.from_pretrained('sentence-transformers/all-MiniLM-L6-v2')
inputs = ['xxx', 'yyy', 'zzz']
encoded_inputs = tokenizer(inputs,
                           padding=True,
                           truncation=True,
                           return_tensors='pt')
model = AutoModel.from_pretrained('/path/to/trained/setfit/model/')
with torch.no_grad():
    preds = model(**encoded_inputs)
preds  # returns raw hidden states, not class predictions
```
tomaarsen commented 1 year ago

Hello!

I suspect this is technically possible, although not particularly convenient. For reference, a SetFit model consists of two parts: a SentenceTransformer body and a torch or sklearn classifier head. Loading the model without SetFit likely starts with AutoModel.from_pretrained("/path/to/trained/setfit/model/"), after which you will only have loaded the transformer that was finetuned as the sentence transformer body. You'll probably have to replace that model's classifier with torch.nn.Sequential() so that the body produces embeddings rather than predictions. Afterwards, you'll have to somehow load the classifier head, which should also be saved in the SetFit model directory or repository, to convert your embeddings into the correct predictions.
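
For illustration, here is a minimal sketch of that approach. It assumes the default sklearn LogisticRegression head, which SetFit typically saves alongside the body as model_head.pkl (the exact filename may differ between setfit versions), and the mean pooling + L2 normalization used by all-MiniLM-L6-v2:

```python
import joblib
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

model_path = "/path/to/trained/setfit/model/"  # SetFit also saves the tokenizer here
tokenizer = AutoTokenizer.from_pretrained(model_path)
body = AutoModel.from_pretrained(model_path)

def mean_pooling(token_embeddings, attention_mask):
    # Average token embeddings, ignoring padding positions.
    mask = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return (token_embeddings * mask).sum(1) / mask.sum(1).clamp(min=1e-9)

inputs = ["xxx", "yyy", "zzz"]
encoded = tokenizer(inputs, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    output = body(**encoded)
embeddings = mean_pooling(output.last_hidden_state, encoded["attention_mask"])
embeddings = F.normalize(embeddings, p=2, dim=1)  # all-MiniLM-L6-v2 normalizes embeddings

# Load the sklearn classifier head saved next to the body (assumed filename).
head = joblib.load(f"{model_path}/model_head.pkl")
preds = head.predict(embeddings.numpy())
print(preds)
```

If the model was trained with the differentiable torch head instead of the sklearn default, the head-loading step would differ.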

That all said, I am struggling to understand why you cannot simply use SetFitModel.from_pretrained(). It is designed specifically to avoid having to do all of this work manually. Perhaps if you elaborate on your reasoning, we'll be able to help you better.

ZQ-Dev8 commented 1 year ago

@tomaarsen thanks for your reply, and apologies for the delayed response. My question is specifically for instances where, for example, you train a model using SetFit and then share it with another person/organization who, for whatever reason, cannot easily install SetFit on their machine, but already has torch & transformers installed.

Since SetFit models are torch models under the hood, I figured there would be a simple way to load them and run inference using more general libraries.

complycontrols commented 1 year ago

@dcruiz01 were you able to find a workaround?

ZQ-Dev8 commented 1 year ago

@complycontrols unfortunately no, though I haven't had much time to look into it. It must be possible, and I get the feeling the solutions I've tried are overly complicated.

maryc-sullivan commented 1 year ago

@dcruiz01 I was able to achieve this by 1) using a PyTorch-only model as the final classification layer and then 2) importing the model with the transformers library.

I had to find a workaround in order to make SetFit compatible with PySpark/SparkNLP (which requires that transformers models be imported with TensorFlow).

```python
import tensorflow as tf
from transformers import RobertaTokenizer, TFRobertaForSequenceClassification, RobertaConfig

MODEL_NAME = '<HuggingFace Location of SetFit Model>'

# Mappings between class indices and label names for your classes.
label2id = {...}  # e.g. {'negative': 0, 'positive': 1}
id2label = {v: k for k, v in label2id.items()}

tokenizer = RobertaTokenizer.from_pretrained(MODEL_NAME)
config = RobertaConfig.from_pretrained(MODEL_NAME, label2id=label2id, id2label=id2label)
# from_pt=True converts the saved PyTorch weights to TensorFlow on load.
model = TFRobertaForSequenceClassification.from_pretrained(MODEL_NAME, config=config, from_pt=True)

inputs = tokenizer("I love my dog!", return_tensors="tf")
logits = model(**inputs).logits
predicted_class_id = int(tf.math.argmax(logits, axis=-1)[0])
model.config.id2label[predicted_class_id]
```
n-splv commented 6 months ago

@maryc-sullivan's code doesn't use the model head that was pre-trained with SetFit, which, I believe, was the author's question. Saving a SetFit model so that it is fully compatible with the Transformers API would definitely require some tinkering. It's a shame the authors didn't consider the benefits of this: with all its greatness, SetFit will never be able to keep up with every change in a project as huge and rapidly evolving as Transformers, so as a developer you will either lack some new features or be forced to solve compatibility problems on your own.
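
As one example of that tinkering (a hedged sketch, not an official API): the default sklearn LogisticRegression head is just a linear layer, so its coefficients can be folded into a plain torch module, after which inference needs only torch and transformers. This again assumes mean pooling with L2 normalization and a multiclass head saved as model_head.pkl:

```python
import joblib
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

class SetFitTorchOnly(nn.Module):
    """Transformer body plus the SetFit logistic-regression head as a linear layer."""

    def __init__(self, model_path):
        super().__init__()
        self.body = AutoModel.from_pretrained(model_path)
        head = joblib.load(f"{model_path}/model_head.pkl")  # assumed sklearn head
        # For a multiclass head, coef_ has shape (n_classes, hidden);
        # a binary head has a single row and uses a sigmoid instead.
        n_classes, hidden = head.coef_.shape
        self.linear = nn.Linear(hidden, n_classes)
        with torch.no_grad():
            self.linear.weight.copy_(torch.tensor(head.coef_, dtype=torch.float32))
            self.linear.bias.copy_(torch.tensor(head.intercept_, dtype=torch.float32))

    def forward(self, input_ids, attention_mask):
        token_emb = self.body(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        mask = attention_mask.unsqueeze(-1).float()
        emb = (token_emb * mask).sum(1) / mask.sum(1).clamp(min=1e-9)  # mean pooling
        emb = F.normalize(emb, p=2, dim=1)
        return self.linear(emb)  # logits; argmax gives class ids

path = "/path/to/trained/setfit/model/"
tokenizer = AutoTokenizer.from_pretrained(path)
model = SetFitTorchOnly(path).eval()
batch = tokenizer(["I love my dog!"], padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    print(model(batch["input_ids"], batch["attention_mask"]).argmax(-1))
```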