MosheWasserb closed this pull request 8 months ago.
Looks promising at a glance! cc @MoritzLaurer
Looks interesting and good to me. I'd assume this works less well for more complex tasks, and the additional step of distilling from the SetFit model takes more developer time, but overall it's a cool approach for further compressing the model and making inference much more efficient.
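For reference on that distillation step, here is a minimal sketch (not from the notebook; `setfit_model` and `unlabeled_texts` are hypothetical names for a trained SetFit model and raw unlabeled texts) of distilling SetFit predictions into a small scikit-learn MLP head:

```python
from sklearn.neural_network import MLPClassifier

# 1. Use the trained SetFit model as a teacher to pseudo-label unlabeled texts.
pseudo_labels = setfit_model.predict(unlabeled_texts)

# 2. Encode the same texts with the SetFit body (a SentenceTransformer).
embeddings = setfit_model.model_body.encode(unlabeled_texts)

# 3. Train a small MLP student on (embedding, pseudo-label) pairs.
student = MLPClassifier(hidden_layer_sizes=(128,), max_iter=500)
student.fit(embeddings, pseudo_labels)
```

At inference time only the encoder plus the tiny MLP head is needed, which is where the efficiency gain comes from.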
Thanks @MoritzLaurer for the comments. We updated the notebook accordingly. Would you be interested in a joint post/blog or expanding your original blog with the MLP example?
Great! Don't have bandwidth for a joint blog atm unfortunately. Notebook LGTM @tomaarsen
Hi @tomaarsen I think we are good to go and merge into main. Would be great if you could also promote via LinkedIn.
Hi @tomaarsen Could you merge into main?
@MosheWasserb My apologies for the radio silence here; I was very busy with https://github.com/UKPLab/sentence-transformers/releases/tag/v2.6.0. Very impressive performance on this work.
Hi @tomaarsen Sure, no problem. Great work with the binary embeddings. Did you know that for SetFit I was able to compress the 768-dimensional embeddings down to 2 dimensions with no accuracy loss?
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

# Encode the train/eval texts with the SetFit body (a SentenceTransformer)
X_train = model.encode(x_train)
X_eval = model.encode(x_eval)

# Project the 768-dim embeddings down to 2 dimensions
estimator = PCA(n_components=2)
estimator.fit(X_train)
X_train_em = estimator.transform(X_train)
X_eval_em = estimator.transform(X_eval)

# Fit a logistic-regression head on the 2-dim features
sgd = LogisticRegression()
sgd.fit(X_train_em, y_train)
y_pred_eval_sgd = sgd.predict(X_eval_em)
Thank you! PCA remains strong indeed, especially for classification. It doesn't work very well for retrieval, however; there I've had more luck with 1. Matryoshka models and 2. quantization to speed up the comparisons.
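For context, a minimal sketch of the binary quantization mentioned above, using the `quantize_embeddings` helper introduced in sentence-transformers v2.6.0 (model name and sentences are placeholders):

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.quantization import quantize_embeddings

# Placeholder model and corpus; any SentenceTransformer works.
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(["example sentence one", "example sentence two"])

# Quantize float32 embeddings to binary for ~32x smaller storage
# and much faster (Hamming-distance based) comparisons.
binary_embeddings = quantize_embeddings(embeddings, precision="binary")
```

Matryoshka models complement this: their embeddings can simply be truncated to the first dimensions (e.g. `embeddings[:, :256]`) before or after quantization.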
Adding a new notebook that demonstrates zero-cost, zero-time, zero-shot Financial Sentiment Analysis: from GPT-4/Mixtral to MLP128K with SetFit. @tomaarsen Could you also send it to Moritz Laurer for review?