qnguyen3 / chat-with-mlx

An all-in-one LLMs Chat UI for Apple Silicon Mac using MLX Framework.
https://twitter.com/stablequan
MIT License
1.49k stars 134 forks source link

⬆️ Bump sentence-transformers from 2.5.1 to 2.6.0 #75

Closed dependabot[bot] closed 8 months ago

dependabot[bot] commented 8 months ago

Bumps sentence-transformers from 2.5.1 to 2.6.0.

Release notes

Sourced from sentence-transformers's releases.

v2.6.0 - Embedding Quantization, GISTEmbedLoss

This release brings embedding quantization: a way to heavily speed up retrieval & other tasks, and a new powerful loss function: GISTEmbedLoss.

Install this version with

pip install sentence-transformers==2.6.0

Embedding Quantization

Embeddings may be challenging to scale up, which leads to expensive solutions and high latencies. However, there is a new approach to counter this problem; it entails reducing the size of each of the individual values in the embedding: Quantization. Experiments on quantization have shown that we can maintain a large amount of performance while significantly speeding up computation and saving on memory, storage, and costs.

To be specific, using binary quantization may result in retaining 96% of the retrieval performance, while speeding up retrieval by 25x and saving on memory & disk space with 32x. Do not underestimate this approach! Read more about Embedding Quantization in our extensive blogpost.

Binary and Scalar Quantization

Two forms of quantization exist at this time: binary and scalar (int8). These quantize embedding values from float32 into binary and int8, respectively. For Binary quantization, you can use the following snippet:

from sentence_transformers import SentenceTransformer
from sentence_transformers.quantization import quantize_embeddings

1. Load an embedding model

model = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1")

2a. Encode some text using "binary" quantization

binary_embeddings = model.encode( ["I am driving to the lake.", "It is a beautiful day."], precision="binary", )

2b. or, encode some text without quantization & apply quantization afterwards

embeddings = model.encode(["I am driving to the lake.", "It is a beautiful day."]) binary_embeddings = quantize_embeddings(embeddings, precision="binary")

References:

GISTEmbedLoss

GISTEmbedLoss, as introduced in Solatorio (2024), is a guided variant of the more standard in-batch negatives (MultipleNegativesRankingLoss) loss. Both loss functions are provided with a list of (anchor, positive) pairs, but while MultipleNegativesRankingLoss uses anchor_i and positive_i as positive pair and all positive_j with i != j as negative pairs, GISTEmbedLoss uses a second model to guide the in-batch negative sample selection.

This can be very useful, because it is plausible that anchor_i and positive_j are actually quite semantically similar. In this case, GISTEmbedLoss would not consider them a negative pair, while MultipleNegativesRankingLoss would. When finetuning MPNet-base on the AllNLI dataset, these are the Spearman correlation based on cosine similarity using the STS Benchmark dev set (higher is better):

312039399-ef5d4042-a739-41f6-a6ca-eddc7f901411 The blue line is MultipleNegativesRankingLoss, whereas the grey line is GISTEmbedLoss with the small all-MiniLM-L6-v2 as the guide model. Note that all-MiniLM-L6-v2 by itself does not reach 88 Spearman correlation on this dataset, so this is really the effect of two models (mpnet-base and all-MiniLM-L6-v2) reaching a performance that they could not reach separately.

All changes

... (truncated)

Commits
  • a5f7749 Release v2.6.0
  • 13a9f3f [feat] Add binary & scalar embedding quantization support to Sentence Trans...
  • e6af66f Also update return docstring of encode_multi_process (#2548)
  • caaa28d Fix SentenceTransformer encode documentation return type default (numpy vecto...
  • 87f4180 [deprecation] Deprecate save_to_hub in favor of push_to_hub; add safe_s...
  • fc2a2d8 Enable saving modules as pytorch_model.bin (#2542)
  • b9255d9 Add 'get_config_dict' method to GISTEmbedLoss for better model cards (#2543)
  • 465d4f0 Add GISTEmbedLoss (#2535)
  • See full diff in compare view


Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot show ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
dependabot[bot] commented 8 months ago

Superseded by #78.