Closed arrmansa closed 2 years ago
Fixes the issue described here, where using normalize_embeddings=True returned an incorrect result.
embeddings = np.linalg.norm(embeddings, ord=2, axis=1, keepdims=False)
was changed to
norms = np.linalg.norm(embeddings, ord=2, axis=1, keepdims=True)
embeddings = embeddings / np.where(norms < 1e-12, 1e-12, norms)
so that the output is the same as that of the original SentenceTransformer.py.
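A minimal NumPy sketch of the difference between the two versions, using a made-up 2x3 batch in place of real model output. The helper names `normalize_old` and `normalize_new` are hypothetical and only wrap the lines quoted in this PR; they are not part of the library.

```python
import numpy as np

def normalize_old(embeddings):
    # Pre-fix line: returns the L2 norm of each row, not a normalized vector.
    return np.linalg.norm(embeddings, ord=2, axis=1, keepdims=False)

def normalize_new(embeddings):
    # Post-fix lines: divide each row by its L2 norm, clamping tiny norms
    # to 1e-12 so an all-zero row does not cause a division by zero.
    norms = np.linalg.norm(embeddings, ord=2, axis=1, keepdims=True)
    return embeddings / np.where(norms < 1e-12, 1e-12, norms)

# Hypothetical batch: one ordinary row and one all-zero row.
batch = np.array([[3.0, 4.0, 0.0], [0.0, 0.0, 0.0]])

old = normalize_old(batch)  # shape (2,): one scalar per row
new = normalize_new(batch)  # shape (2, 3): rows rescaled, zero row left as zeros

print(old.shape, new.shape)
print(new)
```

The old line discards the embedding dimension entirely, while the new lines keep the input shape and only rescale each row.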
from fast_sentence_transformers import FastSentenceTransformer as SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2", device="cpu", quantize=False)
print(encoder.encode("Hello hello, hey, hello hello", normalize_embeddings=True).shape)
output before this PR
()
output after this PR
(384,)
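The `()` shape is what one would expect if the old line collapsed the embedding dimension. The sketch below uses a random array as a stand-in for the model output. It assumes, and this is not shown in the PR, that a single-sentence call works on a batch of shape `(1, 384)` and then returns its first row.

```python
import numpy as np

rng = np.random.default_rng(0)
batch = rng.normal(size=(1, 384))  # stand-in for one sentence's embedding

# Old line: the axis=1 reduction leaves a single scalar per sentence.
before = np.linalg.norm(batch, ord=2, axis=1, keepdims=False)[0]

# New lines: keepdims=True keeps a (1, 1) column that broadcasts over the row.
norms = np.linalg.norm(batch, ord=2, axis=1, keepdims=True)
after = (batch / np.where(norms < 1e-12, 1e-12, norms))[0]

print(np.shape(before))  # ()
print(after.shape)       # (384,)
```

Under that assumption, the value returned before the fix is the norm of the embedding rather than the embedding itself, which matches the empty shape reported above.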