Closed guoxiangke closed 2 months ago
We are roughly using the Python code below, adapted from this article: https://upstash.com/blog/indexing-wikipedia. Thanks.

```python
import sentence_transformers

transformer = sentence_transformers.SentenceTransformer(
    "BAAI/bge-m3",
    device="cuda",
    revision="babcf60cae0a1f438d7ade582983d4ba462303c2",
)
transformer.max_seq_length = 1024

embeddings = transformer.encode(
    sentences=[],  # batch of inputs
    show_progress_bar=False,
    normalize_embeddings=True,
    convert_to_numpy=True,
    convert_to_tensor=False,
)
```
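For context, `normalize_embeddings=True` L2-normalizes each embedding so that a plain dot product between two vectors equals their cosine similarity. A minimal sketch of the equivalent post-processing in NumPy (the array values and the 4-dimensional shape are made up for illustration; real bge-m3 dense vectors are 1024-dimensional):

```python
import numpy as np

# Hypothetical raw embeddings: 2 vectors of dimension 4, values invented.
raw = np.array([[3.0, 4.0, 0.0, 0.0],
                [1.0, 1.0, 1.0, 1.0]])

# Equivalent of normalize_embeddings=True: divide each row by its L2 norm.
norms = np.linalg.norm(raw, axis=1, keepdims=True)
normalized = raw / norms

# Each row now has unit length, so dot products act as cosine similarities.
print(np.linalg.norm(normalized, axis=1))  # each value is 1.0
```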