upstash / wikipedia-semantic-search

Semantic Search on Wikipedia with Upstash Vector
https://wikipedia-semantic-search.vercel.app/
MIT License

Can you provide the code for embedding with SentenceTransformers? Thanks. #12

Closed guoxiangke closed 2 months ago

guoxiangke commented 2 months ago

I mean the embedding code from this article: https://upstash.com/blog/indexing-wikipedia Thanks.

mdumandag commented 2 months ago

We are using roughly the following Python code:

import sentence_transformers

# Load BGE-M3 on the GPU, pinned to a fixed model revision for reproducibility.
transformer = sentence_transformers.SentenceTransformer(
    "BAAI/bge-m3",
    device="cuda",
    revision="babcf60cae0a1f438d7ade582983d4ba462303c2",
)
# Inputs longer than 1024 tokens are truncated.
transformer.max_seq_length = 1024

embeddings = transformer.encode(
    sentences=[],  # batch of inputs (list of strings)
    show_progress_bar=False,
    normalize_embeddings=True,  # unit-length vectors
    convert_to_numpy=True,
    convert_to_tensor=False,
)
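
The snippet above encodes a single batch. For a larger corpus, one way to feed it batch by batch could look like the sketch below. This is not the code used for the article: the helper names `iter_batches` and `embed_all`, the `encode_batch` callable, and the default batch size of 32 are all assumptions made for illustration.

```python
from typing import Callable, Iterator, List, Sequence

Vector = Sequence[float]


def iter_batches(items: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_all(
    texts: Sequence[str],
    encode_batch: Callable[[Sequence[str]], Sequence[Vector]],
    batch_size: int = 32,  # assumed value; tune it for your GPU memory
) -> List[Vector]:
    """Encode `texts` batch by batch and return one vector per text, in order."""
    vectors: List[Vector] = []
    for batch in iter_batches(texts, batch_size):
        # `encode_batch` is a hypothetical hook; it must return one vector per input.
        vectors.extend(encode_batch(batch))
    return vectors
```

With the `transformer` object from the snippet above, `encode_batch` could be something like `lambda batch: transformer.encode(sentences=list(batch), show_progress_bar=False, normalize_embeddings=True)`. Note that `encode` also has its own `batch_size` parameter, so this outer loop is only useful when the whole corpus should not be passed in at once.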