princeton-nlp / SimCSE

[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821
MIT License

ADD: function to add to search index #166

Uzay-G closed 2 years ago

Uzay-G commented 2 years ago

I implemented this for myself, and made a PR in case anyone found it interesting.

I'm not setting any expectation that it will get merged; do whatever you want with it. It works with and without FAISS.

Example usage:

from simcse import SimCSE

model = SimCSE("princeton-nlp/sup-simcse-bert-base-uncased")

# Build the initial index (brute-force search, no FAISS).
sentences = ['A woman is reading.', 'A man is playing a guitar.']
model.build_index(sentences, use_faiss=False)
results = model.search("panda")
print(results)

# Add new sentences to the existing index, then search again.
sentences_b = ['A woman is making a photo of a panda.']
model.add_to_index(sentences_b)
results = model.search("panda")
print(results)