UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0

Elastic Search example doubt. #442

Open Adarsh1999 opened 4 years ago

Adarsh1999 commented 4 years ago

I want to index the output of different object detection libraries in Elasticsearch. The output looks like this: output={ "objects": [ "bottle. ", "person", ], "score": [ 0.6211097240447998, 0.42280933260917664 ], "file_name":"trial.jpg" }. Each object name should be indexed together with its score, so that when I search for some keywords I can retrieve the images related to those keywords.

How can I get this working?

nreimers commented 4 years ago

If you only have keywords, I recommend traditional word embedding approaches like GloVe or word2vec (also available in this framework). Contextualized word embeddings like BERT do not have any advantage for single words or short phrases. They even perform quite badly for this.

Then, for each object, index the GloVe embedding in Elasticsearch. You can then use cosine similarity to fetch the most similar words / entries.
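A minimal sketch of that idea, under several assumptions: the 3-dimensional vectors are made-up stand-ins for real GloVe embeddings, and the field names (`object`, `score`, `file_name`, `embedding`) and all helper functions are hypothetical. The mapping and query bodies follow the `dense_vector` / `script_score` pattern from Elasticsearch 7.x; the exact script syntax may differ between versions, so check it against the documentation for your cluster. Nothing here talks to a server: the dicts are what you would pass to your client, and `search_locally` only mimics the ranking.

```python
import math

# Toy 3-d vectors standing in for real GloVe embeddings (values are made up).
TOY_EMBEDDINGS = {
    "bottle": [0.9, 0.1, 0.0],
    "person": [0.0, 0.2, 0.9],
    "cup": [0.8, 0.3, 0.1],
}


def embed(word):
    """Look up a word vector; labels like 'bottle. ' are normalized first."""
    return TOY_EMBEDDINGS[word.strip(" .").lower()]


def build_mapping(dims):
    """Index mapping with one dense_vector field per detected object."""
    return {
        "mappings": {
            "properties": {
                "object": {"type": "keyword"},
                "score": {"type": "float"},
                "file_name": {"type": "keyword"},
                "embedding": {"type": "dense_vector", "dims": dims},
            }
        }
    }


def to_documents(output):
    """Turn one detector output into one document per detected object."""
    docs = []
    for name, score in zip(output["objects"], output["score"]):
        label = name.strip(" .").lower()
        docs.append({
            "object": label,
            "score": score,
            "file_name": output["file_name"],
            "embedding": embed(label),
        })
    return docs


def build_query(query_vector, size=10):
    """script_score query ranking documents by cosine similarity.

    The + 1.0 keeps scores non-negative, which Elasticsearch requires.
    """
    return {
        "size": size,
        "query": {
            "script_score": {
                "query": {"match_all": {}},
                "script": {
                    "source": "cosineSimilarity(params.query_vector, 'embedding') + 1.0",
                    "params": {"query_vector": query_vector},
                },
            }
        },
    }


def cosine(a, b):
    """Plain cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search_locally(docs, keyword):
    """Rank documents the way the script_score query would, without a server."""
    query_vector = embed(keyword)
    return sorted(docs, key=lambda d: cosine(query_vector, d["embedding"]), reverse=True)


output = {
    "objects": ["bottle. ", "person"],
    "score": [0.6211097240447998, 0.42280933260917664],
    "file_name": "trial.jpg",
}
docs = to_documents(output)
best = search_locally(docs, "cup")[0]
print(best["object"], best["file_name"])
```

With real embeddings you would replace `embed` with a lookup into a GloVe or word2vec model, set `dims` to the model's vector size, and send `build_mapping(...)`, the documents, and `build_query(...)` to Elasticsearch through whichever client you use. The detection score is stored alongside the vector, so it could also be used to filter or re-rank the hits.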