UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0

Question - Word Level Embeddings #968

Closed: varunSabnis closed this issue 3 years ago

varunSabnis commented 3 years ago

Can we use SBERT to learn meaningful word-level representations while fine-tuning the model on sentence pairs from a specific domain? That way, semantically related words in the sentences would be closer in the embedding space than unrelated ones. Thanks in advance!

nreimers commented 3 years ago

Hi @varunSabnis, yes, BERT contextualized word embeddings can be used for that: https://www.aclweb.org/anthology/2020.semeval-1.3.pdf

SBERT can return the token embeddings. I have no experience with this task, so I cannot tell whether the models will work out of the box or whether you will need to tune them first.
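
For reference, here is a minimal sketch of retrieving token embeddings via `encode(..., output_value="token_embeddings")` and comparing two word vectors. The model name and example sentence are placeholders; whether the resulting word similarities are useful for your domain is exactly the open question above.

```python
import torch
from sentence_transformers import SentenceTransformer

# Placeholder model; swap in whichever SBERT checkpoint you fine-tune.
model = SentenceTransformer("all-MiniLM-L6-v2")

sentence = "The bank approved the loan."
tokens = model.tokenizer.tokenize(sentence)

# Tensor of shape (num_tokens, hidden_dim), including the special
# [CLS]/[SEP] tokens added by the underlying transformer.
emb = model.encode(sentence, output_value="token_embeddings")

# Drop [CLS]/[SEP] so positions line up with `tokens`.
word_vecs = emb[1:-1]

# Cosine similarity between two contextualized word vectors.
i, j = tokens.index("bank"), tokens.index("loan")
sim = torch.nn.functional.cosine_similarity(word_vecs[i], word_vecs[j], dim=0)
print(f"sim(bank, loan) = {sim.item():.3f}")
```

Note that wordpiece tokenization may split a word into several sub-tokens, in which case you would typically average the sub-token vectors to get a single word embedding.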

varunSabnis commented 3 years ago

@nreimers Thank you very much for the quick response. Will go through the shared paper.