xlang-ai / instructor-embedding

[ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings
Apache License 2.0

Are these embeddings Contextualized Embeddings? #103

Open jbdatascience opened 6 months ago

jbdatascience commented 6 months ago

Are these embeddings Contextualized Embeddings?

Contextualized Embeddings are embeddings such as those produced by Transformers. They can generate different vector representations for the different meanings a single word can have (this is called polysemy). For example, the word "bank" has several meanings (a financial organization, the bank of a river, and so on). With Contextualized Embeddings, each of these meanings of "bank" gets a different vector, depending on the surrounding text. With NON-Contextualized Embeddings, the word "bank" is represented by a single vector in embedding space regardless of its meaning, which is of course a very poor representation!

Therefore my question: Are these embeddings Contextualized Embeddings?

hongjin-su commented 6 months ago

Hi, thanks a lot for your interest in the INSTRUCTOR!

I guess we need to provide context in order to get contextualized embeddings. For example, if the input is only the single word "bank", I think it is hard to tell which meaning is intended.
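
One way to check this is to embed full sentences that use "bank" in different senses and compare the resulting vectors. The sketch below is only an illustration, not an official recipe: `cosine` and `compare_contexts` are hypothetical helper names, the instruction string and example sentences are made up, and the model name `hkunlp/instructor-large` together with the `[instruction, text]` input format for `model.encode` are assumed to match the usage shown in the repository README.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(float(a) * float(b) for a, b in zip(u, v))
    norm_u = math.sqrt(sum(float(a) ** 2 for a in u))
    norm_v = math.sqrt(sum(float(b) ** 2 for b in v))
    return dot / (norm_u * norm_v)


def compare_contexts(encode, instruction, texts):
    """Embed every text with the same instruction and return pairwise similarities.

    `encode` is any callable that takes a list of [instruction, text] pairs
    and returns one vector per pair (assumed to match INSTRUCTOR's encode).
    """
    vectors = encode([[instruction, text] for text in texts])
    return {
        (texts[i], texts[j]): cosine(vectors[i], vectors[j])
        for i in range(len(texts))
        for j in range(i + 1, len(texts))
    }


def demo_with_instructor():
    """Hypothetical demo; needs `pip install InstructorEmbedding` and a model download."""
    from InstructorEmbedding import INSTRUCTOR  # imported lazily on purpose

    model = INSTRUCTOR("hkunlp/instructor-large")  # assumed checkpoint name
    texts = [
        "bank",  # no context: the intended sense cannot be determined
        "I deposited my salary at the bank.",
        "We had a picnic on the bank of the river.",
    ]
    # Illustrative instruction, not taken from the paper or the README.
    sims = compare_contexts(model.encode, "Represent the sentence:", texts)
    for (a, b), sim in sims.items():
        print(f"{sim:.3f}  {a!r} vs {b!r}")
```

Calling `demo_with_instructor()` prints one similarity per sentence pair. If the two full sentences end up with clearly different vectors, the surrounding context is influencing the embedding. The bare word "bank" gives the model nothing to disambiguate with.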