out-of-vocabulary imputation?

oborchers / Fast_Sentence_Embeddings

Compute Sentence Embeddings Fast!

GNU General Public License v3.0

out-of-vocabulary imputation? #68

Open KnutJaegersberg opened 2 years ago

KnutJaegersberg commented 2 years ago

Have you considered working on meta embeddings and embedding imputation? I think fse might practically challenge some deep learning architectures, especially when taking knowledge graph embeddings into account.

oborchers commented 2 years ago

Hi @KnutJaegersberg! Can you share some papers so I can consider whether this is possible? Many thanks.

KnutJaegersberg commented 2 years ago

I read a few papers 2 weeks ago, but I can't recall which ones. Looking into the topic again now, I found some interesting work:

https://github.com/ikergarcia1996/MetaVec (meta embeddings; the authors claim SOTA results).

Also highly inspiring is the idea from radixAI, who reproject fastText embeddings into the Numberbatch knowledge graph space, which solves the OOV problem: https://radix.ai/blog/2021/3/a-guide-to-building-document-embeddings-part-1/ However, they did not publish the resulting embeddings. My intuition is that graph embeddings would also let us tackle bias in NLP better.
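
To make the reprojection idea concrete, here is a minimal sketch of one common way to do it: fit a linear least-squares map from a source space to a target space on the shared vocabulary, then apply that map to words that exist only in the source space. Whether this matches what radixAI actually did is an assumption on my part. The vectors below are random stand-ins, and `fit_linear_map` and `impute_oov` are hypothetical helper names, not part of fse or any other library. In practice the source vectors would come from fastText (which can build vectors for unseen words from subword n-grams) and the target vectors from Numberbatch.

```python
import numpy as np


def fit_linear_map(src, tgt):
    """Return W minimising ||src @ W - tgt|| in the least-squares sense."""
    W, *_ = np.linalg.lstsq(src, tgt, rcond=None)
    return W


def impute_oov(word, src_vectors, W):
    """Project a source-space vector into the target space."""
    return src_vectors[word] @ W


# Synthetic stand-ins for the two embedding spaces (assumption: toy data only).
rng = np.random.default_rng(0)
dim_src, dim_tgt = 8, 5
true_map = rng.normal(size=(dim_src, dim_tgt))

shared_vocab = [f"w{i}" for i in range(50)]

# "oov_word" exists in the source space but not in the target space.
src_vectors = {w: rng.normal(size=dim_src) for w in shared_vocab + ["oov_word"]}
tgt_vectors = {w: src_vectors[w] @ true_map for w in shared_vocab}

# Fit the map on words present in both spaces.
X = np.stack([src_vectors[w] for w in shared_vocab])
Y = np.stack([tgt_vectors[w] for w in shared_vocab])
W = fit_linear_map(X, Y)

# Impute a target-space vector for the word missing from the target vocabulary.
imputed = impute_oov("oov_word", src_vectors, W)
```

With real embeddings the fit will not be exact, so the imputed vectors are only approximations. An orthogonal (Procrustes) map is a frequent alternative when both spaces have the same dimensionality.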