facebookresearch / MUSE

A library for Multilingual Unsupervised or Supervised word Embeddings

Create multilingual word embeddings for more than 2 languages #154

Open kinoute opened 4 years ago

kinoute commented 4 years ago

Hi,

What would be the best approach to align multiple (more than 2) monolingual word embeddings into a single vector space? I come from fastText_multilingual, which is unfortunately outdated. With that repo you could get more than 70 monolingual word embeddings aligned into a single vector space.

I also read this article from Facebook about how they merged many different word embeddings: https://engineering.fb.com/ml-applications/under-the-hood-multilingual-embeddings/

I would like to be able to do the same, not necessarily for that many languages, but for at least a dozen. Any tips?

Thanks!