Closed: broadwell closed this issue 4 years ago
I like that idea a lot. I've also been thinking about an embedding space that encodes both text and images. Imagine taking Wikipedia articles and, for each one, trying to predict an ImageNet category. Then one could train a single space that encodes both text and images...
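One simple way to sketch that idea: learn a linear map from the text space into the image space using paired examples, so articles and images become comparable in one space. This is just a toy illustration with random stand-in vectors (the names `text_vecs` and `image_vecs` are hypothetical; in practice they would come from a text model and an image model):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: 100 "article" vectors in a 300-d text space, and for each
# article the embedding of its ImageNet category in a 512-d image space.
text_vecs = rng.normal(size=(100, 300))
image_vecs = rng.normal(size=(100, 512))

# Learn a linear map W from the text space into the image space by least
# squares, so both modalities can be compared in the shared image space.
W, *_ = np.linalg.lstsq(text_vecs, image_vecs, rcond=None)

projected = text_vecs @ W  # articles now live in the image space
assert projected.shape == (100, 512)
```

A real version would train jointly (e.g. with a contrastive objective) rather than a one-shot linear fit, but the shared-space idea is the same.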
Yeah, that would be cool to do, and most importantly, I think it would look really cool.
I've also obtained good results by generating a merged word embedding model with the repo above (though it involved some manual editing of the .w2v files) and then loading the aligned model via the --model option.
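For reference, the manual merge can be scripted. Here's a minimal sketch that assumes the .w2v files use the plain word2vec text format (a "count dim" header line, then one "word v1 v2 ..." line per word); the function name is my own:

```python
def merge_w2v(path_a, path_b, out_path):
    """Merge two word2vec text-format files into one.

    Assumes both files share the same vector dimensionality.
    When a word appears in both files, the first file's vector wins.
    """
    entries = {}
    dim = None
    for path in (path_a, path_b):
        with open(path, encoding="utf-8") as f:
            header = f.readline().split()
            if dim is None:
                dim = int(header[1])
            elif int(header[1]) != dim:
                raise ValueError("dimension mismatch between models")
            for line in f:
                word, _, vec = line.rstrip("\n").partition(" ")
                entries.setdefault(word, vec)  # keep first occurrence
    with open(out_path, "w", encoding="utf-8") as f:
        f.write(f"{len(entries)} {dim}\n")
        for word, vec in entries.items():
            f.write(f"{word} {vec}\n")
```

The resulting file can then be passed to the --model option as described above.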
This would require quite a bit of build-out, but it would be really cool. https://github.com/artetxem/vecmap implements several methods that purport to cross-map embeddings between spaces.
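The core of one such cross-mapping approach is orthogonal Procrustes: given a seed dictionary of row-aligned source/target vectors, find the rotation that best maps one space onto the other. A minimal sketch (not vecmap's actual code, just the underlying technique):

```python
import numpy as np

def orthogonal_map(src, trg):
    """Orthogonal Procrustes: the rotation W minimizing ||src @ W - trg||_F,
    where src and trg are row-aligned embedding matrices (a seed dictionary).
    """
    u, _, vt = np.linalg.svd(trg.T @ src)
    return vt.T @ u.T  # W is orthogonal: W.T @ W = I

rng = np.random.default_rng(0)
src = rng.normal(size=(50, 8))
# Build a target space that is an exact rotation of the source,
# so the recovered map should reproduce that rotation.
q, _ = np.linalg.qr(rng.normal(size=(8, 8)))
trg = src @ q

W = orthogonal_map(src, trg)
assert np.allclose(src @ W, trg, atol=1e-6)
```

Restricting the map to be orthogonal preserves dot products and distances within the source space, which is why it works well for aligning independently trained embeddings.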