tarrade / proj_multilingual_text_classification

Explore multilingual text classification using embeddings, BERT and deep learning architectures
Apache License 2.0

Use PCA to display embeddings for a few identical words in 4 different languages #16

Open tarrade opened 4 years ago
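As a starting point, here is a minimal sketch of the requested plot (not code from the repo), assuming word2vec-style vectors that are aligned across languages, e.g. the aligned fastText vectors; the file paths and word lists below are placeholders:

```python
# Sketch: project word vectors for the same words in four languages onto 2D
# with PCA. The vector file paths are placeholders; aligned fastText vectors
# (https://fasttext.cc/docs/en/aligned-vectors.html) are one option.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from gensim.models import KeyedVectors

# Placeholder paths: one aligned embedding file per language.
LANG_FILES = {
    "en": "wiki.en.align.vec",
    "de": "wiki.de.align.vec",
    "fr": "wiki.fr.align.vec",
    "it": "wiki.it.align.vec",
}
# The same words, translated into each language.
WORDS = {
    "en": ["dog", "house", "water"],
    "de": ["Hund", "Haus", "Wasser"],
    "fr": ["chien", "maison", "eau"],
    "it": ["cane", "casa", "acqua"],
}

vectors, labels = [], []
for lang, path in LANG_FILES.items():
    kv = KeyedVectors.load_word2vec_format(path)
    for word in WORDS[lang]:
        vectors.append(kv[word])
        labels.append(f"{word} ({lang})")

# Reduce to 2 principal components for plotting.
points = PCA(n_components=2).fit_transform(np.array(vectors))

plt.figure(figsize=(6, 6))
plt.scatter(points[:, 0], points[:, 1])
for (x, y), label in zip(points, labels):
    plt.annotate(label, (x, y))
plt.title("PCA of identical words in 4 languages")
plt.show()
```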

vluechinger commented 4 years ago

First attempts at this were made in notebook 10_Embeddings.

However, we came to the conclusion that embedding single words only makes sense for "traditional" methods like word2vec and other vector representations that do not depend on the whole sentence. Since BERT embeddings are contextual, we should instead implement a sentence representation, which would make more sense.

More information: https://www.quora.com/What-are-the-main-differences-between-the-word-embeddings-of-ELMo-BERT-Word2vec-and-GloVe
Also see the links regarding embeddings in Confluence.

To do: implement sentence representations in different languages (see the sketch below).
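A minimal sketch of the sentence-level approach discussed above, assuming multilingual BERT from Hugging Face transformers with mean pooling over token embeddings; the model name, example sentences, and pooling choice are illustrative assumptions, not the notebook's actual code:

```python
# Sketch: one fixed-size embedding per sentence, via mean pooling over the
# token embeddings of multilingual BERT (an assumed model choice).
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")
model.eval()

# The same sentence in four languages (illustrative examples).
sentences = [
    "The dog sleeps in the house.",   # en
    "Der Hund schläft im Haus.",      # de
    "Le chien dort dans la maison.",  # fr
    "Il cane dorme nella casa.",      # it
]

with torch.no_grad():
    batch = tokenizer(sentences, padding=True, truncation=True,
                      return_tensors="pt")
    hidden = model(**batch).last_hidden_state    # (batch, seq_len, dim)
    mask = batch["attention_mask"].unsqueeze(-1)  # mask out padding tokens
    # Mean of the real-token embeddings = one vector per sentence.
    embeddings = (hidden * mask).sum(1) / mask.sum(1)

print(embeddings.shape)  # torch.Size([4, 768])
```

The same PCA projection as in the sketch above could then be applied to these sentence vectors to compare the four languages.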