Closed harsh306 closed 5 years ago
@harsh306 Doing it this way you'll get subword embeddings, so to get a word embedding we also need some function over the set of subword embeddings that maps it to a single word embedding. I'd suggest that max pooling is probably the best choice here, since the model itself uses max pooling and it therefore becomes the "natural choice" — but I'd prefer to hear comments from the LASER developers.
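To make the pooling idea concrete, here is a minimal sketch of collapsing subword-level vectors into word-level vectors. It assumes you have already obtained a `(num_subwords, dim)` array of contextualized subword embeddings and know which subword positions belong to each word (`word_spans` is a hypothetical alignment, not something LASER provides directly):

```python
import numpy as np

def pool_subword_embeddings(subword_embs, word_spans, pooling="max"):
    """Collapse BPE-level embeddings into word-level embeddings.

    subword_embs: (num_subwords, dim) array of contextualized subword vectors.
    word_spans: list of (start, end) index pairs, one per word, giving the
        half-open range of subword positions that make up that word.
    pooling: "max" (elementwise max, as suggested above) or "mean".
    """
    word_vectors = []
    for start, end in word_spans:
        chunk = subword_embs[start:end]
        if pooling == "max":
            word_vectors.append(chunk.max(axis=0))
        else:
            word_vectors.append(chunk.mean(axis=0))
    return np.stack(word_vectors)

# Toy usage: 5 subwords of dimension 2, grouped into 2 words.
subs = np.arange(10, dtype=float).reshape(5, 2)
spans = [(0, 2), (2, 5)]  # word 1 = subwords 0-1, word 2 = subwords 2-4
words = pool_subword_embeddings(subs, spans)
print(words.shape)  # (2, 2)
```

Mean pooling is included as an alternative, but per the comment above, max pooling matches what the LASER encoder itself uses to build the sentence embedding.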
LASER was trained to produce sentence embeddings. There may be ways to get something like a word embedding out of it, but the system was not trained for that, so the results are probably suboptimal. The outputs of the last BiLSTM layer could be regarded as contextualized word embeddings. Note, however, that these are at the BPE level, not the word level, since BPE units are the input to the LASER encoder. If you are only interested in word embeddings, I would recommend using another approach that was developed and trained specifically for this. There are a couple of choices.
What is the best way to get word embeddings out of LASER instead of sentence embeddings?