I've extracted ELMo embeddings for personality traits, computed pairwise cosine similarity, performed multidimensional scaling, and then visualized the result:
As you can see, the results don't make much sense. For example, with other embeddings (e.g., word2vec, paragram-sl999), you'll at least get positive traits on one side and negative traits on the other. I don't see much rhyme or reason in the above plot.
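For reference, the pipeline above can be sketched roughly as follows. The `get_vector()` lookup is a stand-in for the actual Magnitude query (e.g. `vectors.query(word)` with pymagnitude); here it returns fixed random vectors and a hypothetical six-trait list so the sketch runs standalone:

```python
import numpy as np
from sklearn.manifold import MDS

# hypothetical trait list for illustration
traits = ["kind", "cruel", "honest", "deceitful", "brave", "timid"]

rng = np.random.default_rng(0)
_fake = {t: rng.standard_normal(1024) for t in traits}

def get_vector(word):
    # placeholder for: vectors.query(word) on the Magnitude ELMo model
    return _fake[word]

# unit-normalize so the dot product is cosine similarity
X = np.stack([get_vector(t) for t in traits])
X /= np.linalg.norm(X, axis=1, keepdims=True)
sim = X @ X.T                       # pairwise cosine similarity
dissim = np.clip(1.0 - sim, 0.0, None)  # MDS wants non-negative dissimilarities

coords = MDS(n_components=2, dissimilarity="precomputed",
             random_state=0).fit_transform(dissim)
print(coords.shape)                 # one 2-D point per trait, ready to plot
```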
I get better results if I get vectors for the above traits by putting each of them in a 'sentence' with the word 'trait'. And I also get decent results if I use AllenNLP's ELMo implementation, even when not contextualizing the trait words.
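The 'trait'-context workaround looks roughly like this, assuming the pymagnitude behavior where `query()` on a list of tokens returns one vector per token. The Magnitude object is mocked here so the sketch runs standalone:

```python
import numpy as np

class FakeMagnitude:
    # stand-in for pymagnitude.Magnitude(<elmo .magnitude file>)
    def __init__(self, dim=1024, seed=0):
        self.rng = np.random.default_rng(seed)
        self.dim = dim

    def query(self, tokens):
        # real ELMo output depends on the whole token sequence;
        # here we just return one random vector per token
        return self.rng.standard_normal((len(tokens), self.dim))

vectors = FakeMagnitude()

def contextualized_trait_vector(word):
    # embed the word inside the two-token 'sentence' "<word> trait"
    # and keep only the vector for the trait word itself
    return vectors.query([word, "trait"])[0]

v = contextualized_trait_vector("kind")
print(v.shape)
```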
I've also tried regressing human judgments about masculinity and femininity directly on the embeddings, and I get pretty much random predictions, whereas using other vectors (again, word2vec, paragram) or getting ELMo vectors contextualized by the word 'trait' predicts the human judgments pretty well.
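The regression setup is roughly the following, sketched with ridge regression and cross-validated predictions. The embeddings and ratings are random placeholders here; in the real setup, X would come from the embedding lookups above and y from the human judgment data:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n_traits, dim = 60, 300
X = rng.standard_normal((n_traits, dim))   # placeholder trait embeddings
y = rng.standard_normal(n_traits)          # placeholder human ratings

# out-of-fold predictions, so the fit is never evaluated on training data
preds = cross_val_predict(Ridge(alpha=1.0), X, y, cv=5)
r = np.corrcoef(preds, y)[0, 1]            # near 0 on this random data
print(preds.shape)
```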
I'm using this model:
http://magnitude.plasticity.ai/elmo/medium/elmo_2x4096_512_2048cnn_2xhighway_5.5B_weights.magnitude