szha opened 5 years ago
tl;dr: I think this is critical to do so that we can verify the pre-trained embeddings we provide actually reproduce the paper's results. Currently the ELMo embeddings do not seem usable out-of-the-box in a satisfying manner.
I've had very mixed results using the pre-trained ELMo embeddings directly for sentence embeddings. I followed the tutorial http://gluon-nlp.mxnet.io/examples/sentence_embedding/elmo_sentence_representation.html to extract the contextualized word embeddings and used several methods to get the overall sentence embeddings:
I used cosine similarity to compare the resulting sentence embeddings, and the results are not very good:
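For reference, the comparison I'm describing boils down to mean-pooling the contextualized token vectors into one sentence vector and then taking the cosine similarity between sentence vectors. Here is a minimal stdlib-only sketch of that pipeline; the toy token embeddings stand in for actual ELMo outputs, and the function names (`mean_pool`, `cosine_similarity`) are my own, not part of any library:

```python
import math

def mean_pool(token_vectors):
    """Average contextualized token vectors into a single sentence vector."""
    dim = len(token_vectors[0])
    n = len(token_vectors)
    return [sum(vec[i] for vec in token_vectors) / n for i in range(dim)]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy stand-ins for ELMo token embeddings (real ELMo vectors are 1024-dim).
sent_a = [[1.0, 0.0, 0.5, 0.2], [0.8, 0.1, 0.4, 0.3], [0.9, 0.0, 0.6, 0.1]]
sent_b = [[0.0, 1.0, 0.1, 0.9], [0.1, 0.8, 0.2, 1.0]]

sim = cosine_similarity(mean_pool(sent_a), mean_pool(sent_b))
print(round(sim, 4))
```

With real ELMo output, `sent_a` and `sent_b` would each be the per-token vectors from the model's final layer (or a weighted combination of layers) for one sentence.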
for example:
'The name hippopotamus comes from the ancient Greek word hippopotamos, which means river horse',
and
'President Donald Trump last week intended to reverse sanctions imposed on two Chinese shipping companies accused of violating North Korea trade prohibitions'
are closer than
'The name hippopotamus comes from the ancient Greek word hippopotamos, which means river horse',
and
'Hippos rank as one of the largest animals in Africa and are not known for their sunny dispositions, causing more human deaths in Africa annually than lions, leopards, crocodiles, or any other of the major predators',
0.95187436 vs. 0.92387373, and both values are higher than I would expect.
The poor quality of these results has been corroborated by @la-cruche (https://discuss.mxnet.io/t/understand-gluonnlp-elmo-output-shape/3969/9). Using the same technique with the TF-Hub embeddings, he reports much better results.
[figure] PCA'ed embeddings from TF-Hub ELMo vs. Gluon-NLP ELMo on sentences from 2 different articles:
@cgraywang do you know what could be the problem?
The results on the following tasks are reported in the ELMo paper (https://allennlp.org/elmo)