allenai / allennlp

An open-source NLP research library, built on PyTorch.
http://www.allennlp.org

Clear Guide on How to Use Elmo embeddings #1737

Closed · andymancodes closed this issue 6 years ago

andymancodes commented 6 years ago

Hi,

Even after trying to work with ELMo and reading about it, I am not getting how to use it. It looks like, for a given sentence, I have to pass the sentence through the ELMo model, and only then can I get the ELMo embeddings? But the parameters of a neural net are fixed after training, so why not release ELMo as a set of pretrained vectors like GloVe? Why make it so hard to use?

Describe the solution you'd like
A clear description of how to use ELMo embeddings. Does ELMo have word embeddings? Does ELMo only give sentence embeddings? How can I use ELMo, and where can I download it? How can I build a matrix of word embeddings, as with GloVe or word2vec?

hzeng-otterai commented 6 years ago

The ELMo model is "contextualized", so it needs a whole sentence to calculate embeddings, and it is therefore not possible to provide a simple word-to-embedding mapping. You can, however, use the "allennlp elmo" command to preprocess your data and dump those embeddings. For details please check https://github.com/allenai/allennlp/blob/master/tutorials/how_to/elmo.md.
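
For example, something like this (a rough sketch: the file names are placeholders, and the exact HDF5 layout has varied across versions, so treat the shapes in the comments as assumptions):

# assumes an allennlp 0.x install, where the "elmo" subcommand exists
# first dump embeddings from the shell, one whitespace-tokenized sentence per line:
#   allennlp elmo sentences.txt elmo_vectors.hdf5 --top
import h5py

# each dataset holds the per-token vectors for one input sentence
with h5py.File("elmo_vectors.hdf5", "r") as f:
    for key in f.keys():
        print(key, f[key].shape)  # roughly (num_tokens, 1024) with --top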

DeNeutoy commented 6 years ago
  1. Elmo does have word embeddings, which are built up from character convolutions. However, when Elmo is used in downstream tasks, a contextual representation of each word is used which relies on the other words in the sentence.
  2. Elmo does not produce sentence embeddings, rather it produces embeddings per word "conditioned" on the context. This is why you have to run them through the model.
  3. You cannot "build a matrix of word embeddings" for the reasons above.

Please see the tutorial for a comprehensive overview.
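
As a minimal sketch of what "running words through the model" looks like (following the pattern in the tutorial; the model URLs below are the released files from that era and may have moved since):

from allennlp.modules.elmo import Elmo, batch_to_ids

options_file = "https://s3-us-west-2.amazonaws.com/allennlp/models/elmo/2x4096_512_2048cnn_2xhighway/elmo_2x4096_512_2048cnn_2xhighway_options.json"
weight_file = "https://s3-us-west-2.amazonaws.com/allennlp/models/elmo/2x4096_512_2048cnn_2xhighway/elmo_2x4096_512_2048cnn_2xhighway_weights.hdf5"

# num_output_representations=1 asks for a single weighted mix of the three layers
elmo = Elmo(options_file, weight_file, num_output_representations=1, dropout=0)

sentences = [["I", "ate", "an", "apple"], ["I", "ate", "a", "carrot"]]
character_ids = batch_to_ids(sentences)  # tensor of shape (2, 4, 50) character ids
output = elmo(character_ids)
# output["elmo_representations"] is a list with one tensor of shape (2, 4, 1024);
# the two vectors for "ate" differ because each is conditioned on its own sentence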

You might get a more positive/prompt response on future issues on open source software if you moderated your tone. You sound unnecessarily accusatory.

mchari commented 4 years ago

> For details please check https://github.com/allenai/allennlp/blob/master/tutorials/how_to/elmo.md.

The link seems to be broken. Please provide a valid link. Thanks.

bpben commented 4 years ago

Looks like the tutorial has moved here: https://guide.allennlp.org/representing-text-as-features#6

However, that page puts everything in one chunk. It'd be useful to have the tutorial split up by method (e.g. ELMo, BERT).

The tutorial also outlines a pretty onerous way of getting embeddings. It seems like this code works, but maybe I'm misunderstanding something:

from allennlp.commands.elmo import ElmoEmbedder  # available in allennlp 0.x releases

elmo = ElmoEmbedder()
docs = ["Let's stick to the script",
        "I threw the stick to the dog",
        "We should stick together"]

# the elmo embedder expects a list of tokens, so split each sentence
token_docs = [d.split() for d in docs]
elmo_vecs = [elmo.embed_sentence(d) for d in token_docs]
elmo_vecs[0].shape  # (3, num_tokens, 1024): three layers of 1024-dim vectors

matt-gardner commented 4 years ago

@bpben, if that usage suits your purposes, then go for it. If you use that method of getting ELMo embeddings, though, you'll have quite a hard time switching your code to use BERT, or whatever comes afterward. If you don't care about that, then there are simpler things that you can do, such as what you listed.
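
Roughly, the intended pattern looks like this (a sketch against the 0.x API; exact module paths have moved around between releases, so treat the imports as assumptions). The point is that switching to BERT later only means swapping the token indexer and token embedder, while the rest of your model code stays the same:

from allennlp.data import Instance, Token, Vocabulary
from allennlp.data.dataset import Batch
from allennlp.data.fields import TextField
from allennlp.data.token_indexers import ELMoTokenCharactersIndexer
from allennlp.modules.text_field_embedders import BasicTextFieldEmbedder
from allennlp.modules.token_embedders import ElmoTokenEmbedder

options_file = "https://s3-us-west-2.amazonaws.com/allennlp/models/elmo/2x4096_512_2048cnn_2xhighway/elmo_2x4096_512_2048cnn_2xhighway_options.json"
weight_file = "https://s3-us-west-2.amazonaws.com/allennlp/models/elmo/2x4096_512_2048cnn_2xhighway/elmo_2x4096_512_2048cnn_2xhighway_weights.hdf5"

# the indexer turns tokens into character ids; the embedder consumes those ids
tokens = [Token(w) for w in "We should stick together".split()]
field = TextField(tokens, {"elmo": ELMoTokenCharactersIndexer()})
batch = Batch([Instance({"sentence": field})])
batch.index_instances(Vocabulary())
tensors = batch.as_tensor_dict()["sentence"]

# to use BERT instead, you would swap in a BERT indexer/embedder pair here
embedder = BasicTextFieldEmbedder({"elmo": ElmoTokenEmbedder(options_file, weight_file)})
embeddings = embedder(tensors)  # shape (1, num_tokens, 1024)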

bpben commented 4 years ago

Actually, I stand corrected: it seems allennlp 1.0 doesn't have allennlp.commands.elmo. I'm a bit confused about why that was removed in favor of the method outlined by the tutorial. Maybe this is a separate issue.

matt-gardner commented 4 years ago

There have been other issues that brought this up, yes (and in general, opening new issues is much easier for us to keep track of than commenting on old ones). We thought this particular use case was not a common one, so we didn't want to continue supporting that code. The method in the tutorial is how the library was designed to be used for writing research code from the beginning; dumping vectors is a relatively rare use case.

matt-gardner commented 4 years ago

If you want to continue using that method, it should be pretty simple for you to copy the old elmo command code into your own repo. It should still work.