Closed radiantone closed 6 years ago
I should add this to our ELMo tutorial: https://github.com/allenai/allennlp/blob/master/tutorials/how_to/elmo.md. You might find it easier to just run ELMo programmatically, for example in IPython.
@radiantone would you find it helpful if the ELMo command supported other output formats? Which would you prefer?
Yeah, I am trying to use it programmatically. You'll have to forgive my newness to ELMo. I see this segment in the tutorial:
```python
embeddings = elmo(character_ids)

# embeddings['elmo_representations'] is a length-two list of tensors.
# Each element contains one layer of ELMo representations with shape
# (2, 3, 1024):
#   2    - the batch size
#   3    - the sequence length of the batch
#   1024 - the length of each ELMo vector
```
What I want to do is view the actual text of the embeddings rather than just enumerate the numerical vectors. Perhaps the tutorial could give a bit more detail on practical things you can do once you have the embeddings object.
Sorry if this request sounds a bit ignorant.
I think the following would be easiest:

```python
from allennlp.commands.elmo import ElmoEmbedder

elmo = ElmoEmbedder()
vectors = elmo.embed_sentence(["I", "ate", "an", "apple", "for", "breakfast"])
```

Now you have `vectors[0]`, `vectors[1]`, and `vectors[2]`, the three layers of ELMo vectors. Each layer has length 6, matching the length of the input sentence.
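Once you have per-token vectors like these, a common first exercise is comparing them with cosine similarity. A minimal sketch, using tiny made-up arrays as stand-ins for real 1024-dimensional ELMo vectors (so it runs without AllenNLP installed):

```python
import numpy as np

# Hypothetical stand-ins for ELMo token vectors; real values would come
# from elmo.embed_sentence(...) as above.
apple = np.array([0.9, 0.1, 0.3])
orange = np.array([0.8, 0.2, 0.4])
car = np.array([-0.7, 0.9, -0.2])

def cosine(u, v):
    """Cosine similarity: 1.0 means identical direction, ~0 means unrelated."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Semantically related words should score higher than unrelated ones.
print(cosine(apple, orange) > cosine(apple, car))  # True
```

With real ELMo output, the payoff is that the same word gets different vectors in different sentences, so these similarities are context-sensitive.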
Yeah. I can get the vectors, but I'm curious how to display the human-readable words in them.
They are vectors that are representing the words that you passed in. If you embedded the words, you already have access to them.
OK. I guess I want a concrete example of what one can do with the vectors. The tutorial doesn't show anything after obtaining them.
So I get these vectors. What next? What can I do with them? What information do they contain that I don't already have with the raw sentences? I know they are numbers that represent words, but that's not enough to understand what to use them for.
This is probably obvious to career data scientists, but for the layman programmer it is not clear from the tutorial.
The fundamental problem of using machine learning on text is deciding how to represent text as features. From the start of statistical NLP until just a few years ago, people wrote feature extractors by hand to represent individual pieces of text. A few years ago we discovered that simple word embeddings were really good feature extractors, and if you used those as raw input to a statistical model of language, instead of hand-written feature extractors, your model would perform much better. ELMo is the next step in feature extraction for text. Instead of getting a single vector for each word in isolation, you run a pre-trained feature extractor on an entire sentence. The resulting vectors are then used as input to some statistical model that tries to predict something about language.
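To make "vectors as input to a statistical model" concrete, here is a minimal sketch, assuming we already have per-token vectors (random stand-ins below; a real pipeline would get them from `ElmoEmbedder.embed_sentence`). The linear scorer is purely illustrative, not part of AllenNLP:

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8  # real ELMo vectors are 1024-dimensional

sentence_tokens = ["I", "ate", "an", "apple"]
token_vectors = rng.normal(size=(len(sentence_tokens), dim))  # stand-in embeddings

# 1. Pool the per-token vectors into one fixed-size sentence feature.
sentence_feature = token_vectors.mean(axis=0)   # shape: (dim,)

# 2. Feed the feature to any downstream model, e.g. a linear classifier
#    with hypothetical, untrained weights.
weights = rng.normal(size=dim)
score = sentence_feature @ weights
probability = 1.0 / (1.0 + np.exp(-score))      # logistic output in (0, 1)

print(sentence_feature.shape, 0.0 < probability < 1.0)
```

The point is the division of labor: the embedder turns text into features, and everything after that is an ordinary numeric model.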
Thank you for that explanation. So I can take the embedding output file I created and plug it right into (for example) the demo apps that demonstrate machine comprehension, etc.?
We've designed AllenNLP so that it's easy to use ELMo as a feature extractor for any existing model. You can see how to do that here. Basically, that uses ELMo as a `TokenEmbedder`, as described in this tutorial.
Hi, I was able to get the vectors using the method specified in this issue. Thanks for that! But my question is: three vectors are generated for every sentence. What do they signify? Is the first vector given as input to the LSTM model to generate the second vector? If so, is the third vector the right choice for embedding applications?
The publication has more information about the different vectors. The lower layers capture more syntactic information, and the higher layers capture more contextual, semantic information.
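Rather than picking a single layer, the ELMo paper combines all three with a softmax-normalized scalar mix. A sketch of that combination, using random stand-ins shaped like `ElmoEmbedder.embed_sentence` output (which is `(3, sequence_length, 1024)`); the mix parameters here are hypothetical, whereas a real model learns them:

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, dim = 6, 8                     # tiny stand-in dimensions
layers = rng.normal(size=(3, seq_len, dim))

# Hypothetical scalar-mix parameters (learned jointly with the task model).
s = np.array([0.2, -0.1, 0.5])          # one raw weight per layer
w = np.exp(s) / np.exp(s).sum()         # softmax -> weights sum to 1
gamma = 1.0                             # overall scale factor

# Weighted sum over the layer axis.
mixed = gamma * np.tensordot(w, layers, axes=1)   # shape: (seq_len, dim)
print(mixed.shape)
```

For quick experiments, simply averaging the three layers (equal weights) is a reasonable default.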
Good info showing up in this thread. However, the ticket is requesting that these concepts be added to a simple tutorial that shows the utility of the embeddings in a practical use case anyone can understand.
Thanks for the feedback @radiantone. We're a very small team, however, and our focus is on people who are actively doing research in natural language processing. That's why you don't see tutorials explaining these things. If you'd like to help out and make our tutorials better, PRs would be very welcome!
I am actively improving the ELMo tutorial given your feedback. Contributions are very welcome.
As soon as I get good at this really cool tech, I'll be happy to submit PRs. I'm still climbing the learning curve. Thanks for a great set of tools!
Closing, as we've updated the tutorial and have a specific example of how to read data from HDF5 files: https://github.com/allenai/allennlp/blob/master/tutorials/how_to/elmo.md#writing-contextual-representations-to-disk
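For anyone landing here: a minimal sketch of inspecting such an HDF5 file with `h5py`. The file written below mimics the layout the `elmo` command produces (one dataset per sentence, each of shape `(3, sequence_length, 1024)`); the filename `embeddings.hdf5` and the tiny sequence length are illustrative stand-ins:

```python
import h5py
import numpy as np

# Create a small file mimicking the ELMo output layout (stand-in data).
with h5py.File("embeddings.hdf5", "w") as f:
    f.create_dataset("0", data=np.zeros((3, 4, 1024), dtype="float32"))

# Open the file and list what it contains.
with h5py.File("embeddings.hdf5", "r") as f:
    for key in f.keys():                 # one key per sentence
        layers = f[key][...]             # load the dataset as a numpy array
        print(key, layers.shape)         # prints: 0 (3, 4, 1024)
```

Note that the file stores only numbers; to map vectors back to words, you need to keep the original sentences (or the key-to-sentence mapping) alongside it.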
Hi, I am new to AllenNLP. Following the tutorial, I used the command-line tool to create an HDF5 embeddings file from sentences. But the tutorial stops there, and I'm not sure how to view the actual text results. I see the HDF5 file contains HDF5 datasets; I want to see the text word embeddings.
Thank you!