Closed ajfisch closed 6 years ago
This is in fact exactly what the TokenIndexers and TextFieldEmbedders were designed to do. You can see how to configure this here:
and here:
Thanks for the pointer!
Correct me if I'm wrong, but from the code it seems like the embedding will see just one set of inputs (tokens), as does the character_encoding (token_characters). They will independently compute 100d character and token embeddings, which are concatenated and fed to the 200d phrase LSTM, in this case.
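Concretely, the configuration I'm looking at is roughly this (reconstructing from the tutorial; exact field names may differ between AllenNLP versions):

```json
"text_field_embedder": {
    "tokens": {
        "type": "embedding",
        "embedding_dim": 100
    },
    "token_characters": {
        "type": "character_encoding",
        "embedding": {"embedding_dim": 16},
        "encoder": {
            "type": "cnn",
            "embedding_dim": 16,
            "num_filters": 100,
            "ngram_filter_sizes": [5]
        }
    }
}
```

so the 100d token embedding and the 100d CNN character encoding get concatenated into the 200d input to the phrase LSTM.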
Using the different concatenated embedding and character_encoding downstream is super clear from the tutorial -- I was wondering if you could write a generic text_field_embedder type that took in both character and token ids to produce embeddings. For example, if a generic embedder architecture like ELMo took in more than just character information.
It seems like no, at least when using the BasicTextFieldEmbedder forward, which only takes in one tensor (and it doesn't seem possible for that to be a tuple).
I don't understand what you're asking. The current code for BiDAF uses a single TextFieldEmbedder to compute word representations that are a concatenation of word embeddings and character-level encodings. The code that does this is just:
This does as you say; it "takes in both character and token ids to produce embeddings". The input to TextFieldEmbedder is a dictionary, not a tensor, and it embeds all of the tensors in the dictionary. If you can be a little more specific on exactly what you're trying to do, maybe I could help you better.
Sorry for not being clear. For example, I would like to move the highway layer computation into the _text_field_embedder, and configure this so that I could use it as a text_field_embedder in other models without modifying code.
I think that all I would have to do is slightly extend the BasicTextFieldEmbedder to do some more stuff before returning (here). But just making sure I'm not missing something that's already there.
Thanks for all the help!
What do you want the highway layer to apply to? The CNN? You can use a different encoder inside of the TokenCharactersEmbedder, which might get what you want. Otherwise I'm not sure what you mean by "move the highway layer into the text field embedder". I'm pretty sure what we have already does what you want. If you give me specific equations, or desired code, or just something more precise, I can tell you how to accomplish what you're looking for.
For example (forgetting about highway for simplicity), I would like to apply a linear layer on top of the concatenated representation. Specifically, I want to embed a sequence of words {x_1, ..., x_N} as {e_1, ..., e_N}, where e_i = ReLU(W * h_i + b) with h_i = [GloVe(x_i); CharCNN(x_i)].
In this case the output {e_1, ..., e_N} of the embedder should be a function of both the full token id and the sub-word character ids, rather than just a concatenation like [GloVe(x_i); CharCNN(x_i)], which is what the BiDAF embedder returns.
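As a toy illustration of just that equation in plain Python (no framework; the vector and weight shapes here are made up):

```python
def embed_token(glove_vec, char_cnn_vec, weight, bias):
    """Compute e_i = ReLU(W h_i + b) with h_i = [GloVe(x_i); CharCNN(x_i)]."""
    # Concatenate the word-level and character-level representations.
    h = list(glove_vec) + list(char_cnn_vec)
    # Apply the linear layer row by row, then the ReLU nonlinearity.
    return [max(0.0, sum(w * x for w, x in zip(row, h)) + b)
            for row, b in zip(weight, bias)]
```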
Code-wise, I'm wondering if this is possible to do without writing:
concatenated_representation = self._text_field_embedder(my_text_input)
combined = self._my_module(concatenated_representation)
for every downstream model that I would like to use this specific embedder for.
I think I just got confused between the specific implementation of the BasicTextFieldEmbedder vs the general abstract TextFieldEmbedder. The latter certainly seems flexible enough for me to use.
Ok, thanks for the detail. The most straightforward thing to do is to just have two lines of code in your model, as you suggest (and as we do with BiDAF). But, yeah, if you really want to remove the additional line of code (or have the specific transformation be more configurable from a JSON file), you could write your own TextFieldEmbedder that does what BasicTextFieldEmbedder does, then adds whatever transformation you want on top of the concatenated representations. You then register that TextFieldEmbedder, and you can use it with a single line of code in your model. I'm closing this issue now; if you have more questions, feel free to re-open it.
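A sketch of what such a wrapper could look like, with the framework details stripped out (this is not the real AllenNLP API -- the per-name embedders here are plain callables returning one vector per token, and the weight/bias are plain lists):

```python
class ProjectingTextFieldEmbedder:
    """Sketch of a custom embedder: do what BasicTextFieldEmbedder does
    (embed each tensor in the input dict and concatenate per token),
    then apply e_i = ReLU(W h_i + b) on top of the concatenation."""

    def __init__(self, token_embedders, weight, bias):
        self._token_embedders = token_embedders  # name -> callable returning per-token vectors
        self._weight = weight                    # rows of W
        self._bias = bias                        # one entry per row of W

    def __call__(self, text_field_input):
        # Embed every entry of the TextField's output dict, in a fixed key order.
        embedded = [self._token_embedders[name](tensor)
                    for name, tensor in sorted(text_field_input.items())]
        # Concatenate per token: h_i = [emb_a(x_i); emb_b(x_i); ...]
        concatenated = [sum((emb[i] for emb in embedded), [])
                       for i in range(len(embedded[0]))]
        # Project: e_i = ReLU(W h_i + b)
        return [[max(0.0, sum(w * h_j for w, h_j in zip(row, h)) + b)
                 for row, b in zip(self._weight, self._bias)]
                for h in concatenated]
```

Once the real version of this is registered under a name, a model's config can select it as its text_field_embedder, and the model body only ever calls the one embedder.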
Thanks for the help @matt-gardner ! Appreciate it. Good stuff!
Hi,
Thanks for the very cool work! It looks like if you want to write a text_field_embedder, you can only take one input --- the batched tensor from the corresponding text field indexer.
If I wanted to write a text field embedder that took multiple inputs, say token ids and character ids, can I do that?