Open allanj opened 7 years ago
This should not be the input to your BiLSTM; you first use a LookupTable to encode character indices into character vectors. Read the "Character-based models of words" paragraph in Section 4.1 of the paper again.
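For example (a minimal PyTorch sketch, not code from this repo; the vocabulary size and embedding dimensions are made-up values), the character indices are first mapped to dense vectors by an embedding layer, and those vectors are what the BiLSTM consumes:

```python
import torch
import torch.nn as nn

char_vocab_size = 80   # hypothetical number of distinct characters
char_emb_dim = 25      # hypothetical character embedding size

# Embedding layer plays the role of the LookupTable: index -> character vector
char_embedding = nn.Embedding(char_vocab_size, char_emb_dim, padding_idx=0)
char_bilstm = nn.LSTM(char_emb_dim, 25, bidirectional=True, batch_first=True)

# char_ids: (batch, max_word_len) integer character indices, 0 = padding
char_ids = torch.tensor([[1, 2, 3, 0, 0]])
embedded = char_embedding(char_ids)          # (batch, max_word_len, char_emb_dim)
outputs, (h_n, c_n) = char_bilstm(embedded)  # outputs: (batch, max_word_len, 2 * 25)
```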
Yes, sorry, I didn't show an embedding layer before the BiLSTM. With that added, the network becomes (embedding layer + BiLSTM), but the input is still the same, so the problem remains.
I want to implement character embeddings with a BiLSTM as in the paper "Neural Architectures for Named Entity Recognition" (Guillaume Lample et al.). Specifically, I feed the characters of a word into a BiLSTM, then concatenate the last hidden state of the forward LSTM with the first hidden state of the backward LSTM; that concatenation is the result I want.
However, I found this hard to do when words have variable lengths.
Let's say a word contains 3 characters (1, 2, and 3) and the maximum word length is 5. The input to the BiLSTM will then be the embeddings of the following tokens: 1 2 3 0 0.
But if I take the last hidden state, it corresponds to the padding token 0, since the last positions are padded with 0.
I can't simply tell the model to take the hidden state at the third position, because each word's last real character sits at a different position.
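One common way to handle this (a sketch in PyTorch, assuming the current `torch.nn` packing utilities; the dimensions and variable names here are made up for illustration, this is not the repo's own code) is to pack the padded batch together with the true word lengths. The LSTM then stops each direction at the last real character, so `h_n` already holds the forward state after the final character and the backward state after reading back to the first character:

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence

emb_dim, hidden_dim = 25, 25                       # hypothetical sizes
embedding = nn.Embedding(80, emb_dim, padding_idx=0)
bilstm = nn.LSTM(emb_dim, hidden_dim, bidirectional=True, batch_first=True)

# Two words of different lengths, padded with 0 up to max length 5.
char_ids = torch.tensor([[1, 2, 3, 0, 0],
                         [4, 5, 6, 7, 8]])
lengths = torch.tensor([3, 5])

packed = pack_padded_sequence(embedding(char_ids), lengths,
                              batch_first=True, enforce_sorted=False)
_, (h_n, _) = bilstm(packed)
# h_n: (num_directions, batch, hidden_dim) for a single layer.
# h_n[0] is the forward state after the last *real* character (position length-1),
# h_n[1] is the backward state after reading back to the first character.
word_repr = torch.cat([h_n[0], h_n[1]], dim=-1)    # (batch, 2 * hidden_dim)
```

The packing step is what avoids the padding problem: instead of indexing the output at a fixed position, each word's final states are taken at its own length.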