This paper proposes an auxiliary bidirectional language modeling objective for neural sequence labeling, and evaluates it on error detection in learner texts, named entity recognition (NER), chunking, and part-of-speech (POS) tagging.
Notes on the method of this paper:
Since a bidirectional LSTM has access to the full context on each side of the target token, the authors predict the next word only from the forward-moving hidden state and the previous word only from the backward-moving hidden state. (Each hidden state is first passed through a one-layer tanh projection before being projected onto the context-word vocabulary with a softmax.)
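A minimal PyTorch sketch of this joint objective, assuming a word-level model with two separate unidirectional LSTMs (one per direction); all module names, dimensions, and the LM-loss weight `gamma` are illustrative assumptions, not taken from the paper's released code:

```python
import torch
import torch.nn as nn

class SeqLabelerWithLMObjective(nn.Module):
    def __init__(self, vocab_size, num_tags, emb_dim=300, hidden_dim=200, proj_dim=50):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # One LSTM per direction so the two hidden states stay separate.
        self.fwd_lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.bwd_lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        # Main sequence-labeling head over the concatenated states.
        self.tagger = nn.Linear(2 * hidden_dim, num_tags)
        # One-layer tanh projections, one per direction.
        self.fwd_proj = nn.Linear(hidden_dim, proj_dim)
        self.bwd_proj = nn.Linear(hidden_dim, proj_dim)
        # Softmax layers over the context-word vocabulary.
        self.fwd_out = nn.Linear(proj_dim, vocab_size)
        self.bwd_out = nn.Linear(proj_dim, vocab_size)

    def forward(self, tokens):
        # tokens: (batch, seq_len) word ids
        emb = self.embed(tokens)
        h_fwd, _ = self.fwd_lstm(emb)            # left-to-right states
        h_bwd, _ = self.bwd_lstm(emb.flip(1))
        h_bwd = h_bwd.flip(1)                    # realign right-to-left states
        tag_logits = self.tagger(torch.cat([h_fwd, h_bwd], dim=-1))
        # Forward state at position t predicts word t+1; backward state
        # at position t predicts word t-1. Each direction only sees one
        # side of the context, so the LM target is never trivially visible.
        fwd_lm_logits = self.fwd_out(torch.tanh(self.fwd_proj(h_fwd[:, :-1])))
        bwd_lm_logits = self.bwd_out(torch.tanh(self.bwd_proj(h_bwd[:, 1:])))
        return tag_logits, fwd_lm_logits, bwd_lm_logits

def joint_loss(tag_logits, fwd_lm_logits, bwd_lm_logits, tokens, tags, gamma=0.1):
    # Tagging loss plus gamma-weighted forward and backward LM losses.
    ce = nn.CrossEntropyLoss()
    tag_loss = ce(tag_logits.flatten(0, 1), tags.flatten())
    fwd_loss = ce(fwd_lm_logits.flatten(0, 1), tokens[:, 1:].flatten())   # next words
    bwd_loss = ce(bwd_lm_logits.flatten(0, 1), tokens[:, :-1].flatten())  # previous words
    return tag_loss + gamma * (fwd_loss + bwd_loss)
```

The small `proj_dim` bottleneck keeps the auxiliary LM heads lightweight, so the language modeling signal regularizes the shared LSTM states without dominating the main tagging objective.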
Metadata