xuuuluuu / SynLSTM-for-NER

Code and models for the paper titled "Better Feature Integration for Named Entity Recognition", NAACL 2021.

Paper Reference Question #2

Closed qute012 closed 3 years ago

qute012 commented 3 years ago

Thank you for the great work!

However, I'm confused by the following sentence in your paper (quoted below). It says that LSTMs cannot capture long-range dependencies, but I can't find that claim in "Learning Deep Architectures for AI", the work you cite; it only discusses RNNs, not LSTMs. Could you clarify this?

However, sequence models such as bidirectional LSTM (Hochreiter and Schmidhuber, 1997) are not able to fully capture the long-range dependencies (Bengio, 2009).

xuuuluuu commented 3 years ago

Hi,

Thanks for the question. I think there is some misunderstanding here. The sentence does not claim that LSTMs cannot capture long-range dependencies at all. What I want to emphasize is that sequence models are not the best option if you want to fully capture long-range dependencies.

P.S. Although LSTMs are better than vanilla RNNs at capturing long-range dependencies, they still show a significant locality bias (Lai et al., 2015; Linzen et al., 2016). You can also see other discussions of the long-range dependency issue (Zhang et al., 2018; Shen et al., 2019).
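As a rough sketch of where that locality bias comes from (following the element-wise weighted-sum view in reference 5 below; the notation here is my own, not taken from the paper), unrolling the standard LSTM cell-state update expresses the current cell state as a gated sum over all past candidate states:

```latex
% Standard LSTM cell-state recurrence:
%   c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t
% Unrolling the recurrence gives c_t as an element-wise weighted sum
% of every candidate state \tilde{c}_k seen so far:
c_t \;=\; \sum_{k=1}^{t} \Big( \prod_{j=k+1}^{t} f_j \Big) \odot i_k \odot \tilde{c}_k
```

Since every forget gate f_j lies in (0, 1) element-wise, the weight on a token far in the past is a long product of gates and tends to shrink with distance, which is one way to see the locality bias mentioned above.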

Reference

  1. Capturing Long-range Contextual Dependencies with Memory-enhanced Conditional Random Fields
  2. Assessing the Ability of LSTMs to Learn Syntax-Sensitive Dependencies
  3. Sentence-State LSTM for Text Representation
  4. Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks
  5. Long Short-Term Memory as a Dynamically Computed Element-wise Weighted Sum
qute012 commented 3 years ago

Got it! That is how I understood it before your explanation; I just wanted to be sure. I think it could be described as explicitly capturing the dependencies.

Thanks for the reply.