xuuuluuu / SynLSTM-for-NER

Code and models for the paper titled "Better Feature Integration for Named Entity Recognition", NAACL 2021.

Paper Reference Question #2

Closed qute012 closed 3 years ago

qute012 commented 3 years ago

Thank you for the great work!

However, I'm confused by the following sentence in your paper (quoted below). It says that LSTMs cannot capture long-range dependencies, but I can't find that claim in "Learning Deep Architectures for AI", the work you cite; it only discusses RNNs, not LSTMs. Could you clarify this?

However, sequence models such as bidirectional LSTM (Hochreiter and Schmidhuber, 1997) are not able to fully capture the long-range dependencies (Bengio, 2009).

xuuuluuu commented 3 years ago

Hi,

Thanks for the question. I think there is some misunderstanding here. The sentence does not claim that LSTMs cannot capture long-range dependencies at all. What I want to emphasize is that sequence models are not the best option if you want to fully capture long-range dependencies.

P.S. Although LSTMs are better than vanilla RNNs at capturing long-range dependencies, they still show a significant locality bias (Lai et al., 2015; Linzen et al., 2016). You can also see other discussions of the long-range dependency issue (Zhang et al., 2018; Shen et al., 2019).
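As a rough sketch of where that locality bias comes from (following the element-wise weighted-sum view in reference 5 below; the notation here is my own, not taken from the paper), unrolling the standard LSTM cell-state update expresses the current cell state as a gated sum over all past candidate states:

```latex
% Standard LSTM cell-state recurrence:
%   c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t
% Unrolling the recurrence gives c_t as an element-wise weighted sum
% of every candidate state \tilde{c}_k seen so far:
c_t \;=\; \sum_{k=1}^{t} \Big( \prod_{j=k+1}^{t} f_j \Big) \odot i_k \odot \tilde{c}_k
```

Since every forget gate f_j lies in (0, 1) element-wise, the weight on a token far in the past is a long product of gates and tends to shrink with distance, which is one way to see the locality bias mentioned above.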

Reference

  1. Capturing Long-range Contextual Dependencies with Memory-enhanced Conditional Random Fields
  2. Assessing the Ability of LSTMs to Learn Syntax-Sensitive Dependencies
  3. Sentence-State LSTM for Text Representation
  4. Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks
  5. Long Short-Term Memory as a Dynamically Computed Element-wise Weighted Sum
qute012 commented 3 years ago

Got it! That is how I understood it before your explanation; I just wanted to be sure. I think it could be described as explicitly capturing the dependencies.

Thanks for the reply.