nnop / notes


text classification #135

Open nnop opened 6 years ago

nnop commented 6 years ago

researcher

papers

nnop commented 6 years ago

Visualization

nnop commented 6 years ago

Network depth

In most cases, however, the performance improvements from making the model deeper than 2 layers are minimal (Reimers & Gurevych, 2017). These observations hold for most sequence tagging and structured prediction problems. For classification, deep or very deep models perform well only with character-level input, and shallow word-level models are still the state-of-the-art (Zhang et al., 2015; Conneau et al., 2016; Le et al., 2017).
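The shallow word-level models mentioned above can be sketched as a fastText-style classifier: embedding lookup, average pooling, and a single linear layer. A minimal numpy sketch, assuming illustrative vocabulary, embedding, and class sizes (all names and numbers here are hypothetical, not taken from the cited papers):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, for illustration only.
VOCAB, EMBED_DIM, NUM_CLASSES = 1000, 50, 4

# Randomly initialized parameters (in practice these are trained).
E = rng.normal(scale=0.1, size=(VOCAB, EMBED_DIM))        # word embeddings
W = rng.normal(scale=0.1, size=(EMBED_DIM, NUM_CLASSES))  # classifier weights
b = np.zeros(NUM_CLASSES)                                 # classifier bias

def predict(token_ids):
    """Class probabilities for one document given as a list of word ids."""
    doc_vec = E[token_ids].mean(axis=0)   # average the word embeddings
    logits = doc_vec @ W + b              # one shallow linear layer
    exp = np.exp(logits - logits.max())   # numerically stable softmax
    return exp / exp.sum()

probs = predict([3, 17, 250, 999])
```

The whole word-level model is two matrix operations deep, which is the sense in which these strong classification baselines are "shallow."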

nnop commented 6 years ago

Optimization

Adam (Kingma & Ba, 2015) is one of the most popular and widely used optimization algorithms and often the go-to optimizer for NLP researchers. It is often thought that Adam clearly outperforms vanilla stochastic gradient descent (SGD). However, while it converges much faster than SGD, it has been observed that SGD with learning rate annealing slightly outperforms Adam (Wu et al., 2016). Recent work furthermore shows that SGD with properly tuned momentum outperforms Adam (Zhang et al., 2017).
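A minimal sketch of the two update rules being compared, run on a toy quadratic objective f(x) = (x - 3)^2. The hyperparameters and the objective are illustrative assumptions, not taken from the cited papers:

```python
import numpy as np

def grad(x):
    """Gradient of the toy objective f(x) = (x - 3)^2."""
    return 2.0 * (x - 3.0)

def run_adam(steps=1000, lr=0.05, b1=0.9, b2=0.999, eps=1e-8):
    """Adam with its standard default moment coefficients."""
    x, m, v = 0.0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = b1 * m + (1 - b1) * g          # first-moment (mean) estimate
        v = b2 * v + (1 - b2) * g * g      # second-moment estimate
        m_hat = m / (1 - b1 ** t)          # bias correction
        v_hat = v / (1 - b2 ** t)
        x -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return x

def run_sgd(steps=500, lr=0.1, momentum=0.9, decay=0.99):
    """SGD with momentum and per-step learning rate annealing."""
    x, vel = 0.0, 0.0
    for _ in range(steps):
        vel = momentum * vel - lr * grad(x)
        x += vel
        lr *= decay                        # anneal the learning rate
    return x

x_adam, x_sgd = run_adam(), run_sgd()      # both should land near x = 3
```

On this deterministic toy problem both optimizers reach the minimum; the cited results concern training full NLP models, where the annealing schedule and momentum tuning are what give SGD its edge.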

nnop commented 6 years ago

Preprocessing

nnop commented 6 years ago

text classification

codes

nnop commented 6 years ago

Zhihu Kanshan Cup (知乎·看山杯)

nnop commented 6 years ago

Found a new approach