DoodleJZ / HPSG-Neural-Parser

Source code for "Head-Driven Phrase Structure Grammar Parsing on Penn Treebank" published at ACL 2019
https://arxiv.org/abs/1907.02684
MIT License

Experiment with XLnet? #2

Open LifeIsStrange opened 5 years ago

LifeIsStrange commented 5 years ago

Firstly I would like to say that reading your paper was fascinating. Secondly I would like to thank you for advancing the state of the art on both constituency parsing and dependency parsing. (first place on NLP-progress)

I haven't yet read your entire paper, but it seems you used BERT. BERT was state of the art at the time, but it has since been surpassed by significant margins by [XLNet](https://github.com/zihangdai/xlnet). I think it would be really interesting to train your neural net with XLNet instead of BERT to see if you can push the state of the art even further!
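Just to illustrate the idea, here is a minimal, untested sketch of pulling XLNet features instead of BERT features, assuming the pytorch-transformers package; the parser's actual integration points (subword-to-token alignment, the span and dependency scorers) are not shown and would need adapting.

```python
# Hypothetical sketch: encode a sentence with XLNet instead of BERT,
# assuming the pytorch-transformers package; not the parser's actual code.
import torch
from pytorch_transformers import XLNetModel, XLNetTokenizer

tokenizer = XLNetTokenizer.from_pretrained("xlnet-large-cased")
encoder = XLNetModel.from_pretrained("xlnet-large-cased")

sentence = "The stock market crashed ."
input_ids = torch.tensor([tokenizer.encode(sentence)])  # (1, num_subwords)

with torch.no_grad():
    outputs = encoder(input_ids)
hidden_states = outputs[0]  # (1, num_subwords, hidden_size)
# These vectors would stand in for the BERT embeddings fed to the parser's scorers.
```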

LifeIsStrange commented 5 years ago

@DoodleJZ

DoodleJZ commented 5 years ago

Yes, thank you for your interest! We are also intrigued by the strong performance of XLNet and will consider trying it later.

LifeIsStrange commented 5 years ago

Really nice to hear that! Could you please update the NLP-progress [1] results if this experiment improves state-of-the-art performance, or tell me so I can update the results for you.

[1] https://github.com/sebastianruder/NLP-progress/blob/master/english/dependency_parsing.md

Thanks in advance.

LifeIsStrange commented 5 years ago

Hi @DoodleJZ, I saw that you ran the experiment with XLNet, got very good results, and merged them into NLP-progress! ( https://github.com/sebastianruder/NLP-progress/commit/18b8b852d0ae9f084c355488b1cc5868db912630 )

Please, let's not stop here! The world needs high-accuracy dependency/constituency parsing, and you are the ones who can improve the state of the art. You have already beaten the SOTA twice! Let's do it AGAIN :)

I propose experimenting with two simple, high-return ideas in addition to XLNet. First, the Mish activation function has been reported to give accuracy gains: https://github.com/digantamisra98/Mish
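For reference, Mish is just `x * tanh(softplus(x))`, so a minimal PyTorch module (my own sketch, not code from the Mish repo) would look like:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Mish(nn.Module):
    """Mish activation: x * tanh(softplus(x)); smooth and non-monotonic."""
    def forward(self, x):
        return x * torch.tanh(F.softplus(x))

# e.g. swap a ReLU in the parser's feed-forward layers for Mish()
```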

Second, there are two new state-of-the-art optimizers in town: RAdam (Rectified Adam) and Lookahead. The beauty is that they can work together synergistically. You could try the combined optimizer, Ranger: https://github.com/lessw2020/Ranger-Deep-Learning-Optimizer (the linked Medium blog post is insightful)
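To make the Lookahead idea concrete, here is a rough, self-contained wrapper I sketched (not the Ranger implementation itself): it keeps a set of slow weights and syncs them with the fast weights every k steps, and any inner optimizer such as Adam or RAdam could be plugged in.

```python
import torch

class Lookahead:
    """Rough sketch of Lookahead: the slow weights interpolate toward the
    fast weights every k steps, then the fast weights are reset to them."""
    def __init__(self, base_optimizer, k=5, alpha=0.5):
        self.base = base_optimizer
        self.k, self.alpha = k, alpha
        self.step_count = 0
        # snapshot the initial (slow) weights for every parameter group
        self.slow = [[p.detach().clone() for p in g["params"]]
                     for g in base_optimizer.param_groups]

    def zero_grad(self):
        self.base.zero_grad()

    def step(self):
        self.base.step()  # one fast step with the inner optimizer
        self.step_count += 1
        if self.step_count % self.k == 0:
            with torch.no_grad():
                for group, slow_group in zip(self.base.param_groups, self.slow):
                    for p, slow_p in zip(group["params"], slow_group):
                        slow_p += self.alpha * (p.detach() - slow_p)
                        p.copy_(slow_p)

# usage sketch: wrap any inner optimizer (Adam here; RAdam would be analogous)
# model = ...  # the parser's torch.nn.Module
# optimizer = Lookahead(torch.optim.Adam(model.parameters(), lr=1e-3), k=6, alpha=0.5)
```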

Related: https://github.com/mgrankin/over9000