yzhangcs / parser

:rocket: State-of-the-art parsers for natural language.
https://parser.yzhang.site/
MIT License

Reproduce the results of CTB #22

Closed by wangxinyu0922 4 years ago

wangxinyu0922 commented 4 years ago

Hi, I failed to reproduce the results of the biaffine parser on the Chinese Treebank (CTB). I simply used the default config file with the CTB datasets, but I got a LAS of 85~86, which is about 3 LAS lower than the result produced by Dozat's parser. Any suggestions? Maybe I missed something in data pre-processing?

yzhangcs commented 4 years ago

For the English PTB-SD datasets, we use POS tags generated from the Stanford POS tagger (Toutanova et al., 2003); for the Chinese PTB dataset we use gold tags; and for the CoNLL09 dataset we use the provided predicted tags.

Did you use gold tags?

wangxinyu0922 commented 4 years ago

Oh, that's right. I didn't use POS tags because the code requires choosing one of the features from char/pos/bert (am I right?). I will try the model with pos+char.

yzhangcs commented 4 years ago

The vanilla biaffine parser only uses POS tags. So I think pos is enough.
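
Before retraining with pos, it may be worth confirming that the CTB files actually carry POS tags. The following is a minimal sketch, not part of this repository: it assumes the data is in tab-separated CoNLL-X format with tags in the 4th (CPOS) and 5th (POS) columns, and the helper names and the sample sentence are hypothetical. Which of the two columns the parser reads is also an assumption, so both are checked.

```python
from collections import Counter


def pos_tag_summary(conll_lines):
    """Count the values found in the CPOS (4th) and POS (5th) columns."""
    cpos, pos = Counter(), Counter()
    for line in conll_lines:
        line = line.strip()
        # Skip blank sentence separators and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Ignore malformed rows that are too short to hold both tag columns.
        if len(cols) < 5:
            continue
        cpos[cols[3]] += 1
        pos[cols[4]] += 1
    return cpos, pos


def has_pos_tags(conll_lines):
    """Return True if any token has a tag other than the '_' placeholder."""
    cpos, pos = pos_tag_summary(conll_lines)
    return any(tag != "_" for tag in list(cpos) + list(pos))


# Hypothetical two-token sentence, for illustration only.
sample = [
    "1\t我\t_\tPN\tPN\t_\t2\tnsubj\t_\t_",
    "2\t喜欢\t_\tVV\tVV\t_\t0\troot\t_\t_",
    "",
]
print(has_pos_tags(sample))
```

To check a real file, pass an open file handle instead of the sample list, e.g. `has_pos_tags(open(path, encoding="utf-8"))`. Note that this only tells you whether tags are present; it cannot tell gold tags from predicted ones.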