yikangshen / PRPN

Parsing Reading Predict Network

MIT License

Right-branching results do not match those presented in the paper #3

Open SyahX opened 5 years ago

SyahX commented 5 years ago

Hi, I am trying to compute the right-branching and upper-bound baselines on the WSJ10 dataset. When I evaluate with the code in PRPN, I get 56.68 (right-branching) and 84.06 (upper bound), which differ from the 61.7 (RBranch) and 88.1 (upper bound) reported in the paper. But when I evaluate with EvalB, I get 61.7 (RBranch) and 88.1 (upper bound). Does that mean the right-branching structure was evaluated differently from the PRPN model? Could you please show the right way to compute the right-branching and upper-bound baselines on the WSJ10 dataset?
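For reference, here is a minimal sketch of how the bracket set of a right-branching baseline could be built. It is not the code from PRPN or EvalB. The function name `right_branching_spans` and the `include_root` flag are hypothetical, and spans are assumed to be `(start, end)` token offsets with `end` exclusive. Whether the whole-sentence span is counted is a convention that varies between evaluation scripts, so it is exposed as a parameter.

```python
def right_branching_spans(n, include_root=True):
    """Return the brackets of a fully right-branching binary tree over n tokens.

    Spans are (start, end) pairs with end exclusive (an assumed convention).
    Width-1 spans (single tokens) are never included.
    """
    # In a right-branching tree every constituent ends at the last token:
    # (0, n), (1, n), ..., (n - 2, n).
    first_start = 0 if include_root else 1
    return {(i, n) for i in range(first_start, n - 1)}


if __name__ == "__main__":
    # Four tokens, e.g. "the cat sat down"
    print(sorted(right_branching_spans(4)))
    print(sorted(right_branching_spans(4, include_root=False)))
```

Dropping or keeping the root span changes both precision and recall, so it is worth checking that the baseline and the model are scored with the same setting.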

Thanks, Yunfan

yoonkim commented 5 years ago

I think the discrepancy is due to sentence-level F1 (adopted by PRPN) vs. corpus-level F1 (adopted by EVALB and previous work). The numbers are therefore not exactly comparable, though they are in the same general ballpark. There does seem to be a lack of consistency across grammar induction papers (preprocessing, evaluation metric, whether the sentence-level span is included, etc.), which makes inter-paper comparison difficult, to say the least.
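To illustrate the two averaging schemes, here is a small sketch on made-up bracket sets. It is not taken from PRPN or EVALB. The helper names (`f1`, `sentence_level_f1`, `corpus_level_f1`) are hypothetical, and the toy data is not meant to reproduce the numbers in this thread. Sentence-level F1 computes one F1 per sentence and averages them. Corpus-level F1 pools the bracket counts over all sentences and computes a single F1.

```python
def f1(n_match, n_pred, n_gold):
    """F1 from raw bracket counts; returns 0.0 when undefined."""
    if n_pred == 0 or n_gold == 0 or n_match == 0:
        return 0.0
    precision = n_match / n_pred
    recall = n_match / n_gold
    return 2 * precision * recall / (precision + recall)


def sentence_level_f1(pred, gold):
    """Average of per-sentence F1 scores (every sentence weighs the same)."""
    scores = [f1(len(p & g), len(p), len(g)) for p, g in zip(pred, gold)]
    return sum(scores) / len(scores)


def corpus_level_f1(pred, gold):
    """Single F1 over pooled counts (longer sentences weigh more)."""
    n_match = sum(len(p & g) for p, g in zip(pred, gold))
    n_pred = sum(len(p) for p in pred)
    n_gold = sum(len(g) for g in gold)
    return f1(n_match, n_pred, n_gold)


# Toy data: one short sentence parsed perfectly, one longer sentence
# with 3 of 4 brackets correct. Spans are (start, end), end exclusive.
pred = [{(0, 2)}, {(0, 5), (1, 5), (2, 5), (3, 5)}]
gold = [{(0, 2)}, {(0, 5), (0, 2), (2, 5), (3, 5)}]

if __name__ == "__main__":
    print(sentence_level_f1(pred, gold))  # (1.0 + 0.75) / 2
    print(corpus_level_f1(pred, gold))    # 4 matches out of 5 predicted, 5 gold
```

On this toy input the two metrics give 0.875 and 0.8, which shows how the same parses can receive different scores depending only on the averaging.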

Quick question: what did you get for the right-branching baseline on the entire dataset?

SyahX commented 5 years ago

40.348 for the right-branching baseline on WSJ40.

yikangshen commented 5 years ago

Thanks for pointing this out. I am using the baseline results from previous papers. Could you push the code for evaluation with EvalB?