Closed (ZhixiuYe closed this issue 6 years ago)
Hi, thanks for asking. I suspect you have a different version of the WSJ-PTB corpus. I just recounted the sentences in the training set, and I believe the reported number is correct (it is also consistent with other papers).
Hi, I have the treebank_3\tagged\pos\wsj corpus. But after converting it to CoNLL format, I get 37544, 5642 and 6540 sentences for train, dev and test respectively, which is not consistent with the numbers in your paper. Could you describe how you preprocessed the WSJ corpus? Thank you!
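For anyone comparing counts, here is a minimal sketch of how sentences in a CoNLL-style file could be counted. It is not the preprocessing used for the paper. It assumes one token per line and at least one blank line between sentences. The function name `count_sentences` and the `skip_docstart` option are hypothetical, added only for illustration. `-DOCSTART-` marker lines, which appear in some CoNLL files, are skipped by default because counting them as sentences can inflate the totals.

```python
from typing import Iterable


def count_sentences(lines: Iterable[str], skip_docstart: bool = True) -> int:
    """Count blank-line-separated sentences in CoNLL-style lines.

    Assumption: one token per line, sentences separated by one or more
    blank lines. `skip_docstart` is a hypothetical option that ignores
    -DOCSTART- marker lines so they are not counted as sentences.
    """
    count = 0
    in_sentence = False
    for line in lines:
        stripped = line.strip()
        if not stripped:
            # A blank line ends the current sentence (if any).
            in_sentence = False
            continue
        if skip_docstart and stripped.startswith("-DOCSTART-"):
            continue
        if not in_sentence:
            # First token line of a new sentence.
            count += 1
            in_sentence = True
    return count


# Small in-memory example: two sentences separated by two blank lines.
sample = ["The DT", "cat NN", "", "", "It PRP", "sleeps VBZ", ""]
print(count_sentences(sample))  # prints 2
```

To check a real split, the same function can be applied to an open file handle, e.g. `count_sentences(open(path, encoding="utf-8"))`, where `path` points to your own converted train, dev or test file.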