ghaddarAbs closed this issue 6 years ago
Pre-processing fails on the data in the test folder. I think this is because the v9 test set doesn't contain //_gold_conll files.
You must use the v4 train/dev/test split. Here are the stats:
num train examples: 59924 num train tokens: 1088503
num dev examples: 8262 num dev tokens: 152728
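To check a local copy against these numbers, a minimal sketch along the following lines might help. It is not the script that produced the stats above. It assumes that "examples" means sentences, that sentences are separated by blank lines, that lines starting with `#` are document markers, and that the files of a split end in `_gold_conll` somewhere under one folder. The names `count_conll` and `count_split` are made up for illustration.

```python
from pathlib import Path


def count_conll(lines):
    """Count (sentences, tokens) in CoNLL-style lines.

    Assumption: blank lines separate sentences and lines starting with
    '#' are '#begin document' / '#end document' markers.
    """
    sentences = tokens = 0
    in_sentence = False
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            # A blank line or document marker closes the open sentence.
            if in_sentence:
                sentences += 1
                in_sentence = False
            continue
        tokens += 1  # one token per non-empty, non-marker line
        in_sentence = True
    if in_sentence:  # file did not end with a blank line
        sentences += 1
    return sentences, tokens


def count_split(split_dir):
    """Sum the counts over every *_gold_conll file under split_dir."""
    total_sentences = total_tokens = 0
    for path in sorted(Path(split_dir).rglob("*_gold_conll")):
        with open(path, encoding="utf-8") as handle:
            s, t = count_conll(handle)
        total_sentences += s
        total_tokens += t
    return total_sentences, total_tokens
```

Calling `count_split` on the train and dev folders should then give numbers comparable to the ones listed above, if the assumptions hold. A result of `(0, 0)` means no `_gold_conll` files were found under that folder.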
The v4 test split is not available on the website. Can you share it here?
Unfortunately I can't: the dataset is copyrighted. However, you can download OntoNotes from the LDC (it's free). Then follow the instructions for the CoNLL-2012 shared task data preprocessing.
I have access to OntoNotes 5.0. On the CoNLL-2012 shared task data preprocessing website, only the v9 test set is available.
What you are looking for is the Test Key, directly below Test Data.
Got it. Thanks!
Do you have any scripts that can extract all sentences and labels from the complete OntoNotes 5.0 dataset?
Thanks!
Make the preprocessing of ontonotes compatible with the directory structure produced by skeleton2conll.sh script of conll-2012 shared task
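For readers who want to pull sentences and NER labels out of such a directory tree themselves, here is a rough sketch. It is not the repository's preprocessing code. It assumes the usual CoNLL-2012 column layout, with the word in column 3 and the named-entity bracket in column 10 (0-indexed), and files ending in `_gold_conll`. Check both against your own files. The helper names `to_bio`, `read_sentences` and `read_tree` are hypothetical.

```python
from pathlib import Path


def to_bio(raw_tags):
    """Convert bracketed tags such as '(PERSON*', '*', '*)' to BIO labels."""
    bio, current = [], None
    for tag in raw_tags:
        if tag.startswith("("):
            current = tag.strip("()*")  # '(PERSON*' -> 'PERSON'
            bio.append("B-" + current)
        elif current is not None:
            bio.append("I-" + current)  # still inside an open span
        else:
            bio.append("O")
        if tag.endswith(")"):
            current = None  # the span closes on this token
    return bio


def read_sentences(lines, word_col=3, ner_col=10):
    """Yield (words, bio_labels) per sentence.

    Assumption: word_col and ner_col follow the CoNLL-2012 layout;
    pass other indices if your files differ.
    """
    words, tags = [], []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            if words:
                yield words, to_bio(tags)
                words, tags = [], []
            continue
        cols = line.split()
        words.append(cols[word_col])
        tags.append(cols[ner_col])
    if words:
        yield words, to_bio(tags)


def read_tree(root):
    """Yield sentences from every *_gold_conll file below root."""
    for path in sorted(Path(root).rglob("*_gold_conll")):
        with open(path, encoding="utf-8") as handle:
            yield from read_sentences(handle)
```

Because `read_tree` searches recursively, it does not depend on the exact folder nesting, only on the file suffix.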