Open mprorock opened 2 years ago
Unfortunately I don't have the exact details to give you. @DeNeutoy might remember more, but I believe we used the same splits and processing that spaCy uses, which appear to be referenced in a couple of places (https://github.com/explosion/spaCy/issues/5276, https://github.com/explosion/spaCy/issues/3587#issuecomment-483191672).
Since OntoNotes must be licensed directly from the source, are there any pointers or scripts for converting the corpus into the expected train / dev / test splits, so that ud_ontonotes.tar.gz can be properly replicated locally?