syuoni / eznlp

Easy Natural Language Processing
Apache License 2.0

OntoNotes 5 ENG data #30

Closed Qznan closed 1 year ago

Qznan commented 1 year ago

Where are the OntoNotes 5.0 English data demo and the loading code used for the results reported in the Boundary Smooth paper?

syuoni commented 1 year ago

Hi,

The original dataset is on LDC: https://catalog.ldc.upenn.edu/LDC2013T19. You may need permission to download it.

Data processing follows Pradhan et al. (2013). You may also check https://conll.cemantix.org/2012/data.html or https://github.com/yhcc/OntoNotes-5.0-NER. After processing, the data will be in CoNLL format.
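
For a rough idea of what reading such a file involves, here is a minimal sketch. It is not the eznlp loader. It assumes one token per line, whitespace-separated columns with the token in the first column and the NER tag in the last, and blank lines between sentences. The function name `read_conll`, the column defaults, and the sample text are all made up for illustration; the actual column layout depends on which processing script you use.

```python
from typing import Iterable, List, Tuple

# One sentence is a pair: (tokens, tags)
Sentence = Tuple[List[str], List[str]]


def read_conll(lines: Iterable[str], token_col: int = 0, tag_col: int = -1) -> List[Sentence]:
    """Group CoNLL-style lines into sentences (hypothetical helper).

    Assumes whitespace-separated columns and blank lines as sentence
    boundaries; the column positions are assumptions, adjust as needed.
    """
    sentences: List[Sentence] = []
    tokens: List[str] = []
    tags: List[str] = []
    for line in lines:
        line = line.strip()
        # A blank line or a document marker ends the current sentence.
        if not line or line.startswith("-DOCSTART-"):
            if tokens:
                sentences.append((tokens, tags))
                tokens, tags = [], []
            continue
        fields = line.split()
        tokens.append(fields[token_col])
        tags.append(fields[tag_col])
    # Flush the last sentence if the input has no trailing blank line.
    if tokens:
        sentences.append((tokens, tags))
    return sentences


# Made-up sample in the assumed two-column layout.
sample = """\
John B-PERSON
lives O
in O
Paris B-GPE

Hello O
"""

sentences = read_conll(sample.splitlines())
for toks, tgs in sentences:
    print(list(zip(toks, tgs)))
```

To read from disk, pass an open file object instead of `sample.splitlines()`.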

Our loading code is in this block: https://github.com/syuoni/eznlp/blob/master/scripts/utils.py#L175.

Enwei