NorskRegnesentral / skweak

skweak: A software toolkit for weak supervision applied to NLP tasks
MIT License
918 stars 73 forks

Comment fix: SpaCy NER was trained on OntoNotes #27

Closed ruanchaves closed 2 years ago

ruanchaves commented 2 years ago

Replace "ConLL 2003" with "OntoNotes 5.0", which is the dataset that en_core_web_sm was actually trained on (see the docs: https://spacy.io/models/en).
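
One quick way to sanity-check this kind of claim is to look at the entity labels a model emits, since the two datasets use different tag sets: CoNLL 2003 has four entity types, while OntoNotes 5.0 has eighteen. The sketch below is only illustrative; `guess_annotation_scheme` is a hypothetical helper, not part of skweak or spaCy, and it assumes the labels come without BIO prefixes.

```python
# Entity label inventories of the two datasets (their standard tag sets).
CONLL_2003_LABELS = frozenset({"PER", "LOC", "ORG", "MISC"})
ONTONOTES_5_LABELS = frozenset({
    "PERSON", "NORP", "FAC", "ORG", "GPE", "LOC", "PRODUCT", "EVENT",
    "WORK_OF_ART", "LAW", "LANGUAGE", "DATE", "TIME", "PERCENT", "MONEY",
    "QUANTITY", "ORDINAL", "CARDINAL",
})


def guess_annotation_scheme(labels):
    """Hypothetical helper: guess which tag set a collection of NER labels follows.

    Returns "CoNLL 2003", "OntoNotes 5.0", "ambiguous" (labels shared by both
    tag sets, e.g. only ORG/LOC), or "unknown" (empty or unrecognized labels).
    """
    labels = set(labels)
    if not labels:
        return "unknown"
    in_conll = labels <= CONLL_2003_LABELS
    in_ontonotes = labels <= ONTONOTES_5_LABELS
    if in_conll and in_ontonotes:
        return "ambiguous"
    if in_conll:
        return "CoNLL 2003"
    if in_ontonotes:
        return "OntoNotes 5.0"
    return "unknown"


print(guess_annotation_scheme(["PERSON", "GPE", "DATE"]))
print(guess_annotation_scheme(["PER", "MISC"]))
```

If spaCy and the model are installed, the labels of the NER component can be read with `spacy.load("en_core_web_sm").get_pipe("ner").labels` and passed to the helper; the model page linked above remains the authoritative source for the training data.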

quick_start.ipynb:

image

spaCy docs:

image

plison commented 2 years ago

Thanks!