curiosity-ai / catalyst

🚀 Catalyst is a C# Natural Language Processing library built for speed. Inspired by spaCy's design, it brings pre-trained models, out-of-the-box support for training word and document embeddings, and flexible entity recognition models.
MIT License
715 stars, 73 forks

Help with AveragePerceptronEntityRecognizer for Danish #52

Closed: jeppesc11 closed this issue 3 years ago

jeppesc11 commented 3 years ago

Hi,

Maybe I'm just a noob, but I can't figure out how to get or create a model-WikiNER-v000000.binz for the Danish AveragePerceptronEntityRecognizer.

Can you help with this? Thank you! :-)

theolivenbaum commented 3 years ago

Hi @jeppesc11,

As I just answered on #53, the WikiNER model is only available for a subset of the supported languages.

If you want to train your own, the best place to start is to find a dataset for training NER models in the language you need.

If it's a public dataset, I'm happy to include it in the training code and publish the model to NuGet.