curiosity-ai / catalyst

🚀 Catalyst is a C# Natural Language Processing library built for speed. Inspired by spaCy's design, it brings pre-trained models, out-of-the-box support for training word and document embeddings, and flexible entity recognition models.
MIT License

Add support for Dutch and French WordNets #111

Open oktaal opened 1 month ago

oktaal commented 1 month ago

The current implementation of Catalyst only supports the Princeton WordNet for English. This PR adds support for mapping WordNets to other languages through a new class, WordNetMapping, and exposes both the translations and the original English WordNet data through the uniform interfaces IWordNet and IWordNetData. The translation files should follow the format used by the Open Multilingual WordNet, which maps each synset to one or more translations, one per line. For example, the Dutch entries for cyclist:

09986189-n  nld:lemma   peddelaar
09986189-n  nld:lemma   fietser
09986189-n  nld:lemma   wielrenner
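
To make the file format concrete, here is a minimal sketch of how such lines could be grouped by synset. It is written in Python purely as a format illustration and is not the code in this PR; the function name `parse_omw_lemmas` and its `language` parameter are hypothetical. It assumes each data line has three whitespace-separated fields (synset ID, a `<language>:lemma` tag, and the lemma) and that lines starting with `#` are headers to skip.

```python
from collections import defaultdict


def parse_omw_lemmas(lines, language="nld"):
    """Group lemmas by synset ID from Open Multilingual WordNet style lines.

    Hypothetical helper for illustration only; not part of Catalyst.
    """
    translations = defaultdict(list)
    for raw in lines:
        line = raw.strip()
        # Assumption: blank lines and '#' header/comment lines carry no data.
        if not line or line.startswith("#"):
            continue
        # Split into at most three fields so multi-word lemmas stay intact.
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue
        synset_id, relation, lemma = parts
        # Keep only lemma rows for the requested language, e.g. "nld:lemma".
        if relation != f"{language}:lemma":
            continue
        translations[synset_id].append(lemma)
    return dict(translations)


sample = [
    "09986189-n\tnld:lemma\tpeddelaar",
    "09986189-n\tnld:lemma\tfietser",
    "09986189-n\tnld:lemma\twielrenner",
]
mapping = parse_omw_lemmas(sample)
```

Keying on the synset ID is what lets one English synset resolve to several translated lemmas, which is the one-to-many relationship described above.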

Related to #35