omw-data

This packages up data for the Open Multilingual Wordnet. It is roughly the version that is described in Bond and Foster (2013).

It includes the data in the original OMW 1.0 format and, as a release, packaged in the GWA format for OMW 2.

It can be used by the Python library Wn.

The raw data (under wns) also has the automatically extracted data for over 150 languages from Wiktionary and the Unicode Common Locale Data Repository (CLDR).

Citation

If you use OMW, please cite both the paper below and the individual wordnets (citation data is included in each wordnet):

Francis Bond and Ryan Foster (2013) Linking and extending an open multilingual wordnet. In 51st Annual Meeting of the Association for Computational Linguistics: ACL-2013. Sofia. 1352–1362

Notes

The directory wns has the wordnet data from OMW 1.2, with some small fixes.

By default the label is the name of the project. If the project has multiple wordnets, then the language is added in parentheses. E.g.:

label = "Multilingual Central Repository (Catalan)"
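The labelling rule above can be sketched in Python as follows. This is only an illustration of the convention, not the code the project uses to build its releases; the function name make_label and its parameters are assumptions made for this example.

```python
def make_label(project: str, language: str, multiple_wordnets: bool) -> str:
    """Build a wordnet label following the convention described above.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if multiple_wordnets:
        # Projects with several wordnets get the language in parentheses.
        return f"{project} ({language})"
    # Otherwise the label is just the project name.
    return project


if __name__ == "__main__":
    print(make_label("Multilingual Central Repository", "Catalan", True))
```

For a project with a single wordnet, the same call with multiple_wordnets=False returns the project name unchanged.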

The package name (and id) for each wordnet is, by default, omw-lg, with the following exceptions:

We thank the developers of all of the wordnets! More recent versions are available for many of these.
