Open fcbond opened 1 year ago
We could go two ways with synsets like moke ("British informal for donkey"):
ir_synonym, and make sure both sides have the same translations
take from merges in oewn
@fcbond, this sounds ambiguous and may not be optimal: merges are relative to a target English Wordnet version, so you would, for example, pick either OEWN 2021 or 2022 and then still have to deal with different merges in later OEWN versions. It might be better not to handle the merges in OMW-data: NLTK now handles OMW merges seamlessly with any OEWN version, and @goodmami might eventually consider a similar approach in Wn for solving the related issue https://github.com/goodmami/wn/issues/179
merge, and mark the senses with dialect and register tags, so that moke is in the donkey synset but marked with Domain-Region united_kingdom and exemplifies informal
I prefer this option
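To make the second option concrete, here is a minimal sketch in plain Python. It does not use the OMW-data tooling or any wordnet library: the `Sense` and `Synset` classes, the `merge_variant` helper, the synset IDs, and the placeholder gloss are all hypothetical and exist only to illustrate the idea of folding moke into donkey while keeping the region and register on the sense.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Sense:
    lemma: str
    region: Optional[str] = None    # Domain-Region target, e.g. "united_kingdom"
    register: Optional[str] = None  # "exemplifies" target, e.g. "informal"


@dataclass
class Synset:
    ssid: str                       # hypothetical ID, not a real OMW/OEWN ID
    definition: str
    senses: List[Sense] = field(default_factory=list)


def merge_variant(target: Synset, variant: Synset,
                  region: str, register: str) -> Synset:
    """Move every sense of `variant` into `target`, tagging each moved sense."""
    for sense in variant.senses:
        target.senses.append(
            Sense(sense.lemma, region=region, register=register)
        )
    variant.senses.clear()          # the variant synset is now empty
    return target


donkey = Synset("donkey-n", "(placeholder gloss)", [Sense("donkey")])
moke = Synset("moke-n", "British informal for donkey", [Sense("moke")])
merge_variant(donkey, moke, region="united_kingdom", register="informal")
```

After the call, `donkey` holds both lemmas, only the moke sense carries the two tags, and translations attached to either original synset would end up in one place.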
Also consider fixing #32 for this release.
> @goodmami might eventually consider a similar approach in Wn for solving the related issue https://github.com/goodmami/wn/issues/179
The issue is no longer fresh in my mind, but I don't think I was planning on making any significant changes to Wn. More likely I would suggest some documentation about how to deal with such merges, such as using the code snippet I wrote in that issue. But I should first check how it was handled in NLTK.
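For readers following along, the alternative raised earlier in the thread (keep merges out of the data and resolve them against whichever English wordnet version is loaded) could look roughly like the sketch below. This is not the snippet from the Wn issue and not how NLTK implements it; the `resolve` and `remap_translations` helpers, the merge table, the IDs, and the placeholder lemmas are all assumptions made for illustration.

```python
from typing import Dict, List


def resolve(synset_id: str, merges: Dict[str, str]) -> str:
    """Follow merge links until reaching an ID that was not merged away."""
    seen = set()
    while synset_id in merges:
        if synset_id in seen:
            raise ValueError(f"cycle in merge table at {synset_id}")
        seen.add(synset_id)
        synset_id = merges[synset_id]
    return synset_id


def remap_translations(translations: Dict[str, List[str]],
                       merges: Dict[str, str]) -> Dict[str, List[str]]:
    """Pool translations under the synset IDs that survive the merges."""
    pooled: Dict[str, List[str]] = {}
    for ssid, lemmas in translations.items():
        bucket = pooled.setdefault(resolve(ssid, merges), [])
        for lemma in lemmas:
            if lemma not in bucket:  # keep order, drop duplicates
                bucket.append(lemma)
    return pooled


# Hypothetical merge table for ONE target English wordnet version.
merges_for_target = {"moke-n": "donkey-n"}
# Placeholder lemmas standing in for another language's translations.
translations = {"donkey-n": ["lemma_a"], "moke-n": ["lemma_a", "lemma_b"]}
pooled = remap_translations(translations, merges_for_target)
```

Because the merge table is an argument rather than something baked into the data, a later English wordnet version with different merges would only need a different table.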
If a 1.5 version is still on the agenda, let's consider adding pre-3.0 versions of the Princeton WordNet data (see https://github.com/goodmami/wn/issues/199).
I am thinking I will probably not try to do too much here: identifying variants should really be done in the language project (so in OEWN for English).
These are the minimum I would like to see for this:
Most of these are close to done; I need to push them out for review, ...