Closed by goodmami 2 years ago
This looks like a good idea. I will change the identifiers to PRJ-LG, with the following exceptions:
omw-iwn, not omw-it (used by multiwordnet)
omw-cmn, not omw-cmn-Hans
omw-en
omw-en31
I notice that the directory names follow the lexicon IDs in general but not for the English wordnets:
omw-1.4/omw-arb/omw-arb.xml
omw-1.4/omw-bg/omw-bg.xml
omw-1.4/en30/omw-en.xml
omw-1.4/en31/omw-en31.xml
Shall we normalize those?
omw-1.4/omw-en/omw-en.xml
omw-1.4/omw-en31/omw-en31.xml
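To make the inconsistency easy to spot, here is a minimal sketch that flags any path whose directory name differs from the name of the XML file inside it. The helper name `mismatched_dirs` is made up for illustration and is not part of the omw-data tooling; the paths are the ones listed above.

```python
from pathlib import PurePosixPath

def mismatched_dirs(paths):
    """Return the paths whose parent directory name differs from the file stem."""
    return [
        p for p in paths
        if PurePosixPath(p).parent.name != PurePosixPath(p).stem
    ]

# Paths copied from the current omw-1.4 layout described above.
current = [
    "omw-1.4/omw-arb/omw-arb.xml",
    "omw-1.4/omw-bg/omw-bg.xml",
    "omw-1.4/en30/omw-en.xml",
    "omw-1.4/en31/omw-en31.xml",
]

for path in mismatched_dirs(current):
    print(path)
```

With the layout as it stands, only the two English wordnets are reported; with the normalized names nothing would be.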
I did this as they are built differently: the English wordnets are built directly from the WN db but the others from the tab files.
But I guess the change in name does not really make this clear, so I will harmonize them.
-- Francis Bond http://www3.ntu.edu.sg/home/fcbond/ Division of Linguistics and Multilingual Studies Nanyang Technological University
In discussing the IDs and names of the own-pt and own-en lexicons under the OpenWordnet umbrella (in goodmami/wn#97), I've come to like the regularity of the PRJ-LG naming scheme, and I wonder if we could do the same for OMW instead of the current LNGwn scheme. For instance:
alswn -> omw-sq (sq is preferred for BCP-47 over als or sqi, unless it's specifically the Tosk dialect)
porwn -> omw-pt (this one may be confused with own-pt, however)
spawn -> omw-es (alas, the predictable world is more boring)
If omw is in the identifier, then we can probably drop it from the versioning scheme. E.g., islwn:1.3+omw becomes omw-is:1.3. Those distributed by OMW but which have their own project names (such as iwn for the Italian Wordnet or maybe wn30/wn31 for the Princeton WordNet-derived ones) should keep their current IDs.
Thoughts?
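For concreteness, the renaming could be sketched roughly as follows. This is only an illustration under assumptions: the function name `to_prj_lg`, the small language table, and the keep-as-is set are hypothetical and cover only the examples mentioned in this thread, not the full OMW inventory.

```python
import re

# Hypothetical table from the three-letter codes in the old LNGwn IDs to
# BCP-47 tags; only the codes mentioned in this thread are included.
LANG_TO_BCP47 = {"als": "sq", "por": "pt", "spa": "es", "isl": "is"}

# IDs that carry their own project name and so keep their current form.
KEEP_AS_IS = {"iwn", "wn30", "wn31"}

def to_prj_lg(specifier, project="omw"):
    """Convert an 'ID' or 'ID:VERSION' specifier from LNGwn to PRJ-LG form."""
    lex_id, sep, version = specifier.partition(":")
    if lex_id in KEEP_AS_IS:
        return specifier
    match = re.fullmatch(r"([a-z]{3})wn", lex_id)
    if match is None or match.group(1) not in LANG_TO_BCP47:
        raise ValueError(f"unrecognized lexicon ID: {lex_id!r}")
    new_id = f"{project}-{LANG_TO_BCP47[match.group(1)]}"
    # The project name now lives in the ID, so drop it from the version.
    suffix = f"+{project}"
    if version.endswith(suffix):
        version = version[: -len(suffix)]
    return f"{new_id}{sep}{version}"

for old in ["alswn", "porwn", "spawn", "islwn:1.3+omw", "iwn"]:
    print(old, "->", to_prj_lg(old))
```

Unknown IDs raise an error rather than being guessed at, since picking the right BCP-47 tag (as with sq versus als) is a judgment call that needs a human.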