omwn / omw-data

This packages up data for the Open Multilingual Wordnet

Cleaner ID scheme for OMW wordnets? #15

Closed: goodmami closed this issue 2 years ago

goodmami commented 2 years ago

In discussing the IDs and names of the own-pt and own-en lexicons under the OpenWordnet umbrella (in goodmami/wn#97), I've come to like the regularity of the PRJ-LG naming scheme, and I wonder if we could do the same for OMW instead of the current LNGwn scheme. For instance:

If omw is in the identifier, then we can probably drop it from the versioning scheme. E.g., islwn:1.3+omw becomes omw-is:1.3. Those distributed by OMW but which have their own project names (such as iwn for the Italian Wordnet or maybe wn30/wn31 for the Princeton WordNet-derived ones) should keep their current IDs.
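
To make the proposed renaming concrete, here is a minimal Python sketch of the conversion. It is only an illustration, not code from omw-data: the function name `to_prj_lg`, the `LANG_TAGS` table, and the `KEEP_AS_IS` set are hypothetical, and the only mapping taken from this thread is `islwn:1.3+omw` to `omw-is:1.3`. It also assumes that a language without an entry in the table keeps its three-letter code.

```python
import re

# Hypothetical table: three-letter prefix of an old LNGwn ID -> language tag
# used in the PRJ-LG scheme. Only "isl" -> "is" comes from the thread.
LANG_TAGS = {"isl": "is"}

# Lexicons with their own project names keep their current IDs
# (the thread names iwn; extend this set as needed).
KEEP_AS_IS = {"iwn"}

# Old-style specifier: <lng>wn:<version>+omw, e.g. islwn:1.3+omw
OLD_SPEC = re.compile(r"^(?P<lang>[a-z]{3})wn:(?P<version>[^+]+)\+omw$")


def to_prj_lg(specifier: str, project: str = "omw") -> str:
    """Convert an old LNGwn specifier to the PRJ-LG form."""
    lexicon_id = specifier.split(":", 1)[0]
    if lexicon_id in KEEP_AS_IS:
        # Own-project lexicons are left untouched.
        return specifier
    match = OLD_SPEC.match(specifier)
    if match is None:
        raise ValueError(f"not an LNGwn-style specifier: {specifier!r}")
    # Assumption: fall back to the three-letter code when no shorter tag is listed.
    lang = LANG_TAGS.get(match["lang"], match["lang"])
    # The "+omw" version suffix is dropped because "omw" is now in the ID.
    return f"{project}-{lang}:{match['version']}"


print(to_prj_lg("islwn:1.3+omw"))
```

Because the project name moves into the identifier, the version string no longer needs the `+omw` suffix, which is the point made above.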

Thoughts?

fcbond commented 2 years ago

This looks like a good idea. I will change the identifiers to PRJ-LG, with the following exceptions:

goodmami commented 2 years ago

I notice that the directory names follow the lexicon IDs in general but not for the English wordnets:

omw-1.4/omw-arb/omw-arb.xml
omw-1.4/omw-bg/omw-bg.xml
omw-1.4/en30/omw-en.xml
omw-1.4/en31/omw-en31.xml

Shall we normalize those?

omw-1.4/omw-en/omw-en.xml
omw-1.4/omw-en31/omw-en31.xml

fcbond commented 2 years ago

I did this as they are built differently: the English wordnets are built directly from the WN db but the others from the tab files.

But I guess the change in name does not really make this clear, so I will harmonize them.
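
The harmonized layout discussed above can be checked mechanically. The following Python sketch is an illustration only, not part of the omw-data build: `expected_path` and `mismatched` are hypothetical helper names, and it assumes that the XML file stem is the lexicon ID and that the first path component is the release directory.

```python
from pathlib import PurePosixPath


def expected_path(lexicon_id: str, release: str = "omw-1.4") -> PurePosixPath:
    """Path where the directory name follows the lexicon ID."""
    return PurePosixPath(release) / lexicon_id / f"{lexicon_id}.xml"


def mismatched(paths):
    """Return (actual, expected) pairs for paths that break the convention."""
    result = []
    for path in map(PurePosixPath, paths):
        # Assumption: the file stem (e.g. "omw-en") is the lexicon ID.
        wanted = expected_path(path.stem, path.parts[0])
        if path != wanted:
            result.append((str(path), str(wanted)))
    return result


current = [
    "omw-1.4/omw-arb/omw-arb.xml",
    "omw-1.4/omw-bg/omw-bg.xml",
    "omw-1.4/en30/omw-en.xml",
    "omw-1.4/en31/omw-en31.xml",
]

for actual, wanted in mismatched(current):
    print(f"{actual} -> {wanted}")
```

Run on the four paths quoted in this thread, it flags only the two English wordnets and suggests the normalized names proposed above.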
