nltk / nltk_data

NLTK Data
1.4k stars 1.03k forks source link

Add `omw-1.4.xml` to allow OMW 1.4 to be downloaded #176

Closed tomaarsen closed 2 years ago

tomaarsen commented 2 years ago

Hello!

Pull request overview

In https://github.com/nltk/nltk_data/pull/175, we reverted omw back to the original, and created omw-1.4.zip. At the same time, @ekaf made https://github.com/nltk/nltk/pull/2907, which uses omw-1.4 as the main Wordnet resource from now onwards. This preserves compatibility for older versions. See https://github.com/nltk/nltk/issues/2905 for more discussion surrounding OMW and OMW 1.4.

Currently the issue is that https://github.com/nltk/nltk/pull/2907 is unable to download omw-1.4, as this resource is not downloadable:

> python -m nltk.downloader omw-1.4
...
[nltk_data] Error loading omw-1.4: Package 'omw-1.4' not found in
[nltk_data]     index
Error installing package. Retry? [n/y/e]

This is because omw-1.4.xml was missing, and omw-1.4 was missing from index.xml. This PR adds those files.

I would like to ask @goodmami, @fcbond and/or @ekaf to review the packages/corpora/omw-1.4.xml. Currently, it's a near copy of omw.xml, with only the id modified from omw to omw-1.4.

If this is merged, we can move forward with https://github.com/nltk/nltk/pull/2907, which allows us to use OMW 1.4 from the next release onwards.

ekaf commented 2 years ago

It looks good, except that @fcbond mentioned in https://github.com/nltk/nltk_data/pull/171#issuecomment-985951201:

The URL for the project should change to https://omwn.org/, as compling will be retired soon. I plan to start making the new site useful from next week.

tomaarsen commented 2 years ago

@fcbond Do we want to update the webpage parameter from both omw and omw-1.4 to https://omwn.org/ already, or should we keep them as the compling pages for the time being?

Of course, if we opt to keep them as compling for now, then you can always create a PR or ask us to change the webpage value, and it can be changed in no time.

stevenbird commented 2 years ago

Let's go with the new URL.

fcbond commented 2 years ago

Yes, please go with the new URL

On Fri, 10 Dec 2021, 6:52 pm Steven Bird, @.***> wrote:

Let's go with the new URL.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/nltk/nltk_data/pull/176#issuecomment-990862751, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAIPZRVSZWTZ3JAIRLSOLBLUQHLU7ANCNFSM5JWLH6GA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

tomaarsen commented 2 years ago

Glad to see this consensus! I've made the changes. Should be all set now.