nltk / nltk_data

NLTK Data

fix(index.xml): unzip corpora so they are found by nltk.data.find #200

tuky closed 3 months ago

tuky commented 1 year ago

This PR is meant to fix https://github.com/nltk/nltk/issues/3028

ekaf commented 1 year ago

Unzipping corpora is not a fix, but a temporary workaround for the occasional cases where a corpus reader does not properly handle zipped data. Since it is increasingly common to use small devices with very limited storage space, it is preferable to fix the corpus reader rather than unzip the data.
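
To illustrate the general idea of reading from zipped data in place, here is a minimal sketch using only Python's standard `zipfile` module. It is not NLTK's corpus reader code; the helper names `read_zipped_member` and `build_demo_archive`, and the `demo_corpus/words.txt` layout, are hypothetical and exist only for this example.

```python
import io
import zipfile


def read_zipped_member(archive, member, encoding="utf-8"):
    """Return the text of one file inside a zip archive without extracting it to disk.

    `archive` may be a filesystem path or a seekable file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        with zf.open(member) as fh:
            return fh.read().decode(encoding)


def build_demo_archive():
    """Build a tiny in-memory zip standing in for a zipped corpus (hypothetical layout)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("demo_corpus/words.txt", "alpha\nbeta\ngamma\n")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    archive = build_demo_archive()
    # The member is decompressed in memory; nothing is unpacked next to the archive.
    print(read_zipped_member(archive, "demo_corpus/words.txt").splitlines())
```

A reader built along these lines keeps only the compressed archive on disk, which is the storage concern raised above.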