Closed lipsa-vlad closed 2 years ago
@lipsa-vlad, yes, I guess this behaviour can be expected, because there are limits to how much compatibility you can maintain with a version as old as NLTK 2.3.4. The problem could be that the wordnet library of that time probably lacked support for zipped data. For users who cannot upgrade to a more recent NLTK, it seems ok to just unzip the data, as you did.
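For anyone stuck on an old NLTK, the manual unzip workaround can be scripted. This is only a rough sketch: `unzip_corpus` is a hypothetical helper (not part of NLTK), and it assumes the archive sits at `<data_dir>/corpora/wordnet.zip` and contains a top-level `wordnet/` folder.

```python
import zipfile
from pathlib import Path


def unzip_corpus(data_dir, name="wordnet"):
    """Extract <data_dir>/corpora/<name>.zip in place, unless already unpacked.

    Hypothetical helper; the directory layout is an assumption.
    """
    corpora = Path(data_dir) / "corpora"
    archive = corpora / f"{name}.zip"
    target = corpora / name

    if target.is_dir():
        # Already unpacked, nothing to do.
        return target
    if not archive.is_file():
        raise FileNotFoundError(archive)

    # Assumes the archive holds a top-level "<name>/" folder,
    # so extracting into "corpora" yields "corpora/<name>/".
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(corpora)
    return target
```

You would call it with your own data directory, e.g. `unzip_corpus(Path.home() / "nltk_data")`, if that is where your download ended up.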
thanks, I guess I'll have to upgrade then.
It would be better if we could detect that you run an old NLTK version, and provide a more informative error message, or even better, just unzip the package automatically. But old NLTK versions are frozen, and stay as they are, so they cannot be improved. The only possible alternative would be to just unzip everything by default, which is not ideal either. So yes, the recommended action is to upgrade to the current NLTK version, which is probably not difficult, since you are already running Python 3.8.
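Since the old releases cannot grow such a check themselves, a user-side version check is about the only option. A minimal sketch, assuming a simple dotted version string; `version_tuple` and `is_older_than` are hypothetical helpers, and the minimum version to compare against is not stated in this thread, so you would have to pick it yourself.

```python
import re


def version_tuple(version):
    """Return the leading integer components of a dotted version string."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    return tuple(parts)


def is_older_than(installed, minimum):
    """True if `installed` sorts before `minimum` (plain tuple comparison)."""
    return version_tuple(installed) < version_tuple(minimum)
```

With NLTK installed, this could be used as `is_older_than(nltk.__version__, minimum)`, where `minimum` is whatever release you decide to require.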
The difficulty is that, as @ekaf mentioned, old NLTK versions are frozen, but for nltk_data we only expose the most recent version. This means that old NLTK versions can stop working as nltk_data is updated and changed. This is a consequence of how we host the nltk_data.
Indeed, not much to do from your side if the old version is frozen.
Even after running nltk.download() I get the following error from nltk. I checked the nltk_data folder, and the wordnet package is zipped. If I unzip it, everything works.

    File "/usr/local/lib/python3.8/site-packages/nltk/stem/wordnet.py", line 40, in lemmatize
      lemmas = wordnet._morphy(word, pos)
    File "/usr/local/lib/python3.8/site-packages/nltk/corpus/util.py", line 116, in getattr
      self.load()
    File "/usr/local/lib/python3.8/site-packages/nltk/corpus/util.py", line 81, in load
      except LookupError: raise e
    File "/usr/local/lib/python3.8/site-packages/nltk/corpus/util.py", line 78, in load
      root = nltk.data.find('{}/{}'.format(self.subdir, self.name))
    File "/usr/local/lib/python3.8/site-packages/nltk/data.py", line 653, in find
      raise LookupError(resource_not_found)
    LookupError:
    Resource 'corpora/wordnet' not found. Please use the NLTK Downloader to obtain the resource:
    >>> nltk.download()
    Searched in: