nltk / nltk_data

NLTK Data

Packages missing a corresponding .xml file #178

Closed ekaf closed 2 years ago

ekaf commented 2 years ago

Some zipped packages lack a corresponding .xml file. When that happens, 'make pkg_index' silently ignores those packages and does not add them to the index, so they cannot be downloaded.

Wouldn't it be nice if this situation triggered a warning, without necessarily raising an error? A few additional lines in nltk/downloader.py are enough to produce the following warnings:

/usr/lib64/python3.9/site-packages/nltk/downloader.py:2422: UserWarning: Missing ./packages/corpora/omw-1.4.xml!
  warnings.warn(f"Missing {xmlfilename}!")
/usr/lib64/python3.9/site-packages/nltk/downloader.py:2422: UserWarning: Missing ./packages/corpora/ptb3.xml!
  warnings.warn(f"Missing {xmlfilename}!")
/usr/lib64/python3.9/site-packages/nltk/downloader.py:2422: UserWarning: Missing ./packages/corpora/wordnet2021.xml!
  warnings.warn(f"Missing {xmlfilename}!")
/usr/lib64/python3.9/site-packages/nltk/downloader.py:2422: UserWarning: Missing ./packages/corpora/listing.csv.xml!
  warnings.warn(f"Missing {xmlfilename}!")
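
For illustration, the kind of check proposed here could look roughly like the sketch below. It is not the actual change to nltk/downloader.py: the function name warn_missing_xml and the assumption that every .zip should have a sibling .xml with the same base name are hypothetical, chosen only to mirror the warning text above.

```python
import warnings
from pathlib import Path


def warn_missing_xml(root):
    """Warn for each <name>.zip under root that has no sibling <name>.xml.

    Hypothetical helper, not part of NLTK. Returns the missing XML paths.
    """
    missing = []
    for zip_path in sorted(Path(root).rglob("*.zip")):
        # Replace only the final ".zip" suffix, so "omw-1.4.zip" maps to
        # "omw-1.4.xml" and "listing.csv.zip" maps to "listing.csv.xml".
        xml_path = zip_path.with_suffix(".xml")
        if not xml_path.exists():
            # Warn instead of raising, so the index build can continue.
            warnings.warn(f"Missing {xml_path}!")
            missing.append(xml_path)
    return missing
```

Calling warn_missing_xml("./packages") before building the index would list every package that is about to be skipped, without stopping the build.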