Closed tomaarsen closed 2 years ago
Success! The changes seem to work. I experience no more issues on Windows and on Google Colab (Linux) personally.
cc-ing some relevant devs as this might be of interest to you all: @stevenbird @nimbusaeta @pratos
Hello!
I'll keep this brief. #169 added another speech to the Inaugural dataset, but also turned `inaugural.zip` into a zip with the files directly, rather than a folder called `inaugural` which contains the files. The latter is how all corpora ought to be. https://github.com/nltk/nltk_data/issues/173#issuecomment-984970634 mentions this. As it turns out, using the most recent `nltk` does allow installing, but does not allow using `inaugural` in code.

It is not possible to force the downloader to install `inaugural` from e.g. `tomaarsen/nltk_data`, so it's quite tricky to test this PR. That said, the current system simply does not work, so I feel obligated to simply merge this in the hopes that it does indeed resolve the issue.

The new `inaugural.zip` contains a folder with the files, rather than the files directly. The line endings on the new `2021-Biden.txt` were also turned to Unix.

References
#169: Source of the bug.
#173: First report of the bug.
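
Since the downloader cannot be pointed at a fork, one way to sanity-check the archive layout locally is to inspect the zip itself. The following is only a minimal sketch using Python's standard `zipfile` module; the helper names `has_single_top_level_folder` and `build_zip`, and the entry names other than `2021-Biden.txt`, are hypothetical and not part of `nltk` or this PR.

```python
import io
import zipfile


def has_single_top_level_folder(zip_file, folder_name):
    """Return True if every entry in the archive lives under `folder_name/`."""
    with zipfile.ZipFile(zip_file) as archive:
        names = archive.namelist()
    prefix = folder_name + "/"
    # An empty archive does not count as correctly structured.
    return bool(names) and all(name.startswith(prefix) for name in names)


def build_zip(entries):
    """Build an in-memory zip from a list of entry names (illustrative data only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in entries:
            archive.writestr(name, "placeholder text")
    buffer.seek(0)
    return buffer


# Layout after #169: files sit directly at the root of the zip.
flat_zip = build_zip(["2021-Biden.txt", "README"])
# Layout restored by this PR: files sit inside an `inaugural` folder.
nested_zip = build_zip(["inaugural/2021-Biden.txt", "inaugural/README"])

print(has_single_top_level_folder(flat_zip, "inaugural"))    # False
print(has_single_top_level_folder(nested_zip, "inaugural"))  # True
```

The same helper should also accept a path to a real `inaugural.zip` on disk, since `zipfile.ZipFile` takes either a filename or a file-like object.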