Closed: jsoladur closed this issue 3 years ago
Hello @josemariasoladuran
The lemmatization generation process has not changed, but the downloaded data did, and that change caused an endless loop in the code. This is why the "Generating lemmatization..." step never finished. The bug has been fixed and the process should now complete normally.
Thanks for reporting this bug. Let me know if the problem persists.
We have a Docker image whose build runs the following command:
python -m spacy_spanish_lemmatizer download wiki
In previous weeks this command was slow, but not as slow as it is now. The Docker image build now does not get past this command even after 2 hours, whereas before the whole image was built in about 40 minutes.
`user>@<user-zenbook-ubuntu:~$ python3 -m spacy_spanish_lemmatizer download wiki
Downloading wiktionary dump from: https://dumps.wikimedia.org/eswiktionary/latest/eswiktionary-latest-pages-articles.xml.bz2 (it may take some time)
Decompressing dump file: /home//.local/lib/python3.8/site-packages/spacy_spanish_lemmatizer/tmp/eswiktionary-latest-pages-articles.xml.bz2
Parsing downloaded file...
Generating lemmatization... ... ... ... `
What happened to the wiki? Has the lemmatization generation process been modified so that it now takes longer? How can we solve this problem?
Thanks a lot. Best regards
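For build pipelines that run this download step, one way to guard against a hang like the one described above is to bound the command with a timeout, so the build fails quickly instead of running for hours. The following is only a minimal sketch, not part of spacy_spanish_lemmatizer: the helper names `run_with_timeout` and `download_or_fail` and the one-hour default limit are assumptions chosen for illustration. Only the command itself comes from this report.

```python
import subprocess
import sys

# Command taken from this report, run with the current interpreter.
DOWNLOAD_CMD = [sys.executable, "-m", "spacy_spanish_lemmatizer", "download", "wiki"]


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return its exit code, or None if it ran longer than timeout_s seconds."""
    try:
        completed = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising TimeoutExpired.
        return None
    return completed.returncode


def download_or_fail(timeout_s=60 * 60):
    """Hypothetical build step: abort if the download hangs or fails.

    The one-hour default is an arbitrary assumption, not a recommended value.
    """
    exit_code = run_with_timeout(DOWNLOAD_CMD, timeout_s)
    if exit_code is None:
        raise SystemExit(f"download step exceeded {timeout_s} seconds")
    if exit_code != 0:
        raise SystemExit(exit_code)
```

A Docker build could call `download_or_fail()` from a small script in place of invoking the module directly. This does not speed the step up; it only turns an endless wait into a visible build failure.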