centre-for-humanities-computing / DaCy

DaCy: The State of the Art Danish NLP pipeline using SpaCy
https://centre-for-humanities-computing.github.io/DaCy/
Apache License 2.0

dacy.load fails to install the model in pip 24.1 #288

Open pdworzynski opened 2 months ago

pdworzynski commented 2 months ago

How to reproduce the behaviour

With pip >= 24.1 installed, run:

import dacy
dacy_nlp = dacy.load("da_dacy_medium_trf-0.2.0")

This fails with:

Defaulting to user installation because normal site-packages is not writeable
ERROR: Invalid requirement: 'da-dacy-medium-trf==any': Expected end or semicolon (after name and no valid version specifier)
    da-dacy-medium-trf==any
                      ^
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/piotr/.local/lib/python3.11/site-packages/dacy/load.py", line 37, in load
    path = download_model(model, force=force)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/piotr/.local/lib/python3.11/site-packages/dacy/download.py", line 118, in download_model
    install(models_url[model])
  File "/home/piotr/.local/lib/python3.11/site-packages/dacy/download.py", line 81, in install
    subprocess.check_call(
  File "/opt/anaconda3/envs/jupyterhub-env-2/lib/python3.11/subprocess.py", line 413, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/opt/anaconda3/envs/jupyterhub-env-2/bin/python', '-m', 'pip', 'install', 'https://huggingface.co/chcaa/da_dacy_medium_trf/resolve/e7dba91f855a1d26679dc1ef3aa49f7874b50543/da_dacy_medium_trf-any-py3-none-any.whl', '--no-deps']' returned non-zero exit status 1.

Installing the model package directly with the new pip also fails:

pip install https://huggingface.co/chcaa/da_dacy_medium_trf/resolve/e7dba91f855a1d26679dc1ef3aa49f7874b50543/da_dacy_medium_trf-any-py3-none-any.whl

The most likely cause is that pip 24.1 enforces stricter version specifiers for packages, so the literal "any" in the version position of the wheel name is no longer accepted. This Medium article offers a good write-up of the changes.
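To illustrate why the wheel name trips the stricter check, here is a minimal sketch that pulls the version field out of a wheel filename and tests it against a simplified version pattern. The helper names (`wheel_version`, `looks_like_valid_version`) and the regular expression are assumptions made for illustration; the regex only approximates the real version grammar and is not what pip itself runs.

```python
import re

# Simplified approximation of a valid release version (hypothetical, for
# illustration only; the full grammar pip enforces is more permissive).
_SIMPLE_VERSION = re.compile(
    r"^\d+(\.\d+)*((a|b|rc)\d+)?(\.post\d+)?(\.dev\d+)?$"
)


def wheel_version(filename_or_url: str) -> str:
    """Return the version field of a wheel filename.

    Wheel names follow: name-version(-build)?-python-abi-platform.whl
    """
    filename = filename_or_url.rsplit("/", 1)[-1]
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel name layout: {filename}")
    return parts[1]


def looks_like_valid_version(version: str) -> bool:
    """Check the version against the simplified pattern above."""
    return bool(_SIMPLE_VERSION.match(version))


if __name__ == "__main__":
    name = "da_dacy_medium_trf-any-py3-none-any.whl"
    version = wheel_version(name)
    print(version, looks_like_valid_version(version))
```

For the wheel in this report the version field is the string "any", which contains no digits at all, which matches the "no valid version specifier" message in the pip error.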

Unfortunately, I'm very far from a subject-matter expert. The problem seems to originate in the Hugging Face model definition. Could this be a common issue for SpaCy models on Hugging Face? I found an unanswered Stack Overflow question with the same problem here.

Your Environment

KennethEnevoldsen commented 2 months ago

Thanks for pointing this out @pdworzynski. I sadly won't have time to fix this before next month.

However, your issue seems to also be present for SpaCy's pipelines. So I have created an issue on their repo, which you might want to follow.

A temporary solution is to downgrade pip:

pip install "pip<22"
pip install https://huggingface.co/spacy/en_core_web_sm/resolve/main/en_core_web_sm-any-py3-none-any.whl

pdworzynski commented 2 months ago

Hi Kenneth,

Thanks for confirming that this is indeed a SpaCy issue. As you suggested, I downgraded pip. Even pip==24.0 doesn't exhibit the issue, so any version below 24.1 should work.

Cheers
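For anyone who wants to detect the problem before calling dacy.load, the observation above (pip 24.0 works, pip 24.1 fails) can be turned into a small pre-flight check. This is only a sketch: the function names are hypothetical, and the 24.1 threshold is taken from this thread rather than from pip's documentation.

```python
import re
from importlib import metadata
from typing import Optional

# Threshold taken from this thread: 24.0 installs the wheel, 24.1 does not.
FIRST_FAILING_PIP = (24, 1)


def parse_major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from a version string such as '24.1.2'."""
    match = re.match(r"(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    return int(match.group(1)), int(match.group(2) or 0)


def pip_rejects_any_version(pip_version: str) -> bool:
    """True if this pip is expected to reject the '-any-' model wheels."""
    return parse_major_minor(pip_version) >= FIRST_FAILING_PIP


def installed_pip_version() -> Optional[str]:
    """Return the installed pip version, or None if pip is not installed."""
    try:
        return metadata.version("pip")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    current = installed_pip_version()
    if current is not None and pip_rejects_any_version(current):
        print(f"pip {current} will likely fail; try: pip install 'pip<24.1'")
    else:
        print(f"pip {current} should be able to install the model wheel")
```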