diyclassics / la_core_web_lg

spaCy-compatible sm/md/lg/trf core models for Latin, i.e., pipelines with a POS tagger, morphologizer, lemmatizer, dependency parser, and NER
https://huggingface.co/latincy
MIT License

error loading model: Can't find table(s) lemma_lookup for language 'la' in spacy-lookups-data #1

Open joprice opened 3 months ago

joprice commented 3 months ago

After installing the model with pip install "la-core-web-lg @ https://huggingface.co/latincy/la_core_web_lg/resolve/main/la_core_web_lg-any-py3-none-any.whl", I get the following error:

ValueError: [E955] Can't find table(s) lemma_lookup for language 'la' in spacy-lookups-data. Make sure you have the package installed or provide your own lookup tables if no default lookups are available for your language.

I see the same issue was raised here: https://github.com/explosion/spaCy/discussions/13559. Is there perhaps a compatibility issue with recent spaCy versions, or is some other step needed to get the model to load?

diyclassics commented 3 months ago

Thanks for the issue. The current pipeline requires a custom fork of spacy-lookups-data (available here: https://github.com/diyclassics/spacy-lookups-data/tree/master) for the lookup-lemmatizer component. This custom fork should be installed as a dependency when the pipeline is pip-installed from Hugging Face. That said, it looks like this does not happen if spacy-lookups-data is already installed. Is that the case for you? Can you try uninstalling spacy-lookups-data and reinstalling the pipeline, and let me know if that works?

pip uninstall spacy_lookups_data
pip install "la-core-web-lg @ https://huggingface.co/latincy/la_core_web_lg/resolve/main/la_core_web_lg-any-py3-none-any.whl"

diyclassics commented 3 months ago

If this does work, I will look into pushing the 'la' lookups data to the main Explosion repo to avoid this going forward.

joprice commented 3 months ago

It does work after removing the lookups version I had installed and installing the fork. Thank you!

kylepjohnson commented 1 month ago

Hi @diyclassics, could I re-open this? You can see my issue linked above. This is a breaking problem for the CLTK, though it is only slowly beginning to surface as people update their code and/or models, or do completely new installs.

diyclassics commented 1 month ago

The custom fork of spacy_lookups_data is listed in the requirements for the LatinCy models, cf. https://huggingface.co/latincy/la_core_web_lg/blob/936f034334836ce4c69c9a82ff3457779cc44dc9/meta.json#L1229. As such, this package should install alongside the models themselves. As I mentioned above, this may not be the case if spacy_lookups_data is already installed; pip likely treats the requirement as already satisfied and keeps the existing copy. I would have to figure out whether there is a way to force a reinstall in this situation.
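
One way to check whether an existing installation is shadowing the fork is to inspect which languages the installed spacy-lookups-data distribution registers. The following is a minimal sketch, not part of the LatinCy pipeline: the helper name `lookup_languages` is hypothetical, and it assumes the package exposes its tables through entry points in a `spacy_lookups` group keyed by language code.

```python
from importlib.metadata import PackageNotFoundError, distribution


def lookup_languages(dist_name="spacy-lookups-data", group="spacy_lookups"):
    """Return the language codes registered by the installed distribution.

    Returns None if the distribution is not installed. The entry-point
    group name is an assumption about how the package registers its tables.
    """
    try:
        dist = distribution(dist_name)
    except PackageNotFoundError:
        return None
    # Keep only entry points in the lookups group; their names are language codes.
    return {ep.name for ep in dist.entry_points if ep.group == group}


if __name__ == "__main__":
    langs = lookup_languages()
    if langs is None:
        print("spacy-lookups-data is not installed")
    elif "la" in langs:
        print("installed spacy-lookups-data registers 'la' lookups")
    else:
        print("installed spacy-lookups-data has no 'la' lookups")
```

If `la` is missing, the workaround reported to work earlier in this thread is to uninstall spacy-lookups-data and then reinstall the pipeline.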