SeldonIO / alibi

Algorithms for explaining machine learning models
https://docs.seldon.io/projects/alibi/en/stable/

AnchorText spacy v3.4.1 error #804

Open RobertSamoilescu opened 1 year ago

RobertSamoilescu commented 1 year ago

spaCy v3.4.1 (and possibly later versions) raises the following error when `AnchorText` is constructed with `sampling_strategy='similarity'`:

ValueError                                Traceback (most recent call last)
Input In [30], in <cell line: 1>()
----> 1 explainer = AnchorText(predictor=predict_fn, sampling_strategy='similarity', nlp=nlp)

File ~/miniconda3/envs/scv2/lib/python3.9/site-packages/alibi/explainers/anchors/anchor_text.py:189, in AnchorText.__init__(self, predictor, sampling_strategy, nlp, language_model, seed, **kwargs)
    186 self.model: Union['spacy.language.Language', LanguageModel]  #: Language model to be used.
    188 # validate kwargs
--> 189 self.perturb_opts, all_opts = self._validate_kwargs(sampling_strategy=sampling_strategy, nlp=nlp,
    190                                                     language_model=language_model, **kwargs)
    192 # set perturbation
    193 self.perturbation: Any = \
    194     self.CLASS_SAMPLER[self.sampling_strategy](self.model, self.perturb_opts)  #: Perturbation method.

File ~/miniconda3/envs/scv2/lib/python3.9/site-packages/alibi/explainers/anchors/anchor_text.py:225, in AnchorText._validate_kwargs(self, sampling_strategy, nlp, language_model, **kwargs)
    222         raise ValueError("spaCy model can not be `None` when "
    223                          f"`sampling_strategy` set to `{sampling_strategy}`.")
    224     # set nlp object
--> 225     self.model = load_spacy_lexeme_prob(nlp)
    226 else:
    227     if language_model is None:

File ~/miniconda3/envs/scv2/lib/python3.9/site-packages/alibi/explainers/anchors/text_samplers.py:114, in load_spacy_lexeme_prob(nlp)
    112     if 'lexeme_prob' not in nlp.vocab.lookups.tables:
    113         from spacy.lookups import load_lookups
--> 114         lookups = load_lookups(nlp.lang, ['lexeme_prob'])  # type: ignore[arg-type]
    115         nlp.vocab.lookups.add_table('lexeme_prob', lookups.get_table('lexeme_prob'))
    117 return nlp

File ~/miniconda3/envs/scv2/lib/python3.9/site-packages/spacy/lookups.py:30, in load_lookups(lang, tables, strict)
     28 if lang not in registry.lookups:
     29     if strict and len(tables) > 0:
---> 30         raise ValueError(Errors.E955.format(table=", ".join(tables), lang=lang))
     31     return lookups
     32 data = registry.lookups.get(lang)

ValueError: [E955] Can't find table(s) lexeme_prob for language 'en' in spacy-lookups-data. Make sure you have the package installed or provide your own lookup tables if no default lookups are available for your language.

This error does not appear with spaCy 3.2.1. It might be related to #255.
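Since the error message names `spacy-lookups-data` as the package expected to provide the `lexeme_prob` table, a quick first check is whether that package is installed in the failing environment at all. The following is a minimal diagnostic sketch using only the standard library; the helper name `installed_versions` and the choice of packages to inspect are my own assumptions, not part of alibi or spaCy.

```python
from importlib import metadata
from typing import Dict, Optional, Sequence

# Distribution names as published on PyPI. `spacy-lookups-data` is the
# package named in the E955 error message; the list itself is an assumption
# about what is worth inspecting.
PACKAGES = ("spacy", "spacy-lookups-data", "alibi")


def installed_versions(packages: Sequence[str] = PACKAGES) -> Dict[str, Optional[str]]:
    """Return the installed version of each package, or None if it is absent."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

If `spacy-lookups-data` shows up as not installed, that would be consistent with the E955 message, although it does not by itself explain why 3.2.1 behaves differently.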

RobertSamoilescu commented 1 year ago

I cannot reproduce the error with either 3.4.1 or 3.4.2. A dependency has probably changed.
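If a changed dependency is the suspect, one way to narrow it down is to compare `pip freeze` output from the environment that fails with output from one that works. The sketch below is only an illustration of that idea: the helper names `parse_freeze` and `changed_pins` are hypothetical, and it assumes both listings use plain `name==version` pins.

```python
from typing import Dict, Optional, Tuple


def parse_freeze(text: str) -> Dict[str, str]:
    """Parse `name==version` lines from `pip freeze` output into a dict."""
    pins: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and entries that are not simple pins
        # (for example editable or direct-URL installs).
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def changed_pins(old: str, new: str) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Map each package whose pin differs to its (old, new) versions.

    A value of None means the package is missing from that listing.
    """
    before, after = parse_freeze(old), parse_freeze(new)
    return {
        name: (before.get(name), after.get(name))
        for name in sorted(set(before) | set(after))
        if before.get(name) != after.get(name)
    }
```

Calling `changed_pins(failing_freeze, working_freeze)` with the two listings as strings returns only the packages that were added, removed, or re-pinned, which is usually a much shorter list to inspect than the full environment.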