nikitakit / self-attentive-parser

High-accuracy NLP parser with models for 11 languages.
https://parser.kitaev.io/
MIT License

Error loading German model #103

Open tannonk opened 1 year ago

tannonk commented 1 year ago

Hi,

I'd like to use benepar to parse German. However, when I try to add the German model to spaCy's nlp pipeline, I get the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../lib/python3.10/site-packages/spacy/language.py", line 814, in add_pipe
    pipe_component = self.create_pipe(
  File ".../lib/python3.10/site-packages/spacy/language.py", line 702, in create_pipe
    resolved = registry.resolve(cfg, validate=validate)
  File ".../lib/python3.10/site-packages/confection/__init__.py", line 756, in resolve
    resolved, _ = cls._make(
  File ".../lib/python3.10/site-packages/confection/__init__.py", line 805, in _make
    filled, _, resolved = cls._fill(
  File ".../lib/python3.10/site-packages/confection/__init__.py", line 877, in _fill
    getter_result = getter(*args, **kwargs)
  File ".../lib/python3.10/site-packages/benepar/integrations/spacy_plugin.py", line 176, in create_benepar_component
    return BeneparComponent(
  File ".../lib/python3.10/site-packages/benepar/integrations/spacy_plugin.py", line 116, in __init__
    self._parser = load_trained_model(name)
  File ".../lib/python3.10/site-packages/benepar/integrations/downloader.py", line 34, in load_trained_model
    parser = ChartParser.from_trained(model_path)
  File ".../lib/python3.10/site-packages/benepar/parse_chart.py", line 186, in from_trained
    parser.load_state_dict(state_dict)
  File ".../lib/python3.10/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for ChartParser:
        Unexpected key(s) in state_dict: "pretrained_model.embeddings.position_ids".

To reproduce:

import spacy
import benepar

nlp = spacy.load('de_core_news_md')
nlp.add_pipe("benepar", config={"model": "benepar_de2"})

Libraries:

torch                    2.0.1
torch-struct             0.5
spacy                    3.6.1
benepar                  0.2.0

If I swap out the models for their English counterparts (en_core_web_md, benepar_en3), it runs fine. Any idea why the German model fails to load?

tannonk commented 1 year ago

Update: I found a simple workaround. Passing strict=False to the load_state_dict call on L186 of parse_chart.py seems to suffice.
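
For anyone who would rather not disable strict checking altogether, an alternative is to drop only the keys the model does not define before loading. The following is a minimal sketch, not benepar's own code: the helper drop_unexpected_keys and the toy key "encoder.weight" are hypothetical, and only the position_ids key is taken from the traceback above. It uses plain dicts so it runs without torch.

```python
from collections import OrderedDict

# The key named in the traceback above; other checkpoints may carry different stale keys.
STALE_KEY = "pretrained_model.embeddings.position_ids"


def drop_unexpected_keys(state_dict, expected_keys):
    """Return (cleaned copy, dropped keys), keeping only keys the model defines.

    Hypothetical helper for illustration; it is not part of benepar or torch.
    """
    expected = set(expected_keys)
    kept = OrderedDict((k, v) for k, v in state_dict.items() if k in expected)
    dropped = [k for k in state_dict if k not in expected]
    return kept, dropped


# Toy stand-in for a checkpoint: one key the model knows, one it does not.
# "encoder.weight" is a made-up name, not a real benepar parameter.
checkpoint = OrderedDict([
    ("encoder.weight", [0.1, 0.2]),
    (STALE_KEY, [0, 1, 2]),
])
model_keys = ["encoder.weight"]

cleaned, dropped = drop_unexpected_keys(checkpoint, model_keys)
print("dropped:", dropped)

# With a real torch module the same idea would look roughly like this (untested here):
#   cleaned, dropped = drop_unexpected_keys(state_dict, parser.state_dict().keys())
#   parser.load_state_dict(cleaned)  # strict checking stays on for everything else
```

Compared with strict=False, this still raises if a key the model needs is missing from the checkpoint, and the dropped list makes it visible what was ignored.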

sujoung commented 1 year ago

I think I hit the same error recently. For me, it went away when I downgraded the transformers library to transformers==4.30.2.

theDebbister commented 11 months ago

I had the same error. Downgrading the transformers library and additionally downgrading protobuf to protobuf==3.20.0 worked for me.
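
To check whether an environment is newer than the versions reported to work in this thread, something like the sketch below may help. The pins (transformers 4.30.2, protobuf 3.20.0) come only from the comments above and are not an official compatibility matrix; the helper names release_tuple and newer_than_reported are made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions reported as working in this thread; not an official compatibility matrix.
REPORTED_WORKING = {"transformers": "4.30.2", "protobuf": "3.20.0"}


def release_tuple(version_string):
    """Turn '4.30.2' into (4, 30, 2), stopping at the first non-numeric part."""
    parts = []
    for piece in version_string.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def newer_than_reported(installed, reported):
    """True if the installed release is strictly newer than the reported one."""
    return release_tuple(installed) > release_tuple(reported)


for package, reported in REPORTED_WORKING.items():
    try:
        installed = version(package)
    except PackageNotFoundError:
        print(f"{package}: not installed")
        continue
    if newer_than_reported(installed, reported):
        print(f"{package} {installed}: newer than {reported}, the version reported to work")
    else:
        print(f"{package} {installed}: at or below {reported}")
```

If a package shows up as newer, pinning it to the reported version (for example pip install transformers==4.30.2) is the downgrade described above.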

bisserai commented 8 months ago

Thanks, this fixed the same issue in French too!