Liebeck / spacy-iwnlp

German lemmatization with IWNLP as extension for spaCy
MIT License

OSError: [E050] #3

Closed: kylefoley76 closed this issue 4 years ago

kylefoley76 commented 4 years ago

I cannot run this line of code:

    nlp = spacy.load('de')

This is the error message:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/kylefoley/codes/venv/lib/python3.8/site-packages/spacy/__init__.py", line 30, in load
    return util.load_model(name, **overrides)
  File "/Users/kylefoley/codes/venv/lib/python3.8/site-packages/spacy/util.py", line 175, in load_model
    raise IOError(Errors.E050.format(name=name))
OSError: [E050] Can't find model 'de'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory.

Also, where am I supposed to put the JSON file? The instructions say: "Download the latest processed IWNLP dump from http://lager.cs.uni-duesseldorf.de/NLP/IWNLP/IWNLP.Lemmatizer_20181001.zip and unzip it."

I tried putting it in the same directory that I run the code from. Is that correct?

kylefoley76 commented 4 years ago

OK, I figured out that the path to the JSON file has to go here:

    iwnlp = spaCyIWNLP(lemmatizer_path='data/IWNLP.Lemmatizer_20181001.json')

I can run this line of code:

    b = spacy.load('de_core_news_sm')

But I can't run this one:

    nlp = spacy.load('de')

Seems rather strange to me.
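
A likely explanation, assuming spaCy 2.x: the error text itself says 'de' is not "a shortcut link, a Python package or a valid path". Short names such as 'de' are shortcut links, which exist only after running something like `python -m spacy download de` or `python -m spacy link`, whereas an installed model package can always be loaded under its full name. The sketch below tries several names in order and returns the first that loads. `load_first_available` and its `loader` parameter are hypothetical names introduced here for illustration; they are not part of spaCy or spacy-iwnlp.

```python
def load_first_available(names, loader=None):
    """Return (name, model) for the first model name that loads.

    `loader` defaults to spacy.load; it is a parameter only so the
    helper can be exercised without spaCy installed (an assumption
    made for this sketch, not a spaCy feature).
    """
    if loader is None:
        import spacy  # imported lazily, only when no loader is supplied
        loader = spacy.load

    failures = []
    for name in names:
        try:
            return name, loader(name)
        except OSError as exc:  # the traceback above shows E050 surfacing as OSError
            failures.append('{}: {}'.format(name, exc))

    # Nothing loaded: report every name that was tried.
    raise OSError('No model could be loaded:\n' + '\n'.join(failures))
```

With this in place, `name, nlp = load_first_available(['de', 'de_core_news_sm'])` would fall through to the full package name when the shortcut link is missing.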

kylefoley76 commented 4 years ago

Alright, I got it working. I had to write out the full path to the JSON file and change 'de' to the full model name, 'de_core_news_sm':

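    import spacy
    from spacy_iwnlp import spaCyIWNLP

    # Load the German model by its full package name ('de' did not resolve here).
    nlp = spacy.load('de_core_news_sm')
    # Point the lemmatizer at the unzipped IWNLP dump using an absolute path.
    iwnlp = spaCyIWNLP(lemmatizer_path='/users/kylefoley/codes/venv/lib/python3.8/site-packages/spacy/data/IWNLP.Lemmatizer_20181001.json')
    nlp.add_pipe(iwnlp)
    doc = nlp('Wir mögen Fußballspiele mit ausgedehnten Verlängerungen.')
    for token in doc:
        print('POS: {}\tIWNLP:{}'.format(token.pos_, token._.iwnlp_lemmas))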
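
The relative path 'data/IWNLP.Lemmatizer_20181001.json' presumably failed because it is resolved against the directory the script is started from; that is an assumption about how the file is opened, not something confirmed in this thread. A minimal sketch of a helper that turns the file name into an absolute path and fails early with a clear message if the dump is missing follows. `resolve_lemmatizer_path` and `base_dir` are hypothetical names used only for illustration.

```python
from pathlib import Path


def resolve_lemmatizer_path(filename, base_dir=None):
    """Return an absolute path to the IWNLP JSON dump as a string.

    Relative file names are resolved against `base_dir`, or against the
    current working directory when `base_dir` is None. Absolute file
    names are returned unchanged apart from normalisation.
    """
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    path = (base / Path(filename).expanduser()).resolve()
    if not path.is_file():
        # Fail here rather than deep inside the pipeline component.
        raise FileNotFoundError('IWNLP dump not found at {}'.format(path))
    return str(path)
```

The returned string could then be passed as `lemmatizer_path`, so the code no longer depends on which directory it is run from.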