TakeLab / spacy-udpipe

spaCy + UDPipe
MIT License

How do I customise the model behaviour and save locally? #41

Open joakimwar opened 2 years ago

joakimwar commented 2 years ago

Hi. I am using this library to lemmatize Swedish text. I would like to load the model from a local directory and improve it iteratively as I find lemmatization mistakes. With regular spaCy models I can simply edit the exceptions table (lemma_exc), then save and reload the pipeline with the to_disk and from_disk methods:

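import spacy

nlp = spacy.load(local_path)  # local_path: directory of a previously saved spaCy pipeline
lemmatizer = nlp.get_pipe('lemmatizer')
# Override the lemma for one word in the exceptions table, then save the pipeline back to disk
lemmatizer.lookups.get_table("lemma_exc")["noun"]["word"] = ["whatever"]
nlp.to_disk(local_path)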

Is there an equivalent for models from spacy_udpipe? I see the load_from_path function, but can I change how words are lemmatized, and how do I save the model locally? As for saving the model, this does not work for me:

import spacy_udpipe

spacy_udpipe.download('sv')
nlp = spacy_udpipe.load('sv')
nlp.to_disk('my_model')
nlp_from_local = spacy_udpipe.load_from_path(lang='sv', path='my_model')

Calling the nlp_from_local object on a text then raises:

AttributeError: 'NoneType' object has no attribute 'newTokenizer'
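
For illustration, the workflow described above (record a corrected lemma, save, reload) can be sketched without either library, as a small override table stored as JSON. This is only a sketch under stated assumptions: the class name LemmaOverrides, the file name overrides.json and the POS-then-word keying are hypothetical and are not part of spaCy or spacy_udpipe. The predicted argument stands in for whatever lemma the model returned.

```python
import json
from pathlib import Path


class LemmaOverrides:
    """Hypothetical helper: manual lemma corrections applied after the model runs."""

    def __init__(self):
        # Maps pos -> {word: lemma}, loosely mirroring the lemma_exc layout.
        self.table = {}

    def add(self, pos, word, lemma):
        # Keys are lowercased so lookups are case-insensitive.
        self.table.setdefault(pos.lower(), {})[word.lower()] = lemma

    def apply(self, word, pos, predicted):
        # Fall back to the model's lemma when no correction is recorded.
        return self.table.get(pos.lower(), {}).get(word.lower(), predicted)

    def to_disk(self, path):
        text = json.dumps(self.table, ensure_ascii=False, indent=2)
        Path(path).write_text(text, encoding="utf-8")

    @classmethod
    def from_disk(cls, path):
        obj = cls()
        obj.table = json.loads(Path(path).read_text(encoding="utf-8"))
        return obj
```

With a processed document, such a table would be consulted per token, for example overrides.apply(token.text, token.pos_, token.lemma_). The corrections then live in their own file and do not depend on how the underlying model is saved or loaded.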