Closed UtkarshKhare closed 6 years ago
This is the entity file used in the above code for NAME, COMPANY and DATE.
The problem here is that when you save out the model, spaCy will serialize the data and config – but not your arbitrary code like the entity matcher component. When you save out the model, your pipeline in the meta may look something like this: "pipeline": ["parser", "ner", "entity1"] etc. When you load back the model, spaCy needs to resolve those strings back to the components, so it will look them up in the factories in Language.factories. This works fine for the built-in components – but not for "entity1", because spaCy doesn't know what that is. You can find more about this in the pipeline components documentation.
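The lookup described above can be pictured with a toy registry. This is a simplified stand-in, not spaCy's actual implementation: `FACTORIES`, `create_pipe`, `load_pipeline` and the placeholder components are hypothetical names chosen for illustration only.

```python
# Toy model of resolving pipeline names from a model's meta to factories.
# FACTORIES, create_pipe and load_pipeline are hypothetical stand-ins,
# not spaCy's own code.

FACTORIES = {
    "parser": lambda nlp, **cfg: "<parser component>",
    "ner": lambda nlp, **cfg: "<ner component>",
}


def create_pipe(name, nlp=None, **cfg):
    """Look up a component by its string name and build it."""
    if name not in FACTORIES:
        raise KeyError("Can't find factory for '{}'.".format(name))
    return FACTORIES[name](nlp, **cfg)


def load_pipeline(meta):
    """Build every component listed in the saved meta, in order."""
    return [create_pipe(name) for name in meta["pipeline"]]


meta = {"pipeline": ["parser", "ner", "entity1"]}

try:
    load_pipeline(meta)
    error_message = None
except KeyError as err:
    # The built-in names resolve; the custom name has no registered factory.
    error_message = err.args[0]
```

In this sketch the two built-in names resolve, while "entity1" fails because nothing was ever registered under that name.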
In the future, spaCy will be solving this problem via entry points, which will let you wrap your component as a Python package and tell spaCy how to resolve the component string names. My PR #2348 will be included in the upcoming nightly release (and v2.1.0) and the description includes a detailed example and some background on entry points. From v2.1.0 on, this will be the recommended best practice for managing models and custom pipeline component dependencies.
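As a rough illustration of how entry points let a library discover code shipped in other installed packages, here is a sketch using only the standard library. The group name `spacy_factories` is an assumption on my part, and `find_factory_entry_points` is a hypothetical helper, not part of spaCy; the PR description remains the authoritative reference.

```python
# Sketch: discover component factories advertised by installed packages.
# Assumption: the entry point group is named "spacy_factories".
from importlib.metadata import entry_points


def find_factory_entry_points(group="spacy_factories"):
    """Return {component name: entry point} for the given group."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Newer Python versions expose a select() method.
        selected = eps.select(group=group)
    else:
        # Python 3.8 and 3.9 return a plain dict keyed by group name.
        selected = eps.get(group, [])
    return {ep.name: ep for ep in selected}


found = find_factory_entry_points()
# Calling found[name].load() would import and return the advertised object.
```

If no installed package advertises anything in that group, `found` is simply an empty dict.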
For now, here are three main solutions:
Disable the custom components during serialization:
with nlp.disable_pipes('entity1', 'entity2'):
    nlp.to_disk('/path/to/model')
And add them back when you load in the model:
nlp = spacy.load('/path/to/model')
nlp.add_pipe(entity1)
nlp.add_pipe(entity2)
Add the component's factory to Language.factories:
A factory is a function that takes the nlp object and optional config parameters and initialises the component. You can find more details here.
from spacy.language import Language
Language.factories['entity1'] = lambda nlp, **cfg: EntityMatcher(nlp, **cfg)
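To make the factory signature concrete, here is a minimal sketch with a stand-in component. The `EntityMatcher` below is a hypothetical placeholder rather than the class from this issue, `factories` is a plain dict standing in for `Language.factories`, and the `terms` and `label` config keys are assumptions made for illustration.

```python
# Sketch of a factory: a callable taking the nlp object plus optional config.
# EntityMatcher, `terms` and `label` are illustrative; `factories` is a plain
# dict standing in for Language.factories.


class EntityMatcher(object):
    name = "entity1"

    def __init__(self, nlp, terms=(), label="ENTITY"):
        self.nlp = nlp
        self.terms = list(terms)
        self.label = label


factories = {}
# Register the factory under the same string name that appears in the
# pipeline list of the saved model's meta.
factories["entity1"] = lambda nlp, **cfg: EntityMatcher(nlp, **cfg)

# Resolving the name later yields a fresh component; config is forwarded.
nlp = object()  # placeholder for a real Language instance
component = factories["entity1"](nlp, terms=["Acme Corp"], label="COMPANY")
```

Note that in this sketch the factory only rebuilds the component from whatever config it receives. Any patterns still have to come from that config or be loaded by the component itself.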
Add the component to the model's __init__.py (advanced):
Models are Python packages, so when you load an installed model, spaCy will import the package and call its load() method. All code present in the model's __init__.py will be executed, too, and you can ship any custom code with a model. This solution requires you to package your model using the spacy package command and edit the __init__.py to add your component and its factory. Also note the infobox and possible caveats described here.
Thank you !!
@ines Thank you for your reply, but my issue is somewhat different. I want the model to persist the data, i.e. I don't want to add the patterns to the PhraseMatcher component every time I load the model. I want to save the model such that I don't need to re-add the patterns the next time I load it. Is there any way around this?
Hi @ines @honnibal, I am sharing my code to find out what needs to be added to it in order to load the NLP model.
CODE
Saving NLP
Loading NLP
ERROR: while loading it back from the same location, I am getting the following error:
KeyError: "Can't find factory for 'entity1'."
Looking forward to your immediate help!