Open pafonta opened 3 years ago
However, we might want or need to distribute our models in a packaged form.
Currently a spaCy pipeline is loaded with a simple spacy.load(), and this also includes the EntityRuler component.
Unless we need registered functions at some point, is there really a strong benefit to having a model that is pip installable?
this also includes the EntityRuler component
There are 2 pipelines for each modelX: one in data_and_models/models/ner/ and one in data_and_models/models/ner_er/. So the EntityRuler is loaded only if one uses the second directory with spacy.load(). Just to clarify: whether the EntityRuler is loaded is a separate discussion from whether to package the model. Or did you have something else in mind?
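To illustrate the two variants described above, here is a minimal sketch. It assumes spaCy v3 and that each pipeline sits in a subdirectory named after the model; that layout, the helper names, and the model name `model1` are hypothetical and not confirmed in this thread.

```python
from pathlib import Path

# Directory names come from this thread; the per-model subdirectory
# layout below them is an assumption made for illustration.
MODELS_ROOT = Path("data_and_models/models")


def pipeline_dir(model_name: str, with_entity_ruler: bool) -> Path:
    """Return the directory of the requested pipeline variant."""
    variant = "ner_er" if with_entity_ruler else "ner"
    return MODELS_ROOT / variant / model_name


def load_pipeline(model_name: str, with_entity_ruler: bool):
    """Load one pipeline variant from disk with spacy.load()."""
    # Imported lazily so the path helper above works without spaCy installed.
    import spacy

    nlp = spacy.load(pipeline_dir(model_name, with_entity_ruler))
    # nlp.pipe_names lists the loaded components; an EntityRuler would
    # typically appear as "entity_ruler" unless it was added under another name.
    return nlp
```

Calling `load_pipeline("model1", with_entity_ruler=True).pipe_names` would then show whether the EntityRuler is part of the loaded pipeline, independently of whether the model is packaged.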
Unless at some point we should have registered functions
That's indeed a case where packaging models would be handy.
is there really a strong benefit from having a model that is pip installable?
I can think of 4 benefits:
Just a note: custom architectures can be distributed as Python packages that plug into spaCy via entry points.
Documentation: https://spacy.io/usage/saving-loading#entry-points-components
Example: spacy-transformers, see these lines
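As a rough sketch of how this can be inspected, the snippet below lists the entry points that installed packages expose in two of the groups spaCy reads (`spacy_factories` and `spacy_architectures`). It uses only the standard library; the helper name `list_entry_points` is hypothetical, and the lists are simply empty if no package registers anything.

```python
from importlib import metadata


def list_entry_points(group: str) -> list:
    """Return sorted (name, target) pairs of installed entry points in a group."""
    eps = metadata.entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: selectable entry points
        selected = eps.select(group=group)
    else:
        # Python 3.8 / 3.9: a dict mapping group name -> entry points
        selected = eps.get(group, [])
    return sorted((ep.name, ep.value) for ep in selected)


for group in ("spacy_factories", "spacy_architectures"):
    print(group, list_entry_points(group))
```

With spacy-transformers installed, its registered architectures and its component factory should show up in these lists.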
🚀 Feature
Package the NER models we trained.
Motivation
Make the NER models pip installable and easily distributable.
Pitch
As we track the models with DVC, we could retrieve them if needed.
However, we might want or need to distribute our models in a packaged form.
Besides, packaging a model would let us distribute registered functions and custom components (EntityRuler?) with it.
This issue is a reminder to have this discussion.
Additional context
Reference: https://spacy.io/api/cli#package.
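For the discussion, here is a sketch of what invoking the referenced command could look like, assuming spaCy v3.1 or later. The helper `build_package_command`, the input and output paths, the package name, and the `functions.py` file are all hypothetical; only the `spacy package` options (`--name`, `--version`, `--code`, `--build`) come from the spaCy CLI.

```python
import sys


def build_package_command(input_dir, output_dir, name=None, version=None,
                          code_paths=(), build="sdist"):
    """Assemble the argument list for `python -m spacy package`."""
    cmd = [sys.executable, "-m", "spacy", "package",
           str(input_dir), str(output_dir), "--build", build]
    if name:
        cmd += ["--name", name]
    if version:
        cmd += ["--version", version]
    if code_paths:
        # --code takes comma-separated Python files that are shipped with the
        # package, e.g. files holding registered functions or custom components.
        cmd += ["--code", ",".join(str(p) for p in code_paths)]
    return cmd


# Hypothetical paths and names, for illustration only.
cmd = build_package_command(
    "data_and_models/models/ner_er/model1",
    "packages",
    name="ner_er_model1",
    version="0.1.0",
    code_paths=["functions.py"],
)
print(" ".join(cmd))
```

Passing `cmd` to `subprocess.run(cmd, check=True)` would run the packaging step; the resulting archive in the output directory could then be installed with pip and loaded by name with spacy.load().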