Closed NixBiks closed 3 years ago
I just realized that I only have to implement a load function in the root of my package, e.g.

    from typing import Iterable

    def load(vocab: bool, disable: Iterable[str], exclude: Iterable[str], config):
        # Build and return the pipeline here instead of reading it from disk.
        from spacy.lang.en import English
        return English()
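For anyone wondering why this works: as far as I understand it, when spacy.load is given the name of an installed package, it imports that package and calls its load function with the keyword arguments shown above. Below is a minimal stdlib-only sketch of that dispatch, not spaCy's actual code. FakePipeline, load_from_package, and the package name my_pipeline are hypothetical names made up for illustration.

```python
import importlib
import sys
import types
from typing import Any


class FakePipeline:
    """Hypothetical stand-in for a spaCy Language object (illustration only)."""

    def __init__(self, name: str) -> None:
        self.name = name


def _make_package(name: str) -> types.ModuleType:
    """Create an in-memory module that exposes a top-level load() function."""
    module = types.ModuleType(name)

    def load(vocab=True, disable=(), exclude=(), config=None) -> FakePipeline:
        # A real package would build and return its pipeline here.
        return FakePipeline(name)

    module.load = load
    return module


# Pretend "my_pipeline" is an installed package (hypothetical name).
sys.modules["my_pipeline"] = _make_package("my_pipeline")


def load_from_package(name: str, **kwargs: Any) -> FakePipeline:
    """Assumed dispatch: import the package by name, then call its load()."""
    package = importlib.import_module(name)
    return package.load(**kwargs)


nlp = load_from_package(
    "my_pipeline", vocab=True, disable=[], exclude=[], config={}
)
```

The point is only that the package decides what load returns, so nothing has to be serialized with nlp.to_disk first.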
I have a pipeline that builds on spacy.lang.en.English. I replace the tokenizer and add some custom components. Now spacy_streamlit uses spacy.load to load models. Is it possible to register my pipeline so that it is loadable via spacy.load?

I am aware that I can call nlp.to_disk on spacy.lang.en.English with my replaced tokenizer and that I can register my components using entry_points, but I'd rather not have to call nlp.to_disk (e.g. I shouldn't keep the output in my git repo, and it seems unnecessary!?).

Another alternative is to make spacy.lang.en.English with my replaced tokenizer its own language and add that to entry_points, but that feels kinda wrong, and then I wouldn't be able to get the lexeme normalization table from spacy-lookups-data.

I hope this makes sense.