Hi, I don't have time to open a PR right now, so this is just to let you know that simply excluding NER from the spaCy pipeline gives roughly a 2x speedup (at least when processing lots of short sentences).
You can do so by changing line 158 of core.py from
to
And most likely, you could also add a separate case (like self.nlp_nosyntax = spacy.load(spacy_lang, exclude=[...])) for matching without syntax, where you can exclude most other components as well and get an even larger speedup.
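
For illustration, here is a rough sketch of what the two cases could look like. It is not the actual code from core.py: the helper names and the choice of components to drop in the no-syntax case are my assumptions, and which components are really safe to exclude depends on what the matcher reads from the Doc. The only real spaCy call used is spacy.load with its exclude argument (spaCy v3).

```python
from typing import List

# Assumption: component names follow spaCy's standard trained pipelines
# ("ner", "parser", ...). Check nlp.pipe_names for the model you actually use.
EXCLUDE_WITH_SYNTAX = ["ner"]
# Hypothetical choice for matching without syntax: also drop the parser.
EXCLUDE_NO_SYNTAX = ["ner", "parser"]


def components_to_exclude(needs_syntax: bool) -> List[str]:
    """Return the pipeline components to skip for the given matching mode."""
    source = EXCLUDE_WITH_SYNTAX if needs_syntax else EXCLUDE_NO_SYNTAX
    return list(source)  # copy so callers cannot mutate the constants


def load_pipeline(spacy_lang: str, needs_syntax: bool = True):
    """Load a spaCy pipeline without the components the matcher does not need."""
    import spacy  # imported here so the helper above works without spaCy installed

    return spacy.load(spacy_lang, exclude=components_to_exclude(needs_syntax))
```

Usage would be something like nlp = load_pipeline(spacy_lang) for the normal case and nlp_nosyntax = load_pipeline(spacy_lang, needs_syntax=False) for the other one. Excluded components are not loaded at all, so they cannot be re-enabled later; if that is needed, the disable argument of spacy.load is the alternative.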