Tests during the PDM code meeting this morning showed that one of the nlp() calls in analytic_tools.retrieve_citations() accounts for most of the function's run time. Specifically, the call to the "minimal" dictionary, which is trained to detect neutral/CanLII citations, statutes, and section numbers, took more than 75% of the function's computing time. The text classification call, by contrast, ran surprisingly fast.
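For anyone wanting to reproduce this kind of measurement, here is a minimal sketch of per-stage timing using only the standard library. It is not the code used in the meeting. The names timed, time_shares, minimal_model, text_classifier and retrieve_citations_sketch are hypothetical, and the two model functions are stand-ins that just sleep; in the real function the nlp() calls would be wrapped instead.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated wall-clock seconds per labelled stage.
stage_seconds = defaultdict(float)


@contextmanager
def timed(label):
    """Add the elapsed time of the enclosed block to stage_seconds[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        stage_seconds[label] += time.perf_counter() - start


def time_shares():
    """Return each stage's fraction of the total recorded time."""
    total = sum(stage_seconds.values())
    if not total:
        return {}
    return {label: secs / total for label, secs in stage_seconds.items()}


# Hypothetical stand-ins for the two model calls (assumption: the real
# code would call the loaded nlp() pipelines here instead of sleeping).
def minimal_model(text):
    time.sleep(0.05)
    return []


def text_classifier(text):
    time.sleep(0.005)
    return {}


def retrieve_citations_sketch(text):
    """Run both stages, recording how long each one takes."""
    with timed("minimal_model"):
        spans = minimal_model(text)
    with timed("text_classifier"):
        cats = text_classifier(text)
    return spans, cats


# Made-up sample input, for illustration only.
retrieve_citations_sketch("Example text citing 2020 SCC 5 and s. 7.")
shares = time_shares()
for label, share in shares.items():
    print(f"{label}: {share:.0%} of recorded time")
```

Wrapping each call separately makes it possible to re-run the same measurement after trying one of the fixes and compare the shares directly.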
This problem may be fixable in one or more ways:
Reducing the amount of data that gets loaded into the function;
Using an NER model rather than a span model (NER seems to train and respond faster than spans); or