646e62 / case-brief

Generates a FIRAC-style case brief from a reported decision
GNU General Public License v3.0

Analytic tools running slowly #34

Closed: 646e62 closed this issue 1 year ago

646e62 commented 1 year ago

Tests during this morning's PDM code meeting showed that one of the nlp() calls in analytic_tools.retrieve_citations() was the main bottleneck. Specifically, the "minimal" dictionary trained to detect neutral/CanLII citations, statutes, and section numbers accounted for more than 75% of the function's computing time. By contrast, the text classification function ran surprisingly fast.
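For anyone wanting to reproduce this kind of measurement, here is a minimal timing sketch. It is not the code used in the meeting: `time_stages`, `fake_span_model`, and `fake_classifier` are hypothetical stand-ins (the fakes just sleep), and in practice each stage would wrap one of the real nlp() calls inside retrieve_citations().

```python
import time
from typing import Callable, Dict


def time_stages(stages: Dict[str, Callable[[str], object]], text: str) -> Dict[str, float]:
    """Run each stage once on `text` and return its share of total wall time."""
    elapsed = {}
    for name, stage in stages.items():
        start = time.perf_counter()
        stage(text)
        elapsed[name] = time.perf_counter() - start
    total = sum(elapsed.values())
    return {name: seconds / total for name, seconds in elapsed.items()}


# Hypothetical stand-ins: replace these with the real pipeline calls.
def fake_span_model(text: str) -> list:
    time.sleep(0.03)  # pretend this is the slow "minimal" model
    return []


def fake_classifier(text: str) -> str:
    time.sleep(0.005)  # pretend this is the fast text classifier
    return "label"


shares = time_stages(
    {"span_model": fake_span_model, "classifier": fake_classifier},
    "R v Smith, 2020 SCC 5",
)
for name, share in shares.items():
    print(f"{name}: {share:.0%} of measured time")
```

Timing a single run is noisy, so averaging over several documents (or using cProfile from the standard library) would give a more trustworthy breakdown.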

This problem may be fixable in one or more of the following ways:

  1. Reducing the amount of data that gets loaded into the function;
  2. Using a NER model rather than a span model (NER seems to train and respond faster than spans); or
  3. Substituting some ML tasks with regex.
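As a rough illustration of option 3, the sketch below uses regular expressions for the most regular patterns. The patterns and the `extract_citations` helper are assumptions made for this example, not code from the repository, and they only cover common shapes such as `2020 SCC 5`, `2005 CanLII 12345 (ON SC)`, and `s. 24(2)`. Statute names are much less regular and would probably still need a model.

```python
import re
from typing import Dict, List

# Hypothetical patterns; real citations have more variants than these cover.
NEUTRAL = re.compile(r"\b(?:19|20)\d{2}\s+[A-Z]{2,8}\s+\d+\b")
CANLII = re.compile(
    r"\b(?:19|20)\d{2}\s+CanLII\s+\d+\s+\([A-Z]{2,4}(?:\s+[A-Z]{2,5})?\)"
)
SECTION = re.compile(r"\bss?\.\s*\d+(?:\.\d+)?(?:\(\w+\))*")


def extract_citations(text: str) -> Dict[str, List[str]]:
    """Return every match for each pattern, keyed by citation type."""
    return {
        "neutral": NEUTRAL.findall(text),
        "canlii": CANLII.findall(text),
        "sections": SECTION.findall(text),
    }


sample = (
    "In R v Smith, 2020 SCC 5, the Court applied s. 24(2) of the Charter; "
    "see also 2005 CanLII 12345 (ON SC)."
)
found = extract_citations(sample)
print(found)
```

Compiled patterns like these run without loading a model at all, so they could also serve as a cheap first pass before any nlp() call.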

bbelderbos commented 1 year ago

Cool, good we did that initial investigation.