guardian / typerighter

Make artefact creation and read more efficient #452

Closed by rhystmills 1 year ago

rhystmills commented 1 year ago

What does this change?

The Checker and Rule Manager services frequently crash due to OutOfMemory errors, especially locally. Watching the service run while measuring memory usage, we have seen that memory spikes correlate with the point at which the artefact is serialised to JSON by the Rule Manager and read by the Checker.

At this point, the services potentially hold three large objects in memory at once: the List[CheckerRule] (containing >250,000 entries), the JsValue representation of that list, and the string representation of that JSON (~50 MB in size).

This PR writes the artefact as newline-delimited JSON, iterating through the rules and writing the bytes for each individual rule to the artefact in turn. This means we don't need to hold as many objects in memory at any one time, and it gives the JVM garbage collector more opportunities to reclaim memory.
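A minimal sketch of the rule-by-rule write, not the code in this PR. `SketchRule`, `toJsonLine` and `writeRules` are hypothetical names, and the hand-rolled escaping stands in for whatever JSON library the services really use; it is there only so the example has no dependencies.

```scala
import java.io.{BufferedWriter, OutputStream, OutputStreamWriter}
import java.nio.charset.StandardCharsets

object NdjsonWriteSketch {
  // Hypothetical stand-in for CheckerRule; the real type has more fields.
  final case class SketchRule(id: String, pattern: String)

  // Minimal JSON string escaping (assumption: real code would use a JSON library).
  // Escaping '\n' matters here: each record must stay on exactly one line.
  def escape(s: String): String = {
    val sb = new StringBuilder
    s.foreach {
      case '"'          => sb.append("\\\"")
      case '\\'         => sb.append("\\\\")
      case '\n'         => sb.append("\\n")
      case '\r'         => sb.append("\\r")
      case '\t'         => sb.append("\\t")
      case c if c < ' ' => sb.append("\\u%04x".format(c.toInt))
      case c            => sb.append(c)
    }
    sb.toString
  }

  // Serialises a single rule to a single line of JSON.
  def toJsonLine(rule: SketchRule): String =
    s"""{"id":"${escape(rule.id)}","pattern":"${escape(rule.pattern)}"}"""

  // Writes one rule per line and returns the number of rules written.
  // Only the current rule's JSON string is live at any point, so earlier
  // strings become garbage as soon as they have been written.
  def writeRules(rules: Iterator[SketchRule], out: OutputStream): Int = {
    val writer = new BufferedWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8))
    var count = 0
    try {
      rules.foreach { rule =>
        writer.write(toJsonLine(rule))
        writer.write("\n")
        count += 1
      }
    } finally writer.close()
    count
  }
}
```

Taking an `Iterator` rather than a `List` is a choice made for the sketch: it shows that the writer itself never needs the whole collection, the full JsValue tree, or the full JSON string at once.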

This should resolve the OutOfMemory errors on artefact write. Time from publishing in the Rule Manager to receiving the new match in Composer also seems a lot faster, at around 14 seconds in testing, but I've been unable to benchmark this against existing times because the Checker crashes too often on older branches for me to measure them.

The PR should be backwards compatible: it checks first for the new (newline-delimited JSON) artefact and falls back to the old (standard JSON) artefact if the new one is not found.

(We should be able to remove this fallback in the future and read only the new-style artefact.)
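The lookup order can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: `fetch` is a hypothetical function standing in for the S3 client, `typerighter-rules.json` is an assumed name for the old artefact (only `typerighter-rules-seq.json` is named in this description), and parsing each line into a rule is left out.

```scala
import java.io.{BufferedReader, InputStream, InputStreamReader}
import java.nio.charset.StandardCharsets
import scala.io.Source

object ArtefactReadSketch {
  sealed trait Artefact
  // New style: one JSON document per line, exposed lazily so that only
  // one line needs to be in memory at a time.
  final case class LineDelimited(lines: Iterator[String]) extends Artefact
  // Old style: the whole artefact as one JSON document.
  final case class SingleDocument(json: String) extends Artefact

  val newKey = "typerighter-rules-seq.json" // named in the testing notes
  val oldKey = "typerighter-rules.json"     // assumed name for the old artefact

  // `fetch` is hypothetical: it returns None when the key does not exist.
  def readArtefact(fetch: String => Option[InputStream]): Option[Artefact] =
    fetch(newKey).map[Artefact] { in =>
      val reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))
      // Lazy: lines are read as the caller consumes the iterator.
      // The caller is responsible for consuming it fully or closing the stream.
      LineDelimited(
        Iterator.continually(reader.readLine()).takeWhile(_ != null).filter(_.nonEmpty)
      )
    }.orElse {
      // Fallback: read the old artefact in full, as before.
      fetch(oldKey).map { in =>
        try SingleDocument(Source.fromInputStream(in, "UTF-8").mkString)
        finally in.close()
      }
    }
}
```

Modelling the two formats as separate cases keeps the fallback path easy to delete once only the new artefact is in use.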

How to test

  1. Deploy to CODE to check backwards compatibility. For a fair test, make sure the new artefact (typerighter-rules-seq.json) is not present in the Typerighter CODE S3 bucket (you may need to delete it manually from S3 via the AWS console). Does the deploy complete?
  2. Once successfully deployed, create a new regex rule and publish it.
  3. Does it show up in Composer CODE against matching text in a timely manner? (It should).
  4. Does the checker service crash with an OutOfMemory error? (It shouldn't).

How can we measure success?

  1. No OutOfMemory errors on artefact publication
  2. Faster time from publication in the Rule Manager to matches appearing in Composer