Closed rhystmills closed 1 year ago
~This looks good, and on both occasions I've published a new artefact in the manager in CODE, it has also knocked over the checker service with `java.lang.OutOfMemoryError: Java heap space`
(logs) 😢~
~Wonder if reflection has an unexpected impact on our memory footprint here?~
~Another thing to try: swap out or reconfigure the cache with another cache with a shorter or nonexistent TTL. That'd avoid us having to reflect and alter the class whenever we create a new dictionary.~
EDIT: this may be due to another memory-related issue found in https://github.com/guardian/typerighter/pull/449 – disregard!
Co-authored with @jonathonherbert
What does this change?
Changes to dictionary rules (additions, edits, or unpublish actions) don't cause any user-facing changes in the `DictionaryMatcher`, or at least take a long time to surface if they do. For example, creating a new dictionary rule doesn't cause that word to be recognised as valid in the matches sent to Composer by the Checker service. This is due to dictionary caching behaviour in `MorfologikSpeller`, used indirectly as part of LanguageTool.

This PR makes two changes:
- The `JLanguageTool` instance created as part of a new `DictionaryMatcher` is only instantiated after the `collins.dict` binary has been created (i.e. after we call `new SpellDictionaryBuilder().buildDictionary`). Otherwise there's a chance that the matcher will reflect an older version of our dictionary rules corpus.
- The `LoadingCache` from `MorfologikSpeller` is invalidated when a `DictionaryMatcher` is instantiated. The cache is constructed quite deep in the Morfologik class hierarchy, and we don't want to recreate our own versions of all those classes to access it, because we'd have to add thousands of lines to Typerighter and would not benefit from maintenance to their equivalent files in `LanguageTool`. Currently, the dictionary binary is cached for ten minutes, and any changes made to DictionaryRules within those ten minutes won't be reflected in the matches we send to Composer until we re-instantiate the dictionary matcher due to an unrelated rule change after those ten minutes have elapsed. This makes local testing of DictionaryRules especially difficult.

Our cache invalidation is a pretty unpleasant hack using Java reflection, circumventing the `LoadingCache` being a private property of `MorfologikSpeller`. It would be good to raise a PR in LanguageTool in the future to surface a way to invalidate the dictionary cache, and use that mechanism instead of this one, but (for now) this will help us make DictionaryRules editable and reach our KR.

How to test
- In `~/.gu/flexible-composerbackend.properties`, set the following in order to use the local Typerighter Checker rather than CODE: `typerighter.url=https://checker.typerighter.local.dev-gutools.co.uk/`
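
For reviewers less familiar with the reflection approach described under "What does this change?", here is a minimal, self-contained Java sketch of the general technique: reading a private static cache field reflectively and clearing it so the next lookup reloads. It is an illustration only, not the code in this PR. `StandInSpeller`, `DICT_CACHE` and `invalidateCache` are hypothetical names, and a plain `ConcurrentHashMap` stands in for the real cache so the example has no dependencies.

```java
import java.lang.reflect.Field;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

class ReflectiveCacheDemo {

    // Hypothetical stand-in for a library class that hides a static cache.
    // This is NOT the real MorfologikSpeller; the field name is invented.
    static final class StandInSpeller {
        private static final Map<String, String> DICT_CACHE = new ConcurrentHashMap<>();

        // Returns the cached dictionary for a path, calling the loader only on a miss.
        static String load(String path, Supplier<String> loader) {
            return DICT_CACHE.computeIfAbsent(path, p -> loader.get());
        }
    }

    // Reaches the private static field reflectively and empties the cache.
    static void invalidateCache() throws ReflectiveOperationException {
        Field field = StandInSpeller.class.getDeclaredField("DICT_CACHE");
        field.setAccessible(true);                      // bypass the private modifier
        Map<?, ?> cache = (Map<?, ?>) field.get(null);  // null receiver: the field is static
        cache.clear();
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        String first = StandInSpeller.load("collins.dict", () -> "rules-v1");
        // Without invalidation the loader is skipped and the stale value is returned.
        String stale = StandInSpeller.load("collins.dict", () -> "rules-v2");
        invalidateCache();
        // After invalidation the loader runs again and the new value is visible.
        String fresh = StandInSpeller.load("collins.dict", () -> "rules-v2");
        System.out.println(first + " " + stale + " " + fresh); // prints "rules-v1 rules-v1 rules-v2"
    }
}
```

The sketch also shows why ordering matters: the value only refreshes if the cache is cleared after the new data exists and before the next consumer reads it, which mirrors building the dictionary binary before instantiating the matcher.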