uoregon-libraries / newspaper-curation-app

Suite of front- and back-end tools for the curation of digitized newspaper materials
Apache License 2.0

Reduce hits to production servers #342

Open jechols opened 1 month ago

jechols commented 1 month ago

NCA checks the batches.json endpoint, and then every individual batch JSON endpoint, every five minutes. This was meant to ensure newly loaded batches are seen in "real time", but given #310, that isn't especially useful anyway.

If that were the only problem, this wouldn't be a big deal, but on some systems it seems to blast out a ton of DNS requests, making the process exceptionally slow, and in one case actually causing external DNS servers to block us temporarily. (To my knowledge this doesn't affect production, because the lookups there use local DNS servers, but in dev it can be a nightmare.)

A better approach is probably to leave all web issue caching to the once-a-week process that currently does a full refresh of all issue data, then add something to the automation pipelines that adds or removes items from the cache when an automated job succeeds in production. That gets us a less brittle cache that should still mirror what's been loaded or purged.

The tricky part is building something into the issue finder / scanner nonsense that allows us to manually modify the web cache.