Add stale data warnings

The CVS issue in #1458 and other similar issues of API data freshness in the past have always been discovered by chance, and by us paying regular attention to the data. I was a little slow on catching CVS because the system has moved more towards pure maintenance, and that’s a problem. We need some alerts when data ceases to be fresh.

Places to check:

The loader can check all the records it outputs and log issues. I like that this is close to where things happen, and is very immediate. OTOH, it could be too sensitive. Not sure. This also lets us summarize across all the data surfaced by a given source, and at a given time (doing the checks elsewhere might wind up including locations surfaced by that source in the past, but that are no longer surfaced by that source — e.g. this might be a big problem for Prepmod, which has a lot of one-off/two-off events).
The server’s /api/edge/update endpoint could check every record it receives. We already have a freshness timeframe encoded in the server, and this puts these freshness checks in a nearby place. This is also just as immediate as the loader. OTOH, the server doesn’t know about a whole loader run, so this can’t summarize across a source (other than keeping a running average or something).
A scheduled job (probably 1–4×/day) could calculate statistics across the whole database. It lets us do lots of summary statistics, but can't tell whether an availability record is stale because the source no longer outputs info for that location. It also needs configuration or coding to know which sources to check (since some are deprecated and no longer in use). However, it can catch issues caused by the loader never running or never surfacing any data at all for a given source, which the other approaches would miss.
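As a rough sketch of the one thing only the scheduled job can catch (a source that has gone completely silent), something like the following could work. Everything here is an assumption for illustration: `findSilentSources`, the `LatestBySource` shape, and the idea that we can cheaply get the newest `checked_at` per source from the database are not existing code.

```typescript
// Hypothetical input: the newest checked_at timestamp seen for each source,
// e.g. the result of a "max(checked_at) grouped by source" style query.
type LatestBySource = Record<string, string | undefined>;

// Returns the active sources that have no data at all, or whose newest
// record was checked longer ago than `maxSilenceHours`.
function findSilentSources(
  activeSources: string[],
  latest: LatestBySource,
  now: Date,
  maxSilenceHours: number
): string[] {
  const cutoff = now.getTime() - maxSilenceHours * 60 * 60 * 1000;
  return activeSources.filter((source) => {
    const last = latest[source];
    // No entry means the loader never surfaced anything for this source.
    return last === undefined || Date.parse(last) < cutoff;
  });
}
```

The list of active sources is passed in explicitly, which is the "configuration or coding" cost mentioned above: deprecated sources just get left out of that list.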

What constitutes “stale”?

<record>.availability.valid_at is vastly older than <record>.availability.checked_at or the current time.
<record>.availability.slots or <record>.availability.capacity are present, but all the entries are in the past.
⚠️ Do we have at least one of these for every source currently in use?
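To make the two criteria above concrete, here is a minimal sketch of a per-record check. The interfaces are simplified guesses at the record shape (I'm assuming slots have a `start` date-time and capacity entries have a `date`), and `staleReasons` plus the 24-hour default are hypothetical, not something that exists in the loader or server today.

```typescript
// Hypothetical, simplified shapes; the real types may differ.
interface Slot { start: string }        // ISO 8601 date-time
interface CapacityDay { date: string }  // YYYY-MM-DD
interface Availability {
  valid_at?: string;
  checked_at?: string;
  slots?: Slot[];
  capacity?: CapacityDay[];
}

const HOUR_MS = 60 * 60 * 1000;

// Returns the reasons a record looks stale (an empty list means it looks fresh).
function staleReasons(
  availability: Availability,
  now: Date,
  maxAgeHours = 24 // assumed threshold, not a project setting
): string[] {
  const reasons: string[] = [];

  // Criterion 1: valid_at is much older than checked_at (or now, if absent).
  if (availability.valid_at) {
    const reference = availability.checked_at
      ? Date.parse(availability.checked_at)
      : now.getTime();
    const ageHours = (reference - Date.parse(availability.valid_at)) / HOUR_MS;
    if (ageHours > maxAgeHours) reasons.push("valid_at too old");
  }

  // Criterion 2a: slots are present, but every one starts in the past.
  const slots = availability.slots ?? [];
  if (slots.length > 0 && slots.every((s) => Date.parse(s.start) < now.getTime())) {
    reasons.push("all slots in past");
  }

  // Criterion 2b: capacity is present, but every day is before today.
  // Simplification: "today" is the UTC date, not the location's local date.
  const capacity = availability.capacity ?? [];
  const today = now.toISOString().slice(0, 10);
  if (capacity.length > 0 && capacity.every((c) => c.date < today)) {
    reasons.push("all capacity in past");
  }

  return reasons;
}
```

Returning reasons instead of a boolean would let us see which criterion each source actually trips, which seems useful for answering the ⚠️ question above.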

How do we track it?

Log a warning to Sentry. This is pretty binary, and adjusting the threshold means code changes and re-deployment.
- In the server, we have a threshold of 7 days for whether records are fresh: https://github.com/usdigitalresponse/univaf/blob/26212136b89380971fdcf35251fbdef7016465da/server/src/db.ts#L28-L33
Track freshness (days/hours/seconds out of date) as a metric in Datadog. Then we can just adjust the threshold for alarms without deploying anything. We can also track different summary stats by source (average, max, min).
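For the Datadog option, a sketch of the per-source summary might look like this. The metric name `univaf.freshness.age_seconds` and both functions are made up for illustration. The output lines follow the DogStatsD gauge format (`name:value|g|#tag:value`); actually sending them (or using a client library instead) is left out.

```typescript
interface FreshnessSample { source: string; ageSeconds: number }
interface Summary { min: number; max: number; avg: number }

// Groups record ages by source and computes min/max/average for each.
function summarizeBySource(samples: FreshnessSample[]): Record<string, Summary> {
  const grouped: Record<string, number[]> = {};
  for (const { source, ageSeconds } of samples) {
    const ages = grouped[source] ?? [];
    ages.push(ageSeconds);
    grouped[source] = ages;
  }

  const result: Record<string, Summary> = {};
  for (const source of Object.keys(grouped)) {
    const ages = grouped[source] ?? [];
    const total = ages.reduce((sum, age) => sum + age, 0);
    result[source] = {
      min: Math.min(...ages),
      max: Math.max(...ages),
      avg: total / ages.length,
    };
  }
  return result;
}

// Formats each stat as a DogStatsD gauge line tagged with its source.
// The metric name is a placeholder, not an existing metric.
function toDogStatsdLines(summaries: Record<string, Summary>): string[] {
  const lines: string[] = [];
  for (const source of Object.keys(summaries)) {
    const summary = summaries[source];
    if (!summary) continue;
    for (const stat of ["min", "max", "avg"] as const) {
      lines.push(`univaf.freshness.age_seconds.${stat}:${summary[stat]}|g|#source:${source}`);
    }
  }
  return lines;
}
```

Because only raw ages are reported, the alarm threshold lives in a Datadog monitor rather than in code, which is the whole appeal over the Sentry approach.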

Of existing, actively used sources, can we get freshness info (and how)?

njvss: Not supplied right now, but there’s a Last-Modified header on the underlying data we can use (and a TODO comment in the source about using it 🙃 ).
waDoh: Yes (valid_at)
cvsSmart: Yes (valid_at, capacity)
walgreensSmart: Yes (valid_at)
krogerSmart: Yes (valid_at, capacity) ⚠️ looking into this revealed that there are some real problems with the dates of listed slots. I pinged devs there and at vaccines.gov on FHIR chat: https://chat.fhir.org/#narrow/stream/281612-smart.2Fscheduling-links/topic/Publisher.3A.20Kroger/near/355225002 .
albertsons: NO ❌ ⚠️ but it turns out the entire API we were using is abandoned and the data is stale; this source needs a rewrite, see #1460
hyvee: NO ❌ On the upside, we are querying a GraphQL API for their booking site, so we can be reasonably confident it won’t be stale, since it’s not a separate data source for people like us or vaccines.gov.
heb: Yes (valid_at)
cdcApi: Yes (valid_at)
riteAidScraper: Yes (slots)
riteAidSmart: Yes (capacity) Could also get valid_at from the SMART manifest’s transactionTime field, although that is what we did for Kroger, and it led us to a false sense of security.
riteAidApi: Yes (valid_at, capacity)
prepmod: Yes (valid_at, slots)
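For njvss, turning the `Last-Modified` header into a `valid_at` could be as small as the sketch below. `validAtFromLastModified` is a hypothetical helper, not the code behind the TODO, and it assumes the header uses the standard HTTP date format.

```typescript
// Converts a Last-Modified header value into an ISO 8601 timestamp.
// Returns undefined if the header is missing or cannot be parsed, so the
// caller can fall back to not setting valid_at at all.
function validAtFromLastModified(
  headerValue: string | null | undefined
): string | undefined {
  if (!headerValue) return undefined;
  const timestamp = Date.parse(headerValue);
  return isNaN(timestamp) ? undefined : new Date(timestamp).toISOString();
}

// Example with a standard HTTP date:
// validAtFromLastModified("Wed, 21 Oct 2015 07:28:00 GMT")
//   returns "2015-10-21T07:28:00.000Z"
```

With a fetch-style response, the input would be something like `response.headers.get("last-modified")`.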

So this is mostly good except:

Wonkiness in what Kroger is surfacing may signal deeper issues. But we can get freshness, even if it will start triggering alerts.
No Hyvee support, but it's not one we're really worried about.
Albertsons’ API is entirely abandoned, which is totally fixable (see #1460).

usdigitalresponse / univaf

Add stale data warnings #1459
