Closed finestructure closed 4 weeks ago
Packages can be analysed without errors locally on a db dump from today.
Manually analysing packages in prod also clears them from the CHECK_MON_001
list. Unclear what made them stick.
Ok, mystery solved. There are two problems here. The alerting query is:
SELECT
r.owner,
r.name AS "repository",
p.status,
p.processing_stage,
r.updated_at
FROM
repositories r
JOIN packages p ON r.package_id = p.id
WHERE
r.updated_at < now() - INTERVAL \(literal: "\(timePeriod.hours) hours")
ORDER BY
updated_at
with timePeriod.hours set to 4. We're incorrectly looking at r.updated_at (the repositories.updated_at field) when we should be using p.updated_at. Not every analysis pass updates the repository (it only does so when there are repo changes), but we explicitly update the package with the processing status.
Also, 4 hours is a bit too short now. It takes ~4h 10min to go around once, so we'll want to bump that to 5h for the alert.
Actually, checkMon001TimePeriod is already set to 6. I only had records showing up in my local testing because I incorrectly set it to 4 when running the equivalent query. The only issue is the updated_at field.
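To illustrate why the choice of column matters, here is a minimal sketch that reproduces the false positive against an in-memory SQLite database. It is not the production query: the schema is trimmed to the columns used above, the sample rows and the stale_packages helper are hypothetical, and the cutoff is passed as a parameter because SQLite has no Postgres-style INTERVAL.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def stale_packages(conn, now, hours, column):
    """Return (owner, repository) rows whose `column` is older than `hours`."""
    # Only allow the two columns discussed in this thread (hypothetical helper).
    if column not in ("r.updated_at", "p.updated_at"):
        raise ValueError(f"unexpected column: {column}")
    cutoff = (now - timedelta(hours=hours)).isoformat()
    query = f"""
        SELECT r.owner, r.name
        FROM repositories r
        JOIN packages p ON r.package_id = p.id
        WHERE {column} < ?
        ORDER BY {column}
    """
    return conn.execute(query, (cutoff,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE packages (
        id INTEGER PRIMARY KEY, status TEXT,
        processing_stage TEXT, updated_at TEXT);
    CREATE TABLE repositories (
        id INTEGER PRIMARY KEY, package_id INTEGER,
        owner TEXT, name TEXT, updated_at TEXT);
""")

# Fixed "now" so the example is deterministic.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)


def ts(hours_ago):
    """ISO timestamp `hours_ago` hours before `now` (sorts correctly as text)."""
    return (now - timedelta(hours=hours_ago)).isoformat()


# Package 1 was analysed 1h ago, but its repository has not changed in 3 days.
# Package 2 is genuinely stuck: neither row has been touched in 8 days.
conn.executemany(
    "INSERT INTO packages VALUES (?, ?, ?, ?)",
    [(1, "ok", "analysis", ts(1)), (2, "ok", "ingestion", ts(192))],
)
conn.executemany(
    "INSERT INTO repositories VALUES (?, ?, ?, ?, ?)",
    [(1, 1, "owner-a", "healthy", ts(72)), (2, 2, "owner-b", "stuck", ts(192))],
)

by_repo = stale_packages(conn, now, 6, "r.updated_at")
by_package = stale_packages(conn, now, 6, "p.updated_at")
print("r.updated_at:", by_repo)     # also flags the healthy package
print("p.updated_at:", by_package)  # only the stuck package
```

Filtering on r.updated_at flags both rows, including the package that was analysed an hour ago, while filtering on p.updated_at flags only the one that is actually stuck.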
I accidentally committed this change to my local main and we didn't have branch protection. I've enabled branch protection and will redo this change via a revert + change PR.
Ok, branch protection doesn't seem to be working, not sure why.
I've now also ticked the second box here, hoping that'll prevent pushes to main.
We have 363 packages failing the CHECK_MON_001 alert, with the oldest updated_at being a week old. All packages except one are sitting in the ingestion stage. They must be erroring out in analysis, preventing them from being updated.