Currently, we bail out on a scrape if we've ever seen the URL before, regardless of whether we scraped successfully. That's because we're using the unrestrictive `getMostRecentScrapeTime()` (which doesn't care if the scrape succeeded) rather than the restrictive `getMostRecentSuccessfulScrapeTime()`.

There's a bit of an argument for the current behavior: without a configured safety guard like `SCRAPE_DAY_HORIZON`, we would repeatedly check broken links in perpetuity. But we do have that guard, and it would be good to re-check broken links in case they were only temporarily broken.

This idea was surfaced in https://github.com/TechAndCheck/tech-and-check-alerts/issues/277#issuecomment-591689187 and discussed/approved in Slack.
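A minimal sketch of the distinction being proposed. The helper names come from this issue, but the data shapes, the `shouldScrape` wrapper, and the exact semantics of `SCRAPE_DAY_HORIZON` below are assumptions for illustration, not the project's actual implementation:

```javascript
// Hypothetical data shape: each scrape record has a timestamp and a success flag.
const SCRAPE_DAY_HORIZON = 30 // assumed guard: stop re-checking after N days
const DAY_MS = 24 * 60 * 60 * 1000

// Unrestrictive: most recent scrape of any outcome (current behavior).
const getMostRecentScrapeTime = scrapes =>
  scrapes.length ? Math.max(...scrapes.map(s => s.time)) : null

// Restrictive: most recent *successful* scrape (the behavior proposed here).
const getMostRecentSuccessfulScrapeTime = (scrapes) => {
  const successes = scrapes.filter(s => s.success)
  return successes.length ? Math.max(...successes.map(s => s.time)) : null
}

// With the restrictive helper, a URL whose only prior scrapes failed stays
// eligible for re-checking, while SCRAPE_DAY_HORIZON still bounds how long
// we keep trying.
const shouldScrape = (scrapes, now = Date.now()) => {
  if (getMostRecentSuccessfulScrapeTime(scrapes) !== null) return false
  if (!scrapes.length) return true // never seen this URL before
  const firstSeen = Math.min(...scrapes.map(s => s.time))
  return (now - firstSeen) / DAY_MS <= SCRAPE_DAY_HORIZON
}
```

Under this sketch, a temporarily broken link gets retried on later runs instead of being skipped forever, and permanently broken links age out of the retry pool once they pass the horizon.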