Open victorlin opened 2 weeks ago
This seems like a transient network error
FWIW, I did see these types of errors occasionally (locally) while I was working on correcting links across the various repos.
Thanks for creating the issue; if this happens frequently, I'll handle the split/continue-on-error changes.
Documenting another occurrence:
(installation/installation: line 9) broken http://www.microbesonline.org/fasttree/ - 403 Client Error: Forbidden for url: http://www.microbesonline.org/fasttree/
(releases/changelog: line 646) broken https://github.com/nextstrain/augur/pull/1033 - 502 Server Error: Bad Gateway for url: https://github.com/nextstrain/augur/pull/1033
(releases/changelog: line 642) broken https://github.com/nextstrain/augur/pull/1034 - 502 Server Error: Bad Gateway for url: https://github.com/nextstrain/augur/pull/1034
(releases/changelog: line 626) ok https://github.com/nextstrain/augur/pull/1070
(releases/changelog: line 598) broken https://github.com/nextstrain/augur/pull/1039 - 502 Server Error: Bad Gateway for url: https://github.com/nextstrain/augur/pull/1039
(releases/changelog: line 643) broken https://github.com/nextstrain/augur/pull/1042 - 502 Server Error: Bad Gateway for url: https://github.com/nextstrain/augur/pull/1042
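For triaging output like the lines above, here is a rough Python sketch that separates 5xx responses (likely transient) from everything else. The regex is inferred only from the log lines quoted in this thread, not from any documented linkcheck format, and `triage` is a hypothetical helper name.

```python
import re

# Assumed line shape, taken from the excerpts in this thread:
# "(doc/name: line 646) broken https://example - 502 Server Error: ..."
LINE_RE = re.compile(
    r"^\((?P<doc>[^:]+): line\s+(?P<line>\d+)\)\s+"
    r"(?P<status>\w+)\s+(?P<url>\S+)"
    r"(?:\s+-\s+(?P<info>.*))?$"
)


def triage(log_lines):
    """Split broken links into likely-transient (5xx) and needs-review lists."""
    transient, review = [], []
    for raw in log_lines:
        m = LINE_RE.match(raw.strip())
        if not m or m["status"] != "broken":
            continue  # skip "ok" lines and anything that does not parse
        # The status code, when present, leads the message after " - ".
        code_match = re.match(r"(\d{3})\b", m["info"] or "")
        code = int(code_match.group(1)) if code_match else None
        entry = (m["doc"], int(m["line"]), m["url"], code)
        if code is not None and 500 <= code <= 599:
            transient.append(entry)
        else:
            review.append(entry)  # 4xx or unknown: a human should look
    return transient, review
```

Treating every 5xx as transient is a simplification; a link that returns 502 on every run would still need attention.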
And another, twice in a row.
BOOOOOO.
I will pick this up and make it continue-on-error: true in the next work cycle.
I'm wondering if continue-on-error: true is the right solution here. With this setting as-is, "real" linkcheck issues are likely to go unnoticed.
On the other hand, with something like the CI failures we have currently or mainmatter/continue-on-error-comment, I'm worried that it could be unnecessarily noisy given the high rate of these failures lately (I've seen many in Augur, but I'm no longer linking them back here).
Some alternatives:
Don't run linkcheck in CI, but instead on a weekly schedule with retries + cooldown periods in between each try. This would reduce the impact of transient network failures while making sure links are valid.
I realize this comment is coming a bit late, but it's longer-term thinking. continue-on-error: true should be good to reduce CI failures in the short term.
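To make the retries + cooldown idea concrete, here is a minimal Python sketch using only the standard library. It is not how linkcheck itself works: `http_status` and `check_with_retries` are hypothetical names, the defaults (3 attempts, 30 seconds) are placeholders, and sending `HEAD` requests is an assumption that some servers may reject.

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout=10.0):
    """Return the HTTP status code for url, or None on a network-level failure."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with 4xx/5xx
    except OSError:
        return None  # DNS failure, timeout, connection reset, ...


def check_with_retries(url, attempts=3, cooldown=30.0,
                       fetch=http_status, sleep=time.sleep):
    """Return True if url answers with a status below 400 within `attempts` tries.

    5xx responses and network errors are retried after `cooldown` seconds;
    4xx responses are treated as real failures and are not retried.
    """
    for attempt in range(1, attempts + 1):
        status = fetch(url)
        if status is not None and status < 400:
            return True
        if status is not None and status < 500:
            return False
        if attempt < attempts:
            sleep(cooldown)  # wait before the next try
    return False
```

`fetch` and `sleep` are parameters only so the retry logic can be exercised without network access or real waiting.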
I generally agree with @victorlin here.
I've just run into this error on an Augur PR which did not change any docs links:
This seems like a transient network error, which shows up as a failing check ❌ on the PR. That confused me at first.
make linkcheck is a recent addition (#104), so it's hard to tell how often we will run into this. If it happens often, it might be worth splitting linkcheck into a separate job on docs-ci and using continue-on-error: true.