Closed tsibley closed 2 weeks ago
CI failure because:
bash: line 1: inkscape: command not found
Traceback (most recent call last):
File "/home/runner/work/status/status/./favicon-[layer].svg.py", line 6, in <module>
doc = ET.parse(stdin)
File "/usr/lib/python3.10/xml/etree/ElementTree.py", line 1222, in parse
tree.parse(source, parser)
File "/usr/lib/python3.10/xml/etree/ElementTree.py", line 580, in parse
self._root = parser._parse_whole(source)
xml.etree.ElementTree.ParseError: no element found: line 1, column 0
make: *** [Makefile:18: build/favicon-success.svg] Error 1
make: *** Deleting file 'build/favicon-success.svg'
…which is a whoops! on my part. I didn't mean to have the generation actually be part of the regular build, but as a way to update committed files. But I forgot that part! Will fix.
I suspect this will make us want to clean up old lingering failed runs from the status page… ;)
Would it make sense to limit the overall success/failure to the last ~24-48 hours? I've ignored the old lingering failures if the next automated run completed successfully. Seems like a waste to re-run workflows just for the ✅
Fixed up, but uh, what now GitHub?
Would it make sense to limit the overall success/failure to the last ~24-48 hours? I've ignored the old lingering failures if the next automated run completed successfully. Seems like a waste to re-run workflows just for the ✅
It's only considering the latest run for each workflow. The issue is we have workflows where the latest run is failed and quite old, because it hasn't been re-run since (somewhat by design).
Oh, maybe you meant to only consider the latest run for each workflow, iff it's within the last 24-48 hours?
Oh, maybe you meant to only consider the latest run for each workflow, iff it's within the last 24-48 hours?
Yup! For example, there's an mpox ingest failure from Sun Nov 03 that I've ignored because the subsequent days are successful
Right, but that's already ignored by the "last run" consideration. It doesn't require a 24-48 hour recency condition. I'm talking about things like this:
Maybe it's not clear that
is only considering the last run of each workflow in each repo?
Like, maybe I should have written:
const someFailure =
runsByRepoAndWorkflow
.flatMap(([, workflows]) =>
workflows.map(([, [lastRun, ]]) =>
lastRun.conclusion ?? lastRun.status))
.some(x => x === "failure");
or
const someFailure =
runsByRepoAndWorkflow
.flatMap(([, workflows]) =>
workflows.map(([, runs]) =>
runs[0].conclusion ?? runs[0].status))
.some(x => x === "failure");
Right, but that's already ignored by the "last run" consideration.
Ah gotcha, yeah, totally missed this was only considering the last run!
However, the lassa example is an interesting case...I'm pretty sure it was just a manual test run that was later fixed for the automated run.
Yeah. The lassa example will also fall off the 90 day cliff
on 24 Dec.
I will say that in general, I find the Ingest / Phylo / Ingest to Phylo pattern a bit weird (though I understand why it's that way currently).
Like, maybe I should have written…
Repushed with the second version to hopefully make it clearer.
Since merging there's been two workflow runs with the inkscape error, but other runs completed successfully.
https://github.com/nextstrain/status/actions/runs/11821865530/job/32937840090:
< favicon.svg \
inkscape --pipe --export-plain-svg --export-area-page --export-filename - \
| ./favicon-'[layer]'.svg.py success \
> favicon-success.svg
bash: line 1: inkscape: command not found
Traceback (most recent call last):
File "/home/runner/work/status/status/./favicon-[layer].svg.py", line 7, in <module>
doc = ET.parse(stdin)
File "/usr/lib/python3.10/xml/etree/ElementTree.py", line 1222, in parse
tree.parse(source, parser)
File "/usr/lib/python3.10/xml/etree/ElementTree.py", line 580, in parse
self._root = parser._parse_whole(source)
xml.etree.ElementTree.ParseError: no element found: line 1, column 0
make: *** [Makefile:18: favicon-success.svg] Error 1
make: *** Deleting file 'favicon-success.svg'
https://github.com/nextstrain/status/actions/runs/11822605189/job/32940011673:
< favicon.svg \
inkscape --pipe --export-plain-svg --export-area-page --export-filename - \
| ./favicon-'[layer]'.svg.py failure \
> favicon-failure.svg
bash: line 1: inkscape: command not found
Traceback (most recent call last):
File "/home/runner/work/status/status/./favicon-[layer].svg.py", line 7, in <module>
doc = ET.parse(stdin)
File "/usr/lib/python3.10/xml/etree/ElementTree.py", line 1222, in parse
tree.parse(source, parser)
File "/usr/lib/python3.10/xml/etree/ElementTree.py", line 580, in parse
self._root = parser._parse_whole(source)
xml.etree.ElementTree.ParseError: no element found: line 1, column 0
make: *** [Makefile:18: favicon-failure.svg] Error 1
make: *** Deleting file 'favicon-failure.svg'
There's been a mix of successful and failure runs...why does make sometimes recognize the committed favicon files but will run the favicon-%.svg
rule other times?
why does make sometimes recognize the committed favicon files but will run the favicon-%.svg rule other times?
Hmm, is there a chance that this is due to the timing/order in which the files are downloaded from the repository during actions/checkout
? If favicon.svg
was downloaded one second after the favicon-*.svg
files, then make sees the favicon-*.svg
file as outdated and runs the rule.
Should we update the make command to ensure that the favicon-%.svg
rule never runs during CI?
make -o favicon.svg -o 'favicon-[layer].svg.py'
Ugh. Maybe? Good spotting.
Useful for seeing the state of things at a glance. Inspired by GitHub's similar behaviour.
For now, the "success" indicator doesn't mean nothing's amiss (e.g. that recent runs have completed, that nothing's been cancelled, etc.), just that nothing's failed. We may want to make this more nuanced in the future, but that requires additional favicon states that I'm not inclined to produce right now.
favicon.svg is now a (Inkscape) source file that's no longer copied as-is but instead used to produce multiple build/favicon-*.svg outputs by selective inclusion of layers. The sparse index page gets a "?" indicator in the favicon to suggest "status" without suggesting "all's well" or "something's wrong". There's probably some better indicator here, but it seems good enough for now.
I suspect this will make us want to clean up old lingering failed runs from the status page… ;)
Checklist