cirruslabs / cirrus-ci-agent

Agent to execute Cirrus CI tasks
Mozilla Public License 2.0

Fail gracefully when retrieving metrics #196

Closed by edigaryev 2 years ago

edigaryev commented 2 years ago

The agent should fail gracefully when retrieving metrics, to avoid intermittent errors like:

Exception on the agent! Failed to retrieve resource utilization metrics: failed to query CPU usage: open /sys/fs/cgroup/cpu,cpuacct/system.slice/google-startup-scripts.service/cpuacct.usage: no such file or directory

The file was indeed missing for a short period of time, but for the rest of the execution everything was OK.

edigaryev commented 2 years ago

I've changed the title a bit because I've realized that we don't need to re-probe the whole metrics subsystem to fix the error above. The cgroup version (or the lack of cgroups) is usually determined at system boot time and is unlikely to change, so simply failing gracefully should be enough.