Closed simondeziel closed 2 years ago
At the very least we need to improve the quality of the errors here.
@tomp, it happened again (no surprise) but I thought this was worth capturing:
# I see the problematic message: msg="Failed to get disk stats" err="unexpected EOF"
root@xeon:~# grep -n . /sys/fs/cgroup/lxc.payload.*/io.stat
/sys/fs/cgroup/lxc.payload.metrics/io.stat:1:8:0 8:16 rbytes=352256 wbytes=0 rios=10 wios=0 dbytes=0 dios=0
/sys/fs/cgroup/lxc.payload.metrics/io.stat:2:7:1 rbytes=206848 wbytes=0 rios=10 wios=0 dbytes=0 dios=0
/sys/fs/cgroup/lxc.payload.metrics/io.stat:3:7:2 rbytes=2048 wbytes=0 rios=1 wios=0 dbytes=0 dios=0
root@xeon:~# lxc restart metrics
root@xeon:~# sleep 30
root@xeon:~# grep -n . /sys/fs/cgroup/lxc.payload.*/io.stat
root@xeon:~#
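The first io.stat line in the output above carries two device numbers (8:0 8:16) before its counters, which may be what trips the parser, though that is only a guess from this output. Below is a minimal sketch for spotting such lines by hand. `check_io_stat` is a hypothetical helper, not part of LXD or LXC, and the "MAJ:MIN key=value ..." layout it checks is an assumption based on the lines shown here.

```shell
#!/bin/sh
# Hypothetical helper (not part of LXD): report io.stat lines that do not
# follow the assumed "MAJ:MIN key=value ..." layout.
# Reads the files given as arguments, or stdin when none are given.
check_io_stat() {
    awk '
    NF == 0 { next }    # ignore empty lines
    {
        # Field 1 must be a device number such as 8:0.
        if ($1 !~ /^[0-9]+:[0-9]+$/) {
            printf "line %d: bad device number: %s\n", NR, $1
            next
        }
        # Every remaining field must look like key=value with a numeric value.
        for (i = 2; i <= NF; i++) {
            if ($i !~ /^[A-Za-z_.]+=[0-9.]+$/) {
                printf "line %d: device %s: unexpected field: %s\n", NR, $1, $i
                next
            }
        }
    }' "$@"
}
```

Run as `check_io_stat /sys/fs/cgroup/lxc.payload.metrics/io.stat`, it prints nothing for a well-formed file. For the three lines captured above it would report `line 1: device 8:0: unexpected field: 8:16`.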
Required information
root@xeon:~# snap list lxd
Name  Version        Rev    Tracking      Publisher   Notes
lxd   5.0.0-b0287c1  22923  5.0/stable/…  canonical✓  -
Aug 3 07:00:07 xeon lxd.daemon[1313]: time="2022-08-03T07:00:07Z" level=warning msg="Failed to get disk stats" err="unexpected EOF" instance=metrics instanceType=container project=default
Aug 3 07:00:22 xeon lxd.daemon[1313]: time="2022-08-03T07:00:22Z" level=warning msg="Failed to get disk stats" err="unexpected EOF" instance=metrics instanceType=container project=default
Aug 3 07:00:37 xeon lxd.daemon[1313]: time="2022-08-03T07:00:37Z" level=warning msg="Failed to get disk stats" err="unexpected EOF" instance=metrics instanceType=container project=default
...
root@xeon:~# lxc config show --expanded metrics
architecture: x86_64
config:
  image.architecture: amd64
  image.description: Ubuntu focal amd64 (20220113_07:42)
  image.os: Ubuntu
  image.release: focal
  image.serial: "20220113_07:42"
  image.type: squashfs
  image.variant: default
  limits.cpu.allowance: 100%
  limits.memory: 512MiB
  limits.processes: "500"
  security.devlxd: "false"
  security.idmap.isolated: "true"
  security.nesting: "true"
  security.privileged: "false"
  security.protection.delete: "true"
  security.syscalls.deny_compat: "true"
  snapshots.expiry: 3d
  snapshots.schedule: '@daily, @startup'
  volatile.base_image: cd37dfe79d6edd4ab36943f5ca4226d47280285772bb457b622bbcec92fe350f
  volatile.cloud-init.instance-id: 9220ad48-38c7-42de-93e0-a9fd21046d1c
  volatile.eth0.host_name: vethba646770
  volatile.eth0.hwaddr: 00:16:3e:bb:f5:f6
  volatile.eth0.name: eth0
  volatile.idmap.base: "1131072"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1131072,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":1131072,"Nsid":0,"Maprange":65536}]'
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1131072,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":1131072,"Nsid":0,"Maprange":65536}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":1131072,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":1131072,"Nsid":0,"Maprange":65536}]'
  volatile.last_state.power: RUNNING
  volatile.uuid: 7760acd1-e480-483e-949d-95b4d43cdd2d
devices:
  eth0:
    network: int
    type: nic
  prometheus:
    path: /var/snap/prometheus/common/
    pool: default
    source: prometheus
    type: disk
  root:
    path: /
    pool: default
    size: 4GiB
    type: disk
ephemeral: false
profiles:
That container is one of many on that server and other containers also have volumes attached to them:
The only unusual thing about metrics is that it runs snapd and has some snaps installed inside it.
In the above, the sda, sdb and sdc devices are leaked from the host :/
Comparing the cgroup files for metrics with those of another container (squid), we see that io.stat is populated only for metrics:
Only metrics has content in its io.stat file: