vectordotdev / vector

A high-performance observability data pipeline.
https://vector.dev
Mozilla Public License 2.0

Host_metrics permission denied on /run/docker/netns #21598

Open pikeas opened 1 week ago

pikeas commented 1 week ago

Problem

ERROR source{component_kind="source" component_id=host component_type=host_metrics}: vector::internal_events::host_metrics: Failed to load partitions info. mount_point="/run/docker/netns/ingress_sbox" error=FFI function "statvfs" call failed: Permission denied (os error 13) error_type="reader_failed" stage="receiving" internal_log_rate_limit=true

Vector is running on the host and is in the Docker group:

$ grep vector /etc/group
adm:x:4:syslog,vector
systemd-journal:x:999:vector
docker:x:988:vector
vector:x:987:

Files in /run/docker/netns are owned by root:

$ ls -l /run/docker/netns/
total 0
-r--r--r-- 1 root root 0 Oct 21 17:42 <random ID>
-r--r--r-- 1 root root 0 Oct 21 17:42 <random ID>
-r--r--r-- 1 root root 0 Oct 21 17:42 ingress_sbox

IIUC, Vector shouldn't be trying to read these files. They are also size 0, so there's nothing inside of them to read.

Version

0.41.1

jszwedko commented 1 week ago

Can you show the output of mount on the container? It seems like /run/docker/netns/ingress_sbox is a mount point, in which case I'd expect statvfs to be called on it 🤔

You can see a similar warning for container filesystems, including a workaround, mentioned here: https://vector.dev/docs/reference/configuration/sources/host_metrics/#warnings. I think you could also use the filesystem.mountpoints.excludes option to skip generating filesystem statistics for that mount point.

pikeas commented 1 week ago

> Can you show the output of mount on the container?

Vector is running directly on the host, not in a container.

jszwedko commented 6 days ago

> > Can you show the output of mount on the container?
>
> Vector is running directly on the host, not in a container.

Ah, gotcha. Could you provide the output of mount from the host then?

pikeas commented 6 days ago

$ mount | grep netns
nsfs on /run/docker/netns/ingress_sbox type nsfs (rw)
nsfs on /run/docker/netns/[random id] type nsfs (rw) (a dozen of these)

This is a standard Docker Swarm cluster, so these were presumably created by Docker generally or by the Swarm agent specifically.

The files are owned by root but chmodded 444, so they are world-readable. Is Vector either trying to write to them or trying to cd into them as if they were directories?

jszwedko commented 6 days ago

Thanks! Vector, via https://github.com/heim-rs/heim, is attempting to run statvfs on those mount points to generate filesystem statistics, and failing with the permission error you saw above. I'd suggest using the excludes rules to configure Vector not to generate filesystem statistics for those mount points. Something like:

filesystem:
  mountpoints:
    excludes:
      - /run/docker/*
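
To see which mount points would trigger the same error before writing the excludes list, the statvfs call can be reproduced outside of Vector. The following is a minimal sketch in Python, not Vector's or heim's actual code; the function names read_mount_points and probe_statvfs are made up for illustration, and it assumes a Linux host where /proc/mounts is available.

```python
import os


def read_mount_points(mounts_file="/proc/mounts"):
    """Return the mount points listed in a Linux mounts table."""
    points = []
    with open(mounts_file, encoding="utf-8") as fh:
        for line in fh:
            fields = line.split()
            if len(fields) >= 2:
                # The second field is the mount point; spaces are escaped as \040.
                points.append(fields[1].replace("\\040", " "))
    return points


def probe_statvfs(mount_points):
    """Call statvfs on each path and map it to None (success) or an error string."""
    results = {}
    for path in mount_points:
        try:
            os.statvfs(path)
            results[path] = None
        except OSError as err:  # PermissionError is a subclass of OSError
            results[path] = f"{type(err).__name__}: {err.strerror}"
    return results


if __name__ == "__main__":
    # Print only the mount points where statvfs fails for the current user.
    for path, error in probe_statvfs(read_mount_points()).items():
        if error is not None:
            print(f"{path}: {error}")
```

Running the script as the same user that Vector runs as (the vector user in this thread) should list the mount points where statvfs is refused for that user. Those paths are candidates for the excludes list.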