Closed ty-elastic closed 3 months ago
From @michalpristas initial look at this problem:
uid and gid should be configurable in the agent spec, but assuming nobody is using this, it gets the effective UID and GID from the agent. When the agent is unpacked, all ownership is set to 0,0, so everything should be owned by root. It may be that when the agent is running under the elastic-agent user, permissions on the unpacked (beats) files are somehow messed up, because the owner of a file and the user it's running as are not aligned. Changing the owner of unpacked installations to 0,0 seems like the proper thing to do.
We have another report of something similar happening with the files used for Beat lightweight modules when run on k8s, producing log messages like:
{"log.level":"error","@timestamp":"2022-10-14T17:03:00.903Z","log.logger":"registry.lightmodules","log.origin":{"file.name":"mb/lightmodules.go","file.line":147},"message":"Failed to list light metricsets for module uwsgi: getting metricsets for module 'uwsgi': loading light module 'uwsgi' definition: loading module configuration from '/usr/share/elastic-agent/data/elastic-agent-d3eb3e/install/metricbeat-8.4.2-linux-x86_64/module/uwsgi/module.yml': config file (\"/usr/share/elastic-agent/data/elastic-agent-d3eb3e/install/metricbeat-8.4.2-linux-x86_64/module/uwsgi/module.yml\") must be owned by the user identifier (uid=0) or root","service.name":"metricbeat","ecs.version":"1.6.0"}
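The error above is raised when a config file's owner does not match the expected UID. A minimal sketch for listing such files from inside the container, assuming a POSIX `find` is available there; the helper name `list_not_owned_by` is hypothetical, and the path in the usage comment is taken from the log line.

```shell
#!/bin/sh
# List regular files under a directory that are NOT owned by the given numeric UID.
# Hypothetical diagnostic helper; it changes nothing on disk.
list_not_owned_by() {
    uid="$1"
    dir="$2"
    # find treats a decimal argument to -user as a UID when no user has that name.
    find "$dir" -type f ! -user "$uid"
}

# Example (inside the agent container):
#   list_not_owned_by 0 /usr/share/elastic-agent/data
```

An empty result means every file under the directory is owned by the given UID.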
Bumping this from 8.7 to 8.6
I worked around this by adding a postStart hook to my daemonset and chowning osquerybeats.
@jmbass can you please provide the postStart sample you used? We might need to provide this workaround to one of our customers.
@gizas On the daemonset/deployment template spec:
...
containers:
  - name: elastic-agent
    image: docker.elastic.co/beats/elastic-agent:8.4.3
    lifecycle:
      postStart:
        exec:
          command: ["/bin/bash", "-c", "chown -R root:root /usr/share/elastic-agent/data/*/install/osquerybeat-8.4.3-linux-x86_64/"]
etc...
Be careful to use the osquerybeat version that matches your elastic agent in the command.
For the postStart hook, you could wildcard the path (presumably there would only be one version in that directory?). Notably, this does require a restart of the elastic-agent container after osquerybeat install. I guess if you wanted to be really hacky, you could try to have this run periodically regardless of whether elastic-agent is installed.
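A minimal sketch of the wildcarded variant, written as a shell function so it can be tried outside the container. The helper name `fix_osquerybeat_owner` and its parameters are hypothetical; the directory layout is the one from the hook command above.

```shell
#!/bin/sh
# Re-own every osquerybeat install directory, whatever its version.
# Hypothetical helper: owner and base directory are parameters.
fix_osquerybeat_owner() {
    owner="$1"   # e.g. root:root
    base="$2"    # e.g. /usr/share/elastic-agent/data
    for dir in "$base"/*/install/osquerybeat-*/; do
        # If nothing matches, the pattern is left unexpanded; skip it.
        [ -d "$dir" ] || continue
        chown -R "$owner" "$dir"
    done
}

# Example, mirroring the hook command:
#   fix_osquerybeat_owner root:root /usr/share/elastic-agent/data
```

Because the version is matched by a glob, the command does not need editing on each agent upgrade, but it still only helps if it runs after osquerybeat has been unpacked.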
Just doing a bit of our own DIY sleuthing:
Using postStart has no timing guarantees: by the time the postStart hook is scheduled to run, the agent will already have started, and osquery may already have started and died.
A more robust workaround would be to add a new startup/entrypoint script via a ConfigMap, to correct the file ownership before the agent has a chance to start:
ConfigMap:
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: elastic-agent-k8s-scripts
data:
  pre-entrypoint.sh: |
    #!/bin/sh
    chown -R root:root /usr/share/elastic-agent
    exec /usr/local/bin/docker-entrypoint "$@"
Now edit the agent's mounts to include the ConfigMap:
...
    volumeMounts:
      ...
      - name: extra-scripts
        mountPath: /var/local/pre-entrypoint.sh
        subPath: pre-entrypoint.sh
        readOnly: true
volumes:
  ...
  - name: extra-scripts
    configMap:
      name: elastic-agent-k8s-scripts
      defaultMode: 0754
...
Then change the container start command:
...
containers:
  - name: elastic-agent-k8s
    image: docker.elastic.co/beats/elastic-agent:8.12.2
    command: ['/var/local/pre-entrypoint.sh']
...
The long-term fix, I think, might be for the container image to be built with its files owned by UID 0, to match how elastic-agent will run.
Just to document some of the DM conversations:
It looks like there is some miscommunication between the teams. Some are assuming that the agent always runs under an unprivileged user in k8s. Others, namely Security, require the agent to run as root. The current agent files have a mix of owners: elastic-agent and root. The *.yml files are specifically set to be owned by root. https://github.com/elastic/elastic-agent/blob/main/dev-tools/packaging/templates/docker/Dockerfile.elastic-agent.tmpl#L148
It looks like we need a consistent approach and a consistent story for our users. And if we are to support both scenarios, this should be documented, since, as of now, the instructions in Kibana will lead to a broken osquery integration.
This will be resolved by https://github.com/elastic/elastic-agent/pull/4925
Hi, I deployed Endpoint and Agent 8.4.x into the daemonset on my self-managed k8s cluster per this yaml, and then deployed osquery via Fleet Integrations.
Upon deployment, I see this error in the agent logs:
This suggests that osquery extensions must not have write permissions for non-privileged accounts.
Yet after install, if I ssh into the daemonset I see this:
elastic-agent has write priv to osquery-extension.ext, which is triggering that error.
If I chown root:root osquery-extension.ext in the elastic agent container in the daemonset, osquery works as expected.
Seems like osquery-extension.ext needs to somehow be owned by root when installed into k8s daemonset via Agent via Integrations?
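A minimal sketch for spotting the condition described above, assuming the check is about the extension file being owned by someone other than root or being writable by group/others. That reading comes from the comment above rather than from anything the thread confirms about osquery's exact rule, and the helper name `find_unsafe_extensions` is hypothetical.

```shell
#!/bin/sh
# Report osquery-extension.ext files that are either not owned by the trusted UID
# or writable by group or others. Hypothetical diagnostic helper; read-only.
find_unsafe_extensions() {
    trusted_uid="$1"   # 0 for root
    base="$2"          # directory to search
    find "$base" -type f -name 'osquery-extension.ext' \
        \( ! -user "$trusted_uid" -o -perm -020 -o -perm -002 \)
}

# Example (inside the agent container):
#   find_unsafe_extensions 0 /usr/share/elastic-agent/data
```

Any path printed is a candidate for the `chown root:root` fix; an empty result means no matching file was flagged.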