Open devinmatte opened 4 years ago
We had something similar happen. Do you have SELinux on? If so, the DD System Probe doesn't work out of the box with SELinux. We had to make a policy to let the System Probe service get past SELinux to make it work.
Yes we have SELinux on. What policy was that? I also find it curious that RHEL8 doesn't have this issue despite also having SELinux on
I'm seeing this as well, using the stock datadog-agent image.
Using the audit2allow tool and some manual editing, we got this policy working for us.

```
module spc_bpf_allow 1.0;

require {
	type spc_t;
	class bpf { map_create map_read map_write prog_load prog_run };
}

#============= spc_t ==============
allow spc_t self:bpf { map_create map_read map_write prog_load prog_run };
```
Also, there are certain syscalls needed for the system-probe container as well; make sure you are using the latest Helm chart.
Edit: Added note about syscalls and a more restrictive selinux policy.
@devinmatte @adamf Are you still seeing this issue? We do have SELinux policies available if you aren't using our Helm chart. https://docs.datadoghq.com/network_performance_monitoring/installation/?tab=agent#selinux-enabled-systems
We've been seeing a similar issue; how would you suggest addressing this if we are using the Helm chart?
@bbensky I'd make sure you are using the latest version of the chart. If that doesn't fix it, please post more details so we can dig in.
Thanks @brycekahle. I am using the newest version of the chart but still getting the same error at container startup. However, one of our networking folks is working on setting up an SELinux policy to make this work.
I just tried to reproduce this issue on OpenShift 3.11 and I confirm that, on RHEL 7.7, the system-probe running with the spc_t SELinux type isn't allowed to perform eBPF operations.
I managed to get it fixed with the same policy as @kylegoch.
```shell
$ cat >allow_spc_bpf.te <<EOF
module allow_spc_bpf 1.0;

require {
	type spc_t;
	class bpf { map_create map_read map_write prog_load prog_run };
}

#============= spc_t ==============
allow spc_t self:bpf { map_create map_read map_write prog_load prog_run };
EOF
$ checkmodule -M -m -o allow_spc_bpf.mod allow_spc_bpf.te
$ semodule_package -o allow_spc_bpf.pp -m allow_spc_bpf.mod
$ semodule -i allow_spc_bpf.pp
```

After this, the system-probe container could start properly on RHEL 7.7.
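The policy @kylegoch posted and the build steps above both start from running audit2allow over the AVC denials the kernel logged and then trimming the output to the rule actually needed. The following is an added example, not Datadog's code and not anything the commenters posted: a small Python equivalent of that step. It parses AVC denial records (the sample records are constructed to look like what ausearch -m avc prints while system-probe is blocked), aggregates the denied permissions per source type, target and class, and renders a policy module of the same shape as the ones above. The only_classes filter is the "manual editing" step: it keeps the module limited to the bpf class so that an unrelated denial in the same log does not get allowed by accident.

```python
"""Minimal audit2allow-style generator for SELinux AVC denials.

Added example (not Datadog's or the thread participants' code). Parses AVC
denial records, aggregates permissions per (source type, target, class), and
emits a .te module body in the same shape as the modules posted in the thread.
"""
import re
from collections import defaultdict

# Constructed sample audit records (not captured from a real host). The first
# four are what system-probe's eBPF calls produce under spc_t; the last one is
# an unrelated denial included to show the class filter at work.
SAMPLE_AVC = """\
type=AVC msg=audit(1585000000.101:301): avc:  denied  { map_create } for  pid=4242 comm="system-probe" scontext=system_u:system_r:spc_t:s0 tcontext=system_u:system_r:spc_t:s0 tclass=bpf permissive=0
type=AVC msg=audit(1585000000.102:302): avc:  denied  { prog_load } for  pid=4242 comm="system-probe" scontext=system_u:system_r:spc_t:s0 tcontext=system_u:system_r:spc_t:s0 tclass=bpf permissive=0
type=AVC msg=audit(1585000000.103:303): avc:  denied  { map_read map_write } for  pid=4242 comm="system-probe" scontext=system_u:system_r:spc_t:s0 tcontext=system_u:system_r:spc_t:s0 tclass=bpf permissive=0
type=AVC msg=audit(1585000000.104:304): avc:  denied  { prog_run } for  pid=4242 comm="system-probe" scontext=system_u:system_r:spc_t:s0 tcontext=system_u:system_r:spc_t:s0 tclass=bpf permissive=0
type=AVC msg=audit(1585000000.105:305): avc:  denied  { read } for  pid=4242 comm="system-probe" name="kallsyms" dev="proc" ino=4026532001 scontext=system_u:system_r:spc_t:s0 tcontext=system_u:object_r:proc_t:s0 tclass=file permissive=0
"""

# Matches the permission set, source/target contexts and object class of one record.
AVC_RE = re.compile(
    r"avc:\s+denied\s+\{\s*(?P<perms>[^}]+?)\s*\}.*?"
    r"scontext=(?P<scontext>\S+)\s+tcontext=(?P<tcontext>\S+)\s+tclass=(?P<tclass>\S+)"
)


def _type_of(context):
    # SELinux context is user:role:type:level; the type is the third field.
    return context.split(":")[2]


def parse_denials(text):
    """Return {(source_type, target, tclass): set(perms)}.

    target is the literal string 'self' when source and target types match,
    which is why the thread's rule reads `allow spc_t self:bpf ...`.
    """
    denials = defaultdict(set)
    for line in text.splitlines():
        m = AVC_RE.search(line)
        if not m:
            continue  # not an AVC denial record
        stype = _type_of(m.group("scontext"))
        ttype = _type_of(m.group("tcontext"))
        target = "self" if stype == ttype else ttype
        denials[(stype, target, m.group("tclass"))].update(m.group("perms").split())
    return dict(denials)


def generate_te(denials, module_name, version="1.0", only_classes=None):
    """Render a policy module; only_classes restricts which object classes are
    allowed, so an unrelated denial in the same log is not granted by accident."""
    rules = {k: v for k, v in denials.items() if only_classes is None or k[2] in only_classes}
    types = sorted({k[0] for k in rules} | {k[1] for k in rules if k[1] != "self"})
    classes = defaultdict(set)
    for (_, _, tclass), perms in rules.items():
        classes[tclass] |= perms
    out = [f"module {module_name} {version};", "", "require {"]
    out += [f"\ttype {t};" for t in types]
    out += [f"\tclass {c} {{ {' '.join(sorted(p))} }};" for c, p in sorted(classes.items())]
    out += ["}", ""]
    for (stype, target, tclass), perms in sorted(rules.items()):
        out.append(f"#============= {stype} ==============")
        out.append(f"allow {stype} {target}:{tclass} {{ {' '.join(sorted(perms))} }};")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    found = parse_denials(SAMPLE_AVC)
    # Emit only the bpf rule; the proc file denial stays visible for a separate decision.
    print(generate_te(found, "allow_spc_bpf", only_classes={"bpf"}))
    # On the host, build and load the result with the same commands the thread shows:
    #   checkmodule -M -m -o allow_spc_bpf.mod allow_spc_bpf.te
    #   semodule_package -o allow_spc_bpf.pp -m allow_spc_bpf.mod
    #   semodule -i allow_spc_bpf.pp
```

The class filter is the part worth keeping when adapting this: allowing every denial a log contains is how an SELinux module ends up broader than the service needs, and in the constructed sample the unrelated file denial would otherwise be granted alongside the bpf permissions.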
The disabling SELinux solution works for me.
Output of the info page (if this is a bug)
Describe what happened:
datadog-agent-sysprobe enters a failed state with this in the logs:

Describe what you expected:
Expected sysprobe to start running and collecting data

Steps to reproduce the issue:
Attempt to start datadog-agent-sysprobe via systemctl start datadog-agent-sysprobe

Additional environment details (Operating System, Cloud provider, etc):
Issue occurring on RHEL 7.x (both 7.7 and 7.6)