intel / intel-cmt-cat

User space software for Intel(R) Resource Director Technology
http://www.intel.com/content/www/us/en/architecture-and-technology/resource-director-technology.html

[seek for advice] how to close pqos cleanly as non-root user #198

Closed guoqiao closed 2 years ago

guoqiao commented 2 years ago

I am opening this issue to ask the official pqos team for advice.

Our team is trying to use the [telegraf intel RDT plugin](https://github.com/influxdata/telegraf/blob/master/plugins/inputs/intel_rdt/README.md) on Ubuntu servers in production.

pqos must run as root, but on Ubuntu telegraf runs as the non-root user telegraf. We are able to let telegraf start pqos with a sudoers file rule, e.g.:

cat /etc/sudoers.d/telegraf_intel_rdt
Cmnd_Alias PQOS = /usr/sbin/pqos -r --iface-os --mon-file-type=csv --mon-interval=*
telegraf ALL=(root) NOPASSWD: PQOS
Defaults!PQOS !logfile, !syslog, !pam_session

However, we have trouble getting telegraf to shut down pqos when the telegraf service is stopped or the RDT plugin is disabled. We had discussions in telegraf #9527 and telegraf #9501 but have not found a solution yet.

An example of the trouble we have:

In telegraf #9527, our solution was to kill pqos from the telegraf RDT plugin's Go code like this:

cmd := exec.Command("sudo", "killall", "-SIGINT", "pqos")

which introduces multiple issues:

Any advice from your team will be appreciated. Thank you.

kmabbasi commented 2 years ago

You can create a symlink to pqos for telegraf, e.g. ln -s pqos pqos-telegraph

This way, killing pqos-telegraph would only kill the pqos processes that belong to telegraf.
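A minimal Go sketch of how the plugin's kill call might look with this approach, adapted from the `exec.Command` line quoted earlier in the thread. The helper name `killCmd` is hypothetical, and the sketch assumes that pqos was started through the `pqos-telegraph` symlink and that the sudoers rules permit this `killall` invocation; neither is confirmed in this thread.

```go
package main

import (
	"fmt"
	"os/exec"
)

// killCmd (hypothetical helper) builds, but does not run, a command that
// sends SIGINT only to processes whose name matches the given string.
// With the symlink approach, name would be "pqos-telegraph" rather than "pqos",
// so pqos instances started by other users or tools are left alone.
func killCmd(name string) *exec.Cmd {
	return exec.Command("sudo", "killall", "-SIGINT", name)
}

func main() {
	cmd := killCmd("pqos-telegraph")
	// Print the argument vector instead of running it, so the sketch is
	// safe to execute on a machine without pqos or sudo rights.
	fmt.Println(cmd.Args)
	// To actually send the signal, one would call cmd.Run() and check the error.
}
```

Matching on the symlink name relies on the process name being taken from the path that was executed, so the sudoers command alias would presumably also have to point at the symlink path.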

Thanks, Khawar

guoqiao commented 2 years ago

@kmabbasi We have discovered that the telegraf stop timeout issue was actually caused by a logging bug in its intel_rdt.go plugin: https://github.com/influxdata/telegraf/issues/9844

With that fixed, systemd can stop pqos without waiting.

Thanks for the advice. I will close this issue.