distributed-system-analysis / pbench

A benchmarking and performance analysis framework
http://distributed-system-analysis.github.io/pbench/
GNU General Public License v3.0

Run pbench as a user other than root #89

Open akrzos opened 9 years ago

akrzos commented 9 years ago

While trying to run pbench as a user that doesn't exist on the remote nodes, I was able to register a tool, but pbench now reports an additional host named "root".

Ideally, I could set what user to run the tool under.

[stack@manager ~]$ register-tool --name=mpstat --remote=root@192.0.2.12
[root@192.0.2.12]Package pbench-sysstat-11.1.2-32.el7.centos.x86_64 already installed and latest version
[root@192.0.2.12]mpstat tool is now registered in group default
[stack@manager ~]$ list-tools
Permission denied (publickey,gssapi-keyex,gssapi-with-mic).
Permission denied (publickey,gssapi-keyex,gssapi-with-mic).
ssh: Could not resolve hostname root: Name or service not known
default: root[],192.0.2.12[],192.0.2.11[]
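
The list-tools output above suggests the user@host form is not being treated as a single remote. As a minimal sketch only (this is not pbench's actual parsing, and split_remote is a hypothetical helper name), a remote spec could be split into user and host like this:

```shell
#!/bin/sh
# Hypothetical helper: split "user@host" into remote_user and remote_host.
# A spec without "@" leaves remote_user empty (meaning: the current user).
split_remote() {
    case "$1" in
        *@*) remote_user=${1%%@*}; remote_host=${1#*@} ;;
        *)   remote_user=;         remote_host=$1 ;;
    esac
}

split_remote root@192.0.2.12
echo "user=${remote_user:-<current>} host=$remote_host"
```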

akrzos commented 9 years ago

As a workaround, you can set up the user's ssh config file:

/home/stack/.ssh/config

hostname 192.0.2.12
user root

This makes the stack user log in as root on the remote machine in the example above, which allows tools to be registered correctly. Make sure you set the permissions on .ssh/config correctly: ssh rejects a config file that is writable by group or others.
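
A minimal sketch of creating such a file with restrictive permissions follows. It assumes a Host block is wanted so that the User setting applies only to that one address, which differs slightly from the two lines shown above. SSH_DIR is a hypothetical variable for this example; without it, the sketch writes to a scratch directory rather than the real ~/.ssh.

```shell
#!/bin/sh
# Sketch only: writes to a scratch directory unless SSH_DIR (hypothetical,
# for this example) points elsewhere, e.g. /home/stack/.ssh.
ssh_dir=${SSH_DIR:-$(mktemp -d)}
mkdir -p "$ssh_dir"
chmod 700 "$ssh_dir"

# Assumption: a Host block scopes "User root" to this single address.
cat > "$ssh_dir/config" <<'EOF'
Host 192.0.2.12
    User root
EOF

# Owner-only read/write, so group and others cannot modify the file.
chmod 600 "$ssh_dir/config"
ls -l "$ssh_dir/config"
```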

ndokos commented 9 years ago

Can you post your tools.$group file as it looks after running 'register-tool'?

ndokos commented 9 years ago

Some more errors from akrzos:

After Rally task start.

/opt/pbench-agent/util-scripts/sysinfo-dump: line 36: pushd: /var/log/libvirt: Permission denied
/opt/pbench-agent/util-scripts/sysinfo-dump: line 38: pushd: /etc/libvirt: Permission denied
tar: overcloud-controller-0/block-params.log: time stamp 2015-10-06 13:51:56 is 854.132324512 s in the future
tar: overcloud-controller-0/libvirt/log: time stamp 2015-10-06 13:51:56 is 854.131651662 s in the future

akrzos commented 9 years ago

The timestamps in the future are a non-issue: apparently OSPd didn't have ntp set up, and that is now fixed. However, I can't seem to solve the permissions issue without breaking other things.

I tried to run sudo user-benchmark ..., but that fails with "user-benchmark not found". To get past that, you must edit your sudoers file and change the secure_path line to:

Defaults secure_path = /sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin:/opt/pbench-agent/util-scripts:/opt/pbench-agent/bench-scripts
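
To see why the lookup fails, it can help to check a command name against a secure_path value by hand. The following is a rough sketch; found_in_path is a hypothetical helper written for this example and is not part of pbench or sudo.

```shell
#!/bin/sh
# Hypothetical helper: print the first match for command $1 in the
# colon-separated directory list $2; return non-zero if there is none.
found_in_path() {
    old_ifs=$IFS
    IFS=:
    for dir in $2; do
        if [ -x "$dir/$1" ] && [ ! -d "$dir/$1" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir/$1"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# The restricted path versus the extended secure_path value from this comment.
default_path=/sbin:/bin:/usr/sbin:/usr/bin
extended_path=$default_path:/usr/local/bin:/opt/pbench-agent/util-scripts:/opt/pbench-agent/bench-scripts

for p in "$default_path" "$extended_path"; do
    if found_in_path user-benchmark "$p" >/dev/null; then
        echo "user-benchmark found under: $p"
    else
        echo "user-benchmark NOT found under: $p"
    fi
done
```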

The problem then is that my user-benchmark script really needs to run as user stack and source the correct environment, so the benchmark can't be run this way. I might have to resort to the start/stop tool scripts, since I can run those under sudo without affecting the actual benchmarking tool (Rally).

ndokos commented 9 years ago

Try something like this:

sudo -u stack bash -c 'source /etc/profile.d/pbench-agent.sh; user-benchmark ....'

You may have to add the uid of the invoker to the appropriate group (usually "wheel" on RH/Fedora/CentOS).

Does this work for you?

ndokos commented 8 years ago

This is from some mail that I exchanged with Alex, added here for future reference:

Alex has been trying to do this and has been running into problems. The basic ones are:

o ownership of /var/lib/pbench-agent - that's where the results directory is created.

o ownership of /opt/pbench-agent/id_rsa - that needs to be readable in order for move-results to succeed - the way it's installed, it's owned by pbench.pbench with mode 600. I think it needs to be 600, so the only solution seems to be to change the owner or use multiple sudo invocations: an outer one to user "pbench" to allow user-benchmark to get at the key file and an inner one to user "stack" (e.g.) to run the script.

o collect-sysinfo problems - I get the following:

,----
| + collect-sysinfo --group=default
| --dir=/var/lib/pbench-agent/user-benchmark_akrzos_2015-10-07_12:52:24
| end
| Collecting system information
| /opt/pbench-agent/util-scripts/sysinfo-dump: line 36: pushd: /var/log/libvirt: Permission denied
| /opt/pbench-agent/util-scripts/sysinfo-dump: line 38: pushd: /etc/libvirt: Permission denied
`----

There are a couple of methods we could use to address these problems:

o per-user config file in e.g. ~/.config/pbench-agent/pbench-agent.conf

That can override /var/lib/pbench-agent as the default run directory, resolving the first problem.

o Add the user to the pbench group - that would resolve the second problem.

o Run collect-sysinfo under sudo - that would require that mods be made to /etc/sudoers (e.g. allowing %wheel no-password sudo privileges and adding the user to the "wheel" group).
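
As a rough illustration of the first method, here is how a script might prefer a per-user setting over the system-wide default. The "run_dir" key name and both function names are hypothetical, made up for this sketch; pbench's real configuration format may look quite different.

```shell
#!/bin/sh
# System-wide default taken from this thread; everything else is hypothetical.
default_run_dir=/var/lib/pbench-agent
user_conf="${XDG_CONFIG_HOME:-$HOME/.config}/pbench-agent/pbench-agent.conf"

# Print the value of a hypothetical "run_dir = ..." line in file $1, if any.
run_dir_from() {
    if [ -r "$1" ]; then
        sed -n 's/^[[:space:]]*run_dir[[:space:]]*=[[:space:]]*//p' "$1" | tail -n 1
    fi
}

# Use the per-user value when present, otherwise fall back to the default.
resolve_run_dir() {
    dir=$(run_dir_from "$1")
    printf '%s\n' "${dir:-$default_run_dir}"
}

resolve_run_dir "$user_conf"
```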

As an experiment, I let the wheel group do anything in /etc/sudoers, added the pbench user (in addition to my own user id) to the "wheel" group and added a "stack" user with no special privileges.

With those (or similar) modifications, I can run the following script[fn:1]:

#! /bin/bash

sudo -u stack bash -c 'echo $PATH; id; sleep 10'

under a somewhat modified user-benchmark with the following output (I've annotated parts of the output using the markup '#### here is an annotation'):

#### running user-benchmark under user pbench solves the id_rsa ownership problem here.

#### Note also that sudo resets the path, so we need to source /etc/profile.d/pbench-agent.sh explicitly.

$ sudo -u pbench bash -c '. /etc/profile.d/pbench-agent.sh; user-benchmark --config=akrzos -- /tmp/my-user-benchmark-script'

+ id      #### added to user-benchmark for debugging purposes - running as pbench.
uid=1001(pbench) gid=1001(pbench) groups=1001(pbench),10(wheel) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023

+ export benchmark config

#### in this case, I chowned /var/lib/pbench-agent to pbench.pbench, but config file is probably better.
+ metadata-log --group=default --dir=/var/lib/pbench-agent/user-benchmark_akrzos_2015-10-07_13:03:31 beg
+ start-tools --group=default --iteration=1 --dir=/var/lib/pbench-agent/user-benchmark_akrzos_2015-10-07_13:03:31/1/reference-result
+ echo 'Running /tmp/my-user-benchmark-script'
Running /tmp/my-user-benchmark-script
+ log '[user-benchmark] Running /tmp/my-user-benchmark-script'
++ timestamp
+++ date +%Y%m%d_%H:%M:%S.%N
++ echo 20151007_13:03:32.145484021
+ debug_date=20151007_13:03:32.145484021
+ echo '[info][20151007_13:03:32.145484021] [user-benchmark] Running /tmp/my-user-benchmark-script'
+ /tmp/my-user-benchmark-script
+ tee /var/lib/pbench-agent/user-benchmark_akrzos_2015-10-07_13:03:31/1/reference-result/result.txt

#### the sudo inside the script resets the PATH - the script has to fix that if necessary
/sbin:/bin:/usr/sbin:/usr/bin
#### the sudo -u stack inside the script is effective
uid=1002(stack) gid=1002(stack) groups=1002(stack) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023

+ stop-tools --group=default --iteration=1 --dir=/var/lib/pbench-agent/user-benchmark_akrzos_2015-10-07_13:03:31/1/reference-result
/opt/pbench-agent/tool-scripts/perf: line 139: kill: (18265) - No such process
+ postprocess-tools --group=default --iteration=1 --dir=/var/lib/pbench-agent/user-benchmark_akrzos_2015-10-07_13:03:31/1/reference-result

#### and the sudo here is a mod to user-benchmark - needs to be careful with quoting, but it is effective.
+ sudo bash -c 'source /etc/profile.d/pbench-agent.sh; collect-sysinfo --group=default --dir=/var/lib/pbench-agent/user-benchmark_akrzos_2015-10-07_13:03:31 end'
Collecting system information

+ rmdir /var/lib/pbench-agent/user-benchmark_akrzos_2015-10-07_13:03:31/1/reference-result/.running

So it looks possible and without too many modifications.

Thoughts?

Footnotes:

[fn:1] I could not get the quoting right - that's why I created the script.
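
For the nested quoting problem, one possible approach is to quote each level mechanically instead of by hand. This is only a sketch: sq is a hypothetical helper, and plain sh -c stands in for the sudo -u pbench / sudo -u stack layers so that it can be tried without any sudo setup. In bash, printf '%q' can serve the same purpose.

```shell
#!/bin/sh
# Hypothetical helper: wrap $1 in single quotes, rewriting each embedded
# single quote as '\'' so the string survives one more level of shell parsing.
sq() {
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# Innermost command (what would run as the unprivileged user).
inner='printf "%s\n" "a  b"'

# Each layer wraps the previous one; "sh -c" stands in for "sudo -u USER bash -c".
level1="sh -c $(sq "$inner")"
level2="sh -c $(sq "$level1")"

echo "$level2"    # show the fully quoted command line
sh -c "$level2"   # run it through both layers
```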

ndokos commented 8 years ago

Currently, the private key file is installed with mode 640 and ownership pbench.pbench. If you add a user to the pbench group, then that user can read the key file. This does not work for the pbench user itself though:

Permissions 0640 for '/opt/pbench-agent/id_rsa' are too open.

It does work with any other user in the pbench group.
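
The behavior described here is consistent with a check that only applies when the user running ssh owns the key file. A small sketch of that logic follows; it reflects my understanding of the behavior, not the actual ssh source, and key_rejected is a hypothetical name. The uids 1001 (pbench) and 1002 are taken from the id output earlier in this thread.

```shell
#!/bin/sh
# Assumed logic: the key is rejected only if the invoking uid owns the file
# AND any group/other permission bit is set.
# $1 = octal mode of the key file, $2 = uid owning the file, $3 = uid running ssh
key_rejected() {
    [ "$2" -eq "$3" ] && [ $(( 0$1 & 077 )) -ne 0 ]
}

for uid in 1001 1002; do
    if key_rejected 640 1001 "$uid"; then
        echo "uid $uid: mode 640 is too open"
    else
        echo "uid $uid: mode 640 passes this check"
    fi
done
```

Passing this check says nothing about whether the user can actually read the file; that still depends on membership in the pbench group.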

ndokos commented 8 years ago

Issue #300 (opened by mistake and now closed) is the same issue.

ashishkamra commented 7 years ago

Needs further investigation.