python-diamond / Diamond

Diamond is a python daemon that collects system metrics and publishes them to Graphite (and others). It is capable of collecting cpu, memory, network, i/o, load and disk metrics. Additionally, it features an API for implementing custom collectors for gathering metrics from almost any source.
http://diamond.readthedocs.org/
MIT License

python-2.4 is not working #638

Closed bhepple closed 7 years ago

bhepple commented 7 years ago

The docs claim that python-2.4 is supported (eg on RHEL5) but diamond fails with:

ERROR: Failed to set UID/GID. 'module' object has no attribute 'initgroups'

/usr/bin/diamond:204: os.initgroups(pwd.getpwuid(uid).pw_name, gid)

shortdudey123 commented 7 years ago

We haven't tested against 2.5 in quite a while. 2.5 testing was removed in https://github.com/python-diamond/Diamond/commit/a08bc489751b2580f9e2e690c26efc20dd1df0c0 over 3 years ago, and 2.4 testing was never done in Travis. If 2.4 or 2.5 is broken, then we should probably drop support rather than fix it.

That being said, @bhepple, you are free to help work on fixing it. However, Travis no longer supports Python <2.6, so it getting broken again is fairly likely.

bhepple commented 7 years ago

python26 would be fine for RHEL5 - but the python27-isms kill that too

shortdudey123 commented 7 years ago

FWIW, RHEL 5 is EOL, so we won't support it. Which python27-isms don't work on RHEL 6/7?

bhepple commented 7 years ago

Please see original post - initgroups() is the first I ran into. I should have mentioned that this is the latest release Diamond-4.0.515

... we are running python26

Cheers

Bob

shortdudey123 commented 7 years ago

I am able to start diamond master on CentOS 6 with no issues

shortdudey123 commented 7 years ago

Added the Vagrant box I used for testing via https://github.com/python-diamond/Diamond/pull/643. Can you give me more info on how to replicate?

bhepple commented 7 years ago

Are you running python26? That's the default on RHEL6.

This snippet is isolated from line 204 of /usr/bin/diamond and it fails on RHEL6:

#!/usr/bin/env python
# coding=utf-8

import os
uid = 806 # our diamond user
gid = 751 # our diamond group
os.initgroups(pwd.getpwuid(uid).pw_name, gid)

$ python -V
Python 2.6.6

$ ./d
Traceback (most recent call last):
  File "./d", line 7, in <module>
    os.initgroups(pwd.getpwuid(uid).pw_name, gid)
AttributeError: 'module' object has no attribute 'initgroups'

https://docs.python.org/2/library/os.html states that:

os.initgroups(username, gid)
Call the system initgroups() to initialize the group access list with all of the groups of which the specified username is a member, plus the specified group id.

Availability: Unix.

New in version 2.7.

shortdudey123 commented 7 years ago

Oh i see, you are having diamond run under a different user (presumably with --user) and the code you mentioned does not get triggered normally since diamond is typically run as root.

What is the full command you use to start diamond?

shortdudey123 commented 7 years ago

What happens if you try this?

diff --git a/bin/diamond b/bin/diamond
index 74ef0ed6..05325271 100755
--- a/bin/diamond
+++ b/bin/diamond
@@ -201,7 +201,15 @@ def main():
             try:
                 if gid != -1 and uid != -1:
                     # Manually set the groups since they aren't set by default
-                    os.initgroups(pwd.getpwuid(uid).pw_name, gid)
+                    user = pwd.getpwuid(uid).pw_name
+
+                    # Python 2.7+
+                    if hasattr(os, 'initgroups'):
+                        os.initgroups(user, gid)
+                    # Python 2.6
+                    else:
+                        os.setgroups([e.gr_gid for e in grp.getgrall()
+                                      if user in e.gr_mem]  [gid])

                 if gid != -1 and os.getgid() != gid:
                     # Set GID

Based off of reading from https://groups.google.com/forum/#!topic/paste-users/KqZRujMcJHE

bhepple commented 7 years ago

[root@dl01aspall40v][~][1000]# /usr/bin/diamond -p /var/run/diamond.pid
WARN: Bogus pid file was found. I deleted it.
ERROR: Failed to set UID/GID. list index out of range

shortdudey123 commented 7 years ago

dang it 😞 i was afraid of that

So the "list index out of range" error happens because the user is not in the group that you pass, probably because the user and group are the same and Linux does not show the user in its own group. Let me keep poking at it.

bhepple commented 7 years ago

We run as user:group = diamond:diamond which is 806:752 on my test system

[root@dl01aspall40v][~][1004]# id diamond
uid=806(diamond) gid=752(diamond) groups=752(diamond),753(shadow) context=root:system_r:unconfined_t:s0-s0:c0.c1023

shortdudey123 commented 7 years ago

If you do a grep diamond /etc/group, what do you see?

bhepple commented 7 years ago

[root@dl01aspall40v][~][1005]# grep diamond /etc/group
diamond:x:752:
shadow:x:753:diamond

EDIT: I looked at the wrong machine

shortdudey123 commented 7 years ago

Hmm, looks like what I thought... diamond is not part of the diamond group. Let me do more poking.

bhepple commented 7 years ago

No - 'diamond' is the home group of the 'diamond' user so it doesn't need a separate entry in /etc/group.

I dunno why we put diamond in the shadow group - reasons lost in the mists of time - but I don't think it would affect the current problem. I took it out as an experiment but it came back with the same error.

bhepple commented 7 years ago

Do we need a '+' in that patch?

os.setgroups([e.gr_gid for e in grp.getgrall()
                                      if user in e.gr_mem]  + [gid])

... seems to be running without the error ... and I'm getting 'context switch' metrics on our Grafana. Now I need to track down what happened to the others! Perhaps another 2.7'ism???

So if you agree, we should probably check in that patch and close this buglet off. I'd be happier if the docs were also changed to say 'python-2.6 required' instead of 2.4 !!

Thanks for the help!
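
For reference, here is a minimal sketch of the group-list expression from the patch, written so that it can be checked without root. The helper name supplementary_gids, the Group namedtuple and the sample entries are hypothetical and not part of Diamond; only the list expression mirrors the patch. It also suggests why the version without the '+' can raise "list index out of range": the second bracket is then read as an index into the list rather than as a list to append.

```python
import collections

# Hypothetical stand-in for the entries returned by grp.getgrall(),
# so the example runs on any machine.
Group = collections.namedtuple("Group", ["gr_name", "gr_gid", "gr_mem"])


def supplementary_gids(user, gid, groups):
    """Return the gids of every group listing `user`, plus the given gid."""
    return [g.gr_gid for g in groups if user in g.gr_mem] + [gid]


# Sample entries modelled on the /etc/group output shown in this thread.
sample_groups = [
    Group("diamond", 752, []),
    Group("shadow", 753, ["diamond"]),
]

gids = supplementary_gids("diamond", 752, sample_groups)
print(gids)

# Without the '+', the second bracket indexes the list instead of extending it.
try:
    [g.gr_gid for g in sample_groups if "diamond" in g.gr_mem][752]
except IndexError as exc:
    print("IndexError:", exc)
```

On a real system the groups argument would presumably be grp.getgrall(), and the resulting list would be passed to os.setgroups(), which needs appropriate privileges (typically root).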

shortdudey123 commented 7 years ago

Going to close this since <2.6 is not supported anymore and the main thing keeping diamond from starting in 2.6 has been solved via https://github.com/python-diamond/Diamond/pull/650

If you come across anything else that does not work in 2.6, feel free to open a new issue or PR