Open 1234Erwan opened 2 months ago
Hey thanks a lot for your contribution! We'll review your PR ASAP :+1:
I thought about a solution like this instead of managing an index.
--- a/SOURCES/etc/xapi.d/plugins/smartctl.py
+++ b/SOURCES/etc/xapi.d/plugins/smartctl.py
@@ -14,36 +14,31 @@ def _list_disks():
disks = []
result = run_command(['smartctl', '--scan'])
for line in result['stdout'].splitlines():
- disks.append(line.split()[2])
- disks.append(line.split()[0])
+ disks.append({'name': line.split()[0], 'type': line.split()[2]})
return disks
@error_wrapped
def get_information(session, args):
results = {}
- i = 0
with OperationLocker():
disks = _list_disks()
for disk in disks:
- cmd = run_command(["smartctl", "-j", "-a", "-d", disks[i], disks[i+1]], check=False)
- results[disk] = json.loads(cmd['stdout'])
- i = i + 2
+ cmd = run_command(["smartctl", "-j", "-a", "-d", disk['type'], disk['name']], check=False)
+ results[disk['name']] = json.loads(cmd['stdout'])
return json.dumps(results)
@error_wrapped
def get_health(session, args):
results = {}
- i = 0
with OperationLocker():
disks = _list_disks()
for disk in disks:
- cmd = run_command(["smartctl", "-j", "-H", "-d", disks[i], disks[i+1]])
+ cmd = run_command(["smartctl", "-j", "-H", "-d", disk['type'], disk['name']])
json_output = json.loads(cmd['stdout'])
if json_output['smart_status']['passed']:
- results[disk] = "PASSED"
+ results[disk['name']] = "PASSED"
else:
- results[disk] = "FAILED"
- i = i + 2
+ results[disk['name']] = "FAILED"
return json.dumps(results)
How do we move on on this PR?
@
How do we move on on this PR?
I think that managing index is error prone and IMHO using the solution I posted looks better but I don't have a strong opinion so if it is ok for you we can merge it and fix it later. @1234Erwan what do you think?
Also I just tested the patch on a physical machine that has a megaraid and it fails:
', stderr: '', Traceback (most recent call last):
File "/etc/xapi.d/plugins/xcpngutils/__init__.py", line 127, in wrapper
return func(*args, **kwds)
File "/etc/xapi.d/plugins/smartctl.py", line 40, in get_health
cmd = run_command(["smartctl", "-j", "-H", "-d", disks[i], disks[i+1]])
File "/etc/xapi.d/plugins/xcpngutils/__init__.py", line 79, in run_command
raise ProcessException(code, command, stdout, stderr)
ProcessException: Command '['smartctl', '-j', '-H', '-d', 'megaraid,0', '/dev/bus/0']' failed with code: 4
In fact the issue is because the command smartctl -j -H -d megaraid,0 /dev/bus/0
returns an output:
[10:01 r620-x1 ~]# smartctl -j -H -d megaraid,0 /dev/bus/0
{
"json_format_version": [
1,
0
],
"smartctl": {
"version": [
7,
0
],
"svn_revision": "4883",
"platform_info": "x86_64-linux-4.19.0+1",
"build_info": "(local build)",
"argv": [
"smartctl",
"-j",
"-H",
"-d",
"megaraid,0",
"/dev/bus/0"
],
"messages": [
{
"string": "Warning: This result is based on an Attribute check.",
"severity": "warning"
}
],
"exit_status": 4
},
"device": {
"name": "/dev/bus/0",
"info_name": "/dev/bus/0 [megaraid_disk_00] [SAT]",
"type": "sat+megaraid,0",
"protocol": "ATA"
},
"smart_status": {
"passed": true
}
}
But the return code is not 0... it is 4. So we need to understand if the error is "normal" and if yes fix it to be able to use megaraid. Note: this error happens only with health function, the information function works well. It is probably why you didn't notice it @1234Erwan . @1234Erwan I will take some time and I will have a look to how to fix it.
In fact the exit code status means : SMART status check returned "DISK FAILING".
So I think that we need to add a better check of exit status because return values of smartctl are defined by a bitmask so we can exctract usefull information for the user.
my python skills are quite limited, what is sure is that even if my script is not clean at all it works on my hard raid
my python skills are quite limited, what is sure is that even if my script is not clean at all it works on my hard raid
When you say it works you mean also the xe host-call-plugin host-uuid=<uuid> plugin=smartctl.py fn=health
right?
No worries for the skills it already help us :+1: I will try to understand why I have the issue and we will update your PR to pass the checks. Thanks for the answer.
my python skills are quite limited, what is sure is that even if my script is not clean at all it works on my hard raid
When you say it works you mean also the
xe host-call-plugin host-uuid=<uuid> plugin=smartctl.py fn=health
right? No worries for the skills it already help us ๐ I will try to understand why I have the issue and we will update your PR to pass the checks. Thanks for the answer.
like 2 month ago i have replaced the original plugin in my xcp-ng with my last experiment and it's working on my hard RAID.
@gthvn1 Thanks for all your suggestion and for your help, i'v just implemented your idea. It works for me I'm waiting to know if it works for you too.
It's done, you've finnaly wrote the entierly of the code yourself ๐ .
Your code helps us to think about the issue and even if it changed a little bit it helped us a lot :+1: Also you confirm that your last commit is working well on your host right?
And for information I have created a branch gtn-pr-46 that has your current commit. The reason is because as it is a local branch it triggers more check (like pytest and pycodestyle) than yours. So don't worry I will use this new branch and fix the tests but your commits will be in. I will probably squash them into one commit but it will be there and your contribution will remain :)
Fine, do as you please. I don't care much for my meager contribution. I much prefer to know that smartctl will work for everyone.
Your code helps us to think about the issue and even if it changed a little bit it helped us a lot ๐ Also you confirm that your last commit is working well on your host right?
No it didn't working, as you've said earlier the exit status is not 0.
Try to add hard raid support for smartctl.py