Closed: sventor closed this issue 3 years ago
Hi
Yes, I can have a look this week. The second error looks like a problem with the parsing regex; I will need your ibqueryerrors output to check the syntax.
Hello, I just tested this exporter and was seeing the same error. I tracked it down to the regex on line 228 and to one of the ports in our config being set to FDR10.
Ex:
Link info: 90 33[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 0xe41d2d030083b980 1 33[ ] "RemSwitchName" ( )
Link info: 90 35[ ] ==( 4X 10.0 Gbps (FDR10) Active/ LinkUp)==> 0xe41d2d030083b980 1 35[ ] "RemSwitchName" ( Could be 14.0625 Gbps)
The regex in the exporter code is looking specifically for the phrase "Gbps Active" so it fails to match the second example line above for me.
I hope this helps, --Roy
Hi, I slightly modified the regex in commit 48e9b58 so that it ignores the "(FDR10)" part of the string. If this fix works for your systems, I will merge it.
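For anyone hitting a similar parsing failure, here is a minimal sketch of the idea: allow an optional parenthesised annotation between "Gbps" and "Active". This is not the regex from the exporter or from commit 48e9b58; the pattern, the group names (lid, port, and so on) and the parse_link helper are assumptions made for illustration, checked only against the two sample lines quoted above.

```python
import re

# Hypothetical pattern (NOT the exporter's actual regex). The optional
# non-capturing group skips an annotation such as "(FDR10)" that may
# appear between "Gbps" and "Active". Group names are assumed labels.
LINK_RE = re.compile(
    r'Link info:\s+(?P<lid>\d+)\s+(?P<port>\d+)\[.*?\]\s+'
    r'==\(\s*(?P<width>\d+X)\s+(?P<speed>[\d.]+)\s+Gbps'
    r'(?:\s+\([^)]*\))?'                      # optional "(FDR10)"-style note
    r'\s+Active/\s*LinkUp\)==>\s+'
    r'(?P<remote_guid>0x[0-9a-f]+)\s+(?P<remote_lid>\d+)\s+'
    r'(?P<remote_port>\d+)\[.*?\]\s+"(?P<remote_name>[^"]*)"'
)

# The two sample lines from this thread.
SAMPLES = [
    'Link info: 90 33[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> '
    '0xe41d2d030083b980 1 33[ ] "RemSwitchName" ( )',
    'Link info: 90 35[ ] ==( 4X 10.0 Gbps (FDR10) Active/ LinkUp)==> '
    '0xe41d2d030083b980 1 35[ ] "RemSwitchName" ( Could be 14.0625 Gbps)',
]


def parse_link(line):
    """Return the named fields of a 'Link info' line, or None if no match."""
    match = LINK_RE.search(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    for sample in SAMPLES:
        print(parse_link(sample))
```

Both sample lines should yield a dictionary of fields rather than None. Other link states or annotations in real ibqueryerrors output may need further adjustments.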
Yes, that works for me. I am able to start the exporter with that change.
--Roy
Dear Simon,
your tool is greatly appreciated since the idea to gather all IB statistics from one place (instead of from a thousand separate "node-exporter" instances on all compute nodes) is the best, seeing that all IB fabric counters are accessible from just one node.
However, infiniband-exporter.py fails at interpreting our ibqueryerrors output. Firstly, it refused to work without a --node-name-map file (even though our nodes are explicitly named in the ibqueryerrors output):
Creating such a node-name-map file and running again yields, secondly, another error:
I tried to fix it myself, but I am in no way a Python programmer, just an administrator. Could you find the time to have a look at that and possibly fix the problem?
If you want, I can send you an example output of our fabric. Thanks!