Hi @phip123,
We have a script compatible with Nagios queries. The output of the Nagios script is JSON plus some Nagios flavor text. Running xbutil examine -d <BDF> -r <Report names> -o <Output file> will create a JSON file with the specified reports.
If using Nagios is out of the question you can invoke xbutil directly. xbtop provides the electrical, memory, and dynamic region reports. You could run xbutil examine -d <BDF> -r memory electrical dynamic-regions -o <Output file> and send the output JSON to where it is needed.
Is this what you are looking for? We are always looking for ways to improve!
Hi,
thanks for the comment. Sadly, I'm going to stick with Prometheus and I'm currently working on an exporter that uses xbutil2.
I'm using the xbutil command you suggested. Is there any way to direct the JSON output to stdout instead of a file? Currently it seems the only option is to let xbutil write the JSON file to disk and then read it back, which adds quite a bit of overhead.
Thanks!
Hi @phip123,
No worries. Thanks for mentioning Prometheus, looks interesting.
Unfortunately we cannot. The Nagios plugin works by creating a temporary file and then deleting it. We can get very close by running xbutil examine -d <BDF> -r memory electrical dynamic-regions -o /dev/stdout --force, but unfortunately that also includes the human-readable report. I'll see if I can add an option to silence the human-readable output. Seems like it could be useful.
Hi @dbenusov-xilinx,
then I'm following the same approach as Nagios, by creating and deleting a temporary file. Thanks, I think that would be nice!
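For reference, the temporary-file approach could be sketched roughly as follows. This is only an illustration, not the Nagios plugin's actual code: the function names, the xbutil.json file name, and the injectable run parameter are hypothetical, and the only xbutil flags used are the ones quoted in this thread. It assumes xbutil is on the PATH and writes valid JSON to the -o path.

```python
import json
import subprocess
import tempfile
from pathlib import Path


def build_examine_cmd(bdf, reports, out_path):
    """Build the xbutil examine command line quoted in this thread."""
    return ["xbutil", "examine", "-d", bdf, "-r", *reports, "-o", str(out_path)]


def read_reports(bdf, reports, run=subprocess.run):
    """Run xbutil, load the JSON it wrote, and remove the temporary file."""
    # A fresh directory yields an output path that does not exist yet,
    # so this sketch does not need to pass --force.
    with tempfile.TemporaryDirectory() as tmp:
        out_path = Path(tmp) / "xbutil.json"  # hypothetical file name
        # stdout is discarded because it carries the human-readable report.
        run(build_examine_cmd(bdf, reports, out_path),
            check=True, stdout=subprocess.DEVNULL)
        return json.loads(out_path.read_text())
    # The directory and the JSON file are deleted when the with-block exits.
```

If the disk round trip turns out to be the main cost, pointing TMPDIR at a RAM-backed filesystem (for example /dev/shm on Linux) may help, since Python's tempfile module honours that variable.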
Another, somewhat related question: is it possible that the xbutil command locks some resources? For example, on instances with multiple FPGAs installed there seems to be some locking going on, and running the xbutil commands in parallel even takes longer than running them one after another.
Hi @phip123, sorry for the delay. The user space code does not use mutexes for any reading operations. It does use mutexes to access the device table, which contains device handles, although I doubt that would cause much of a delay. The driver also has per-device locks, but if different devices are accessed this should not be an issue. From what I gathered, if the requests are done in parallel, the CPU could be the bottleneck.
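One way to check whether parallel invocations really are slower is to time both modes against the same list of devices. The sketch below is only a measurement harness under assumptions: time_sequential and time_parallel are hypothetical helper names, and task stands in for whatever callable runs xbutil for a single BDF. Threads are used because each call would mostly wait on a separate xbutil process.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def time_sequential(task, items):
    """Return the wall-clock seconds needed to run task on each item in turn."""
    start = time.perf_counter()
    for item in items:
        task(item)
    return time.perf_counter() - start


def time_parallel(task, items):
    """Return the wall-clock seconds needed to run task on all items at once."""
    start = time.perf_counter()
    # One worker per item; max(1, ...) keeps the pool valid for an empty list.
    with ThreadPoolExecutor(max_workers=max(1, len(items))) as pool:
        list(pool.map(task, items))  # list() waits for every call to finish
    return time.perf_counter() - start
```

If the parallel time is close to or above the sequential time while CPU usage is saturated, that would be consistent with the CPU being the bottleneck rather than a lock.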
I finally had some time to talk to the team about adding the silence feature, and it seems like it is more trouble than it is worth considering the use cases. If you really need that feature, please open another issue and I will push it further. Thank you for your feedback!
Hi,
is there any tool/library that provides the output of xbtop in a machine-readable format? Sadly, I'm having trouble with the "build with docker" script; otherwise I would simply change the source code.
I figure there must be some sort of monitoring support that is cloud-native (or similar).
Thanks!