Closed gokulkgm closed 6 years ago
Why don't you use a monitoring tool like Ganglia or Nagios?
This really is necessary. My fans stopped and the CPU temperature reached almost 100 degrees. I have 50 miners and cannot keep watching all of them. EWBF miner, for example, has a wonderful monitor.
With 50 miners you should really think about https://www.centreon.com/en/solutions/centreon/. You can define your own tests for Windows and Linux systems, and you can write auto-repair methods. With a real monitoring tool that has failover strategies you can automate everything. Getting the fan speed from the miner is only a workaround.
Is Centreon going to tell you if one GPU has dropped hash rate?
Miner software has an API for a reason. Virtually every one except Stak has temperature and fan speeds. It's not that hard.
That's it.
@Grimm2017 Yes, you can write a test that notifies you if the hash rate drops. Adding this to the miner means we would need to add three dependencies just to read the temperature: NVML for NVIDIA, ADL (I think) for AMD, and lm-sensors for the CPU. And even then you would not have any automatic notifications or automatic failover. With Centreon you can check the hash rate via our JSON API, e.g. for the Vega GPUs, and if the hash rate drops you can unmount the GPU and add it again.
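As a rough illustration of the kind of hash rate test described above, here is a minimal Python sketch. It assumes the miner's JSON report contains a `hashrate.threads` list with one list of samples per thread, most recent average first. That layout, the function names, and the threshold are assumptions for illustration only, so check them against the report your miner actually returns.

```python
import json
from urllib.request import urlopen


def fetch_report(url, timeout=5.0):
    """Download and parse the miner's JSON report (the URL is supplied by the caller)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def slow_threads(report, min_hashrate):
    """Return the indexes of threads whose latest hash rate is missing or too low.

    Assumed layout (hypothetical, verify against your miner):
    report["hashrate"]["threads"] == [[latest, older, oldest], ...]
    """
    threads = report.get("hashrate", {}).get("threads", [])
    slow = []
    for index, samples in enumerate(threads):
        latest = samples[0] if samples else None
        # A missing sample is treated as a failure, since a hung GPU reports nothing.
        if latest is None or latest < min_hashrate:
            slow.append(index)
    return slow
```

A monitoring check could call `slow_threads(fetch_report(url), 1500)` on a schedule, with `url` pointing at the miner's HTTP API, and raise an alert or trigger a repair action whenever the returned list is not empty.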
Your competitor added monitoring long ago. Are you worse than them? https://github.com/xmrig/xmrig-nvidia "GPU mining part based on psychocrypt code used in xmr-stak-nvidia."
GPU temperature and fan speed statistics via the API would be helpful for monitoring remote rigs.