Please add a WATCHDOG function so miner can alert when error occurs.

ethereum-mining / ethminer

Ethereum miner with OpenCL, CUDA and stratum support

GNU General Public License v3.0


Please add a WATCHDOG function so miner can alert when error occurs. #274

Closed: fynxer closed this issue 6 years ago

fynxer commented 7 years ago

When something goes wrong with ethminer, we need a WATCHDOG that can alert and/or restart the miner.

Thanks to all of you working on and developing Ethminer; I really appreciate it.

ddobreff commented 7 years ago

For AMD there is still no watchdog implemented. I am working on one, but it's Linux-only.

AndreaLanfranchi commented 7 years ago

See here #97

piotr-dobrogost commented 7 years ago

On Linux, the ideal and fairly easy solution would be to implement systemd's sd_notify interface; see "systemd for Administrators, Part XV":

First of all, to make software watchdog-supervisable it needs to be patched to send out "I am alive" signals in regular intervals in its event loop. Patching this is relatively easy. First, a daemon needs to read the WATCHDOG_USEC= environment variable. If it is set, it will contain the watchdog interval in usec formatted as ASCII text string, as it is configured for the service. The daemon should then issue sd_notify("WATCHDOG=1") calls every half of that interval. A daemon patched this way should transparently support watchdog functionality by checking whether the environment variable is set and honouring the value it is set to.
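The steps quoted above can be sketched in a few lines. This is only an illustration in Python, not ethminer code: it reads WATCHDOG_USEC, halves the interval, and sends "WATCHDOG=1" as a datagram to the Unix socket named in NOTIFY_SOCKET, which is what sd_notify does under the hood. The helper names (watchdog_interval_seconds, sd_notify as a Python function, run_event_loop) are made up for this sketch, and it assumes the service unit sets WatchdogSec= so that systemd exports these variables.

```python
import os
import socket
import time


def watchdog_interval_seconds(environ=os.environ):
    """Return half of WATCHDOG_USEC in seconds, or None if the watchdog is not configured."""
    raw = environ.get("WATCHDOG_USEC")
    if not raw:
        return None
    try:
        usec = int(raw)
    except ValueError:
        return None
    if usec <= 0:
        return None
    # Ping every half of the configured interval, as the quoted text recommends.
    return usec / 2 / 1_000_000


def sd_notify(message, environ=os.environ):
    """Send one notification datagram; return False when not running under systemd."""
    path = environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    if path.startswith("@"):
        # A leading '@' denotes a Linux abstract-namespace socket.
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("ascii"), path)
    return True


def run_event_loop(do_work, iterations):
    """Hypothetical event loop: do one unit of work, then ping the watchdog when due."""
    interval = watchdog_interval_seconds()
    last_ping = time.monotonic()
    for _ in range(iterations):
        do_work()
        if interval is not None and time.monotonic() - last_ping >= interval:
            sd_notify("WATCHDOG=1")
            last_ping = time.monotonic()
```

Because the ping is sent from inside the loop, a hung loop stops pinging and systemd can then act on the missed deadline according to the unit's restart settings.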

DDDanny commented 7 years ago

On Windows you can easily script a solution with perl and ps. The scripts are quite small and will restart ethminer once you detect a (CUDA) error with a perl log parser or a dead process with the ps command.

ykhuat commented 6 years ago

Maybe you can have a look here. It auto-restarts ethminer if there has been no job for 5 minutes, and auto-restarts the system if "CUDA ERROR" is detected. https://bitcointalk.org/index.php?topic=2195527.0
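The two rules described above (no job for 5 minutes, or a "CUDA ERROR" line) can be sketched as a small log classifier. This is not the linked script. The class name LogWatch, the action strings, and especially job_marker are assumptions made for illustration: the text that marks a new job in the miner's output is not given in this thread, so it has to be supplied by the caller.

```python
import time


class LogWatch:
    """Decide what a watchdog should do based on miner log lines (illustrative sketch)."""

    def __init__(self, job_marker, error_marker="CUDA ERROR",
                 job_timeout=300.0, clock=time.monotonic):
        self.job_marker = job_marker      # assumption: substring that appears on each new job
        self.error_marker = error_marker  # matched case-insensitively
        self.job_timeout = job_timeout    # 5 minutes, as in the comment above
        self.clock = clock
        self.last_job = clock()

    def feed(self, line):
        """Process one log line; return "restart-system" on a CUDA error, else None."""
        if self.error_marker.lower() in line.lower():
            return "restart-system"
        if self.job_marker in line:
            self.last_job = self.clock()
        return None

    def check(self):
        """Return "restart-miner" if no job line has been seen within the timeout."""
        if self.clock() - self.last_job > self.job_timeout:
            return "restart-miner"
        return None
```

A caller would feed each line of the miner's output to feed() and call check() periodically, then carry out whatever restart action fits the platform.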

DeadManWalkingTO commented 6 years ago

After #757 (which added the --exit parameter to exit whenever an error occurs), you can use a watchdog.

Here is my ETHminerWatchDogDmW for Windows 7/8/10 [32/64-bit] and Linux (any distribution, any version, any architecture) (#735).

Please check it and give feedback. Thank you!
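For readers wondering why --exit makes a watchdog possible: once the miner exits on error, a supervisor only has to notice that the process ended and start it again. The following is a minimal sketch of that idea, not ETHminerWatchDogDmW itself. The function name supervise and its parameters are hypothetical, and it restarts on any exit because the exit code ethminer returns with --exit is not stated in this thread.

```python
import subprocess
import time


def supervise(command, max_runs=None, delay=5.0, sleep=time.sleep):
    """Run command repeatedly, restarting it each time it exits.

    max_runs=None means restart forever; a number limits the total runs.
    Returns the list of exit codes observed (only reachable when max_runs is set).
    """
    exit_codes = []
    while max_runs is None or len(exit_codes) < max_runs:
        if exit_codes:
            # Short pause between restarts so a crash loop does not spin the CPU.
            sleep(delay)
        # Blocks until the child process exits, then records its exit code.
        exit_codes.append(subprocess.call(command))
    return exit_codes
```

It could be called as supervise(["ethminer", "--exit"]) with your usual mining options added to the list.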

th0ma7 commented 6 years ago

In case it is of interest to anyone, I created one for Linux with AMD video cards. It is probably not too hard to adapt for CUDA crashes if anyone wants to contribute. https://github.com/th0ma7/th0ma7