nicehash / NiceHashMiner

NiceHash Miner

Unexpected hard reboot without BSOD when using NHML #643

Closed Linightz closed 5 years ago

Linightz commented 6 years ago

When running the NiceHash GUI miner or the Legacy miner, my mining rig with 6 GTX 1060 cards randomly and suddenly hard reboots, without any BSOD message. When running other mining software such as Claymore's dual miner or MinerGate, this doesn't happen. I'm certain the hardware has no issue, because this applies to all my mining rigs (dozens or so) and happens only when running NiceHash software. Do you have any idea what might be the cause? Thank you.

Just to add more info: if I disable some of the miners it uses (e.g. sgminer, ethminer), the issue occurs less often (but does not go away), so it might have something to do with the bundled mining software. In that case, is there any more advanced setting or parameter I have to set?

Linightz commented 6 years ago

log.txt Attached is the log of the rig where this happens the most (3 or 4 times a day).

Linightz commented 6 years ago

log.txt log.1.txt Uploaded another log from right after a reboot.

VadimEv commented 6 years ago

Uncheck "run script when CUDA device is lost" in settings. This feature works incorrectly. CUDA device load can be read via nvsmi (nvidia-smi), and it floats around 70-100% for certain algos, so the restart gets triggered.
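To see what load values the cards actually report, utilization can be polled through nvidia-smi. The sketch below is only an illustration: the query flags are standard nvidia-smi options, but `below_threshold` and its threshold are hypothetical and are not taken from NHML's source. It just shows how a healthy card sitting at 72% could trip a naive load check.

```python
import subprocess

# Standard nvidia-smi query flags; the threshold logic further down is hypothetical.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_utilization(output):
    """Parse lines such as '0, 98' into {gpu_index: utilization_percent}."""
    readings = {}
    for line in output.strip().splitlines():
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != 2:
            continue
        try:
            readings[int(fields[0])] = int(fields[1])
        except ValueError:
            # Skip values such as "[N/A]" that are not plain integers.
            continue
    return readings


def read_utilization():
    """Run nvidia-smi once and return the parsed readings."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_utilization(result.stdout)


def below_threshold(readings, threshold):
    """Hypothetical check: GPUs whose load is under `threshold` percent."""
    return sorted(gpu for gpu, load in readings.items() if load < threshold)
```

On a rig with the NVIDIA driver installed, `below_threshold(read_utilization(), 80)` lists the cards currently under 80% load; with readings of 98% and 72%, the second card would be flagged even though it is mining normally.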

Linightz commented 6 years ago

Hi VadimEv, the setting is not checked on any of my rigs. Thanks.

Linightz commented 6 years ago

I doubt it's the miners now, because it still happens with only Claymore left enabled. This problem is killing me. Can anyone help?

VadimEv commented 6 years ago

Bad news for you: if it restarts regularly without a BSOD, it means your PSU is not good enough (by that I mean you need something like a Corsair RM850x for your setup).

What you can do: use Afterburner (I really hope you already did) and undervolt your GPUs by lowering the power limit to 60-80%. For example, one of my rigs has 7x 1060 and reaches 175 MH/s on Claymore dual with Sia at 100% power, at 70+ °C. Is it worth it? Let's see: 167 MH/s at 60% power, at no more than 55 °C. It will take some time to find the right driver and settings. If you use retail GPUs (not the mining-only P106), use the latest driver plus the latest Afterburner, and underclock them.

Also, be sure to check the riser connections and power cables for overheating. A rule of thumb: if you can hold your fingers on one for 10+ seconds, it is under 60 °C; 5 seconds, under 70 °C; under 1 second, about 80 °C, and that is the limit of operation.

Claymore dual mode is by far the most power-consuming algo. You literally push your cards to the very limit of their operation, and this requires a pretty solid PSU (preferably Gold or better, from the premium segment).

Also remember to set the system to restart automatically (Startup and Recovery settings) and add your miner of choice to autostart (use the built-in option, the Task Scheduler, or a shortcut added via shell:startup). Keep in mind that if it restarts, say, once an hour, something is wrong; once a day is OK.
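The 175 vs 167 figures above can be turned into a quick trade-off check. This is a rough sketch that assumes actual power draw scales with the Afterburner power-limit percentage, which is a simplification (the limit is a cap, not a measurement); the function name is made up for illustration.

```python
def compare_power_limits(full_rate, full_limit, capped_rate, capped_limit):
    """Return (fraction of hashrate retained, relative hashrate per unit of power limit).

    Assumption: power draw is proportional to the power-limit setting.
    """
    retained = capped_rate / full_rate
    efficiency_gain = (capped_rate / capped_limit) / (full_rate / full_limit)
    return retained, efficiency_gain


# Figures quoted in the comment: 175 MH/s at 100% power, 167 MH/s at 60% power.
retained, gain = compare_power_limits(175, 1.00, 167, 0.60)
print(f"hashrate retained: {retained:.1%}")               # 95.4%
print(f"hashrate per unit of power limit: {gain:.2f}x")   # 1.59x
```

Under that assumption, dropping the limit to 60% costs under 5% of the hashrate while each unit of power yields roughly 1.6 times as much hashing.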

Linightz commented 6 years ago

Hi VadimEv, thanks for the reply, but here comes the weird thing: when I run Claymore dual mode on its own, the rig is totally fine and can run a couple of weeks straight without restarting even once. I've also got a watt meter to check the power consumption: it's around 900 W, while I have a 1200 W Platinum PSU.

However, when I run NiceHash, the power consumption is lower than in Claymore dual mode (even when it's running Claymore dual mode under NiceHash, because -dcri is left at its default; I set it to 50 when I run Claymore myself).
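For reference, the headroom implied by the watt meter reading can be worked out as below. A wall meter measures AC input while a PSU's rating refers to DC output, so the real load on the PSU is somewhat lower than the raw ratio; the 0.92 efficiency used in the second call is an assumed value for illustration, not a measured one.

```python
def psu_load_fraction(wall_watts, rated_watts, efficiency=1.0):
    """Fraction of the PSU's rated output in use, given the draw at the wall.

    `efficiency` converts AC input to DC output; 1.0 gives a worst-case estimate.
    """
    if rated_watts <= 0:
        raise ValueError("rated_watts must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return wall_watts * efficiency / rated_watts


print(f"{psu_load_fraction(900, 1200):.0%}")        # 75%
print(f"{psu_load_fraction(900, 1200, 0.92):.0%}")  # 69%
```

Either way the PSU sits well below its rating on average, although a meter like this does not show short transient spikes.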

VadimEv commented 6 years ago

Yeah, since the connectivity issue I use Claymore myself most of the time. In this case we have to summon @DillonN here; looks like 1.8.1.5 is a bag of bugs.

spock9 commented 6 years ago

I also have the same issue. My system has only two GPUs and it was working fine with the older versions. 1.8.1.5 looks like the cause of these shutdowns.

DillonN commented 6 years ago

@Linightz there isn't anything in your logs that looks fishy, and it looks like you're mining ClaymoreDual most of the time which should be most stable.

You mentioned disabling sgminer. I'm guessing that's for another rig (with AMD cards), but sgminer can cause this issue if you use Remote Desktop Protocol (check the front-page README for a little more info on this). For all-NVIDIA rigs, though, that certainly wouldn't be a problem.

If the cause of the problem is GPU related you might be able to find some errors in the Windows Event viewer shortly before the shutdown
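One way to act on this is to export the System log and keep only the entries from the last few minutes before each reboot. The sketch below assumes the events have already been loaded as `(timestamp, source, message)` tuples; that layout, the function name, and the sample entries are all placeholders, not real log data.

```python
from datetime import datetime, timedelta


def events_before(events, reboot_time, window_minutes=10):
    """Return events logged within `window_minutes` before `reboot_time`, oldest first."""
    start = reboot_time - timedelta(minutes=window_minutes)
    recent = [event for event in events if start <= event[0] < reboot_time]
    return sorted(recent, key=lambda event: event[0])


# Placeholder data for illustration only.
reboot = datetime(2018, 1, 1, 12, 0)
sample = [
    (datetime(2018, 1, 1, 11, 20), "source-a", "placeholder message"),
    (datetime(2018, 1, 1, 11, 58), "source-b", "placeholder message"),
    (datetime(2018, 1, 1, 11, 55), "source-c", "placeholder message"),
    (datetime(2018, 1, 1, 12, 1), "source-d", "logged after the reboot"),
]
for when, source, message in events_before(sample, reboot):
    print(when.isoformat(), source, message)
```

Display-driver errors that show up repeatedly inside that window would point toward a GPU or driver cause rather than the PSU.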

You said you were able to get very good stability when running just ClaymoreDual outside of NHML. Have you tried running only that (and with the same settings) within NHML?

There was very little changed in 1.8.1.5 from 1.8.1.4, and nothing that should cause this kind of problem. The only suspect would be the upgrade of ccminer_nanashi used for Lyra2REv2. @spock9 if you do not have an issue with 1.8.1.4, you could try disabling that miner-algo combo specifically; maybe the upgrade is causing issues for you.

Linightz commented 6 years ago

@DillonN for the AMD rig, the system hard reboots even when I just run the benchmark on sgminer (I only use TeamViewer, so it shouldn't be the RDP problem). I'll try running only Claymore within NH on my NV rigs, thanks. log.txt Attached is the log of the AMD rig that just hard rebooted while benchmarking.

S74nk0 commented 5 years ago

This issue relates to an old NHM version/build. Please use the latest version and open a new issue for bugs and/or features.