nicehash / NiceHashMiner

Pressing "STOP" crashes display driver on some algorithms #74

Closed Frageron closed 5 years ago

Frageron commented 7 years ago

I'm running NHML 1.8.0.2 on Windows 7 x64 with 5x 1070 GPUs, driver 384.76.

When I press "STOP", or when NHML switches away from one of these algorithms: Keccak, Nist5, NeoScrypt, DaggerHashimoto (any dual), the display driver crashes ("Display driver stopped responding and has recovered") and I have to restart my system to get it working again.

Now I'm running only the Lbry and Equihash (EWBF) algorithms; they work perfectly, including switching from one to the other.

This problem has been following me since NiceHashMiner 1.7.5.13 (and through NHML 1.8.0.0 and NHML 1.8.0.1). I tried switching to older drivers, but it did not help.

p1r473 commented 7 years ago

My system freezes as well when I press stop

Frageron commented 7 years ago

My system does not freeze; it just loses 1 (sometimes 2) GPUs, so the others can't mine (the algorithms give errors). Only a system restart gets them all working again.

DillonN commented 7 years ago

I've had this problem with some miners before, usually after an update to the drivers or the miner. For me it would happen even when running the miners manually, so there wasn't anything I could do in NHML to help other than disable them. Is it the same for you? If you have any overclocks/undervolts, trying without them might help as well.

Frageron commented 7 years ago

I have tried without overclock, it's the same. Thank you anyway.

lmlim commented 7 years ago

Undervolting usually gives me the same problem. I have to zero out the undervolt before stopping.

MonaxDK commented 7 years ago

I occasionally have this problem. It happens because, during mining, the memory frequency is reduced below the nominal value (NVIDIA), so in MSI Afterburner you have to raise the memory frequency just to reach the nominal frequency. When you press the stop button, the overclocked cards do not have time to lower their frequencies, and for a while (1-2 seconds) they run at very high frequencies in 2D mode. As far as I understand, this is not an NHM issue. A possible solution may be to somehow force the cards down to their 2D frequencies before the miner fully stops.

sebeksd commented 7 years ago

Similar issue here. This is my first rig and the problem has existed from the very beginning, so I thought it was a hardware issue, but now I think it is not. When I enable all algos, my rig works for 15 min to 2 h (the time depends on how often it switches). When I enable only EWBF, my rig mines for 24 h without any problem. I also tested a config with only Keccak and EWBF enabled and watched the monitor when the switch from Keccak to EWBF occurred: exactly when the EWBF window appeared, MSI Afterburner reported that the GPU connection was lost. I also tested mining Keccak only: 24 h without any problem. Stock clocks or OC seem to have no effect; the problem occurs about as often either way.

My rig contains 3x Aorus 1080Ti on the newest drivers (384.94) and Windows 7 x64. The problem also occurred on Win 10, where it was more frequent, but I think it now happens less often because of fix 4e04b28.

Sav87 commented 7 years ago

It's all down to the miner being stopped the wrong way (kill) instead of with Ctrl + C.
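A minimal sketch of the distinction drawn here, written in Python rather than taken from NHML's own code: ask the miner process to exit with an interrupt signal, and fall back to a hard kill only if it does not exit in time. The name `stop_gracefully` and the timeout value are illustrative assumptions. On Windows the sketch sends Ctrl+Break rather than Ctrl+C, which assumes the child was started with `CREATE_NEW_PROCESS_GROUP` and that the miner handles that event.

```python
import signal
import subprocess
import sys


def stop_gracefully(proc, timeout=10.0):
    """Ask a child process to exit via an interrupt; kill only as a last resort.

    Returns True if the process exited on its own, False if it had to be killed.
    `proc` is a subprocess.Popen object (hypothetical stand-in for a miner).
    """
    if proc.poll() is not None:
        return True  # already exited, nothing to do

    if sys.platform == "win32":
        # Assumption: the child was launched with
        # creationflags=subprocess.CREATE_NEW_PROCESS_GROUP.
        proc.send_signal(signal.CTRL_BREAK_EVENT)
    else:
        proc.send_signal(signal.SIGINT)  # same signal a terminal Ctrl+C sends

    try:
        proc.wait(timeout=timeout)
        return True
    except subprocess.TimeoutExpired:
        # The process ignored the interrupt; force-terminate it.
        proc.kill()
        proc.wait()
        return False
```

Whether a given miner actually releases the GPU cleanly on an interrupt is something to verify per miner; the sketch only shows the ordering of "ask first, kill later".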

sebeksd commented 7 years ago

So I was able to pinpoint my problem. It took me two weeks and many tests, some of which involved changes to the NHML source code. As @MonaxDK suggested, I also tested the presumed problem with memory clocks staying high while the rest of the clocks and voltages drop. I tested this by enabling KBoost so that the clocks stayed high all the time, but this didn't help.

It seems that the problem is related only to the Keccak algo/miner. With this algo disabled, my rig worked for more than 24 h and switched algos multiple times. Because all the problems seem to be related to algo switching, I made a test case to check whether a single algo is stable. The test is easy and fast to run, and its results are consistent. It looks like this: set only one algo on all GPUs (in my case 3x Keccak), press start in NHML and wait until Speed is different from 0 (in NHML, not in the miner window), then press stop, wait until the miner window closes, press start again, and repeat that a few times (e.g. 10). In my case Keccak causes my GPUs to lose connection after 2-4 start/stop cycles. Lyra2REv2 and Lbry seem to be stable on my rig.
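The start/stop test described above could be automated roughly as follows. This is a sketch in Python under stated assumptions, not part of NHML: the four callables (`start`, `stop`, `is_hashing`, `gpus_ok`) are hypothetical hooks that the reader would have to implement for their own rig, for example by driving the miner process and checking whether every GPU is still visible.

```python
import time


def run_start_stop_cycles(start, stop, is_hashing, gpus_ok, cycles=10,
                          poll_interval=1.0, start_timeout=120.0,
                          sleep=time.sleep):
    """Repeat start -> wait for non-zero speed -> stop.

    Returns the 1-based cycle number after which a GPU dropped out,
    or None if all cycles completed with every GPU still healthy.
    All four callables are hypothetical, user-supplied hooks.
    """
    for cycle in range(1, cycles + 1):
        start()

        # Wait until the miner reports a non-zero speed.
        waited = 0.0
        while not is_hashing():
            if waited >= start_timeout:
                stop()
                raise TimeoutError(f"no hashrate in cycle {cycle}")
            sleep(poll_interval)
            waited += poll_interval

        stop()

        # After the stop, check whether every GPU is still reachable.
        if not gpus_ok():
            return cycle
    return None
```

Running this once per algorithm gives a comparable number for each one (the cycle at which the first GPU was lost, or `None` for a stable algorithm).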

Sav87 commented 7 years ago

I found a possible solution to the problem in the code of the function bool signalCtrl(uint thisConsoleId, uint dwProcessId, CtrlTypes dwCtrlEvent).

rapra commented 7 years ago

Hi, I had the same issue: occasional driver crashes when changing miners and when pressing the stop button, and also when closing the console window with the X button. After I forced the P0 state on my GTX 1060 with the tool nvidiaProfileInspector 2.1.3.6 (section 5, disable force P2 state for CUDA), the memory clock stopped dropping by 200 MHz on miner start. But the driver still crashed after that, even after downclocking GPU/memory, decreasing TDP, etc.

Today I swapped my GPUs: from GTX 1060 first (main PCIe x16 slot) and AMD FirePro second (riser) to AMD first and GTX second. When the NVIDIA card crashed again, I also started seeing error messages in the Claymore miner GPU monitor lines, and it stopped showing the fan speed. I use Afterburner for overclocking, and it freezes when the NVIDIA card crashes (after a lot of tries on the frozen desktop, when I could get the Afterburner window to show, it reported no voltage detected). Today I also saw the message 'connection with gpu lost'.

SOLVING: After a reboot, once Afterburner had started and set the clocks/TDP etc., I closed Afterburner (fan control stopped, but all the other settings applied by Afterburner stayed in place), and after that the driver stopped crashing. Now I can overclock my GPU up to +525 mem without crashes (-400 gpu, +525 mem, 60% tdp, 70 deg temp limit); on Claymore it gives a stable 23.8 Mh/s for hours. The Claymore miner shows 59 deg, 29% fan. I have also stopped using NiceHash, because the new versions have GPU monitoring added and I don't want to risk more crashes. The system now runs without any monitoring or clock control except the monitoring in the Claymore miner.

RECIPE: If you set clocks for your GPUs, close the program that sets the clocks, then start mining. It could help with the crashes.

S74nk0 commented 5 years ago

Issue related to old nhm version/build. Please use latest version and open a new issue for any bugs or feature requests.