Open leowrd opened 3 years ago
Same here. It started crashing daily a couple of days ago, and now it's crashing hourly on 0.20.3, 0.20.1 and 0.19.14 as well. Could it have something to do with epoch 411?
Same for me, same error, but I'm on an older NVIDIA driver version, 466.11, and the rejected percentage has gone crazy over the last few days.
This is getting weirder... For me it's crashing on X79 and B460 chipsets, but still stable on C422 and H310... If there is even a correlation to the chipset...
I switched miners and I'm now using PhoenixMiner; apparently that solved the problem.
My computer screen goes black for a few seconds, then comes back with this error, and soon after it shows a GPU CRASH ERROR message.
Can you provide the exact GPU you have? For example, I have a Zotac RTX 3060 OC 12GB and I've managed to run it at 48.5MH/s without crashes so far.
RTX 3060 Gigabyte Vision OC
Using OC core: -250, memory: 1300
Try to increase GPU voltage +20% in AB. Had the same issue and now running for days.
I have the 470 driver on a Zotac Twin Edge OC: -207 core +1192 mem 80% TDP 20hs daily 48.5MH/s atm.
OC isn't the issue here, I've been running my cards stable with the given settings for WEEKS, now it's suddenly crashing constantly.
So upon further investigation, my cards are stable in different systems, but when running three of them on the same board, one of them fails with an nvlddmkm \Device\Video12 driver crash after a random amount of time.
Here's a piece of the log:
TREX: Can't find nonce with device [ID=1, GPU #1], cuda exception in [StreamContext<struct search_results,struct Ethash::KernelLaunchTag>::synchronize, 52], CUDA_ERROR_UNKNOWN
NVML: can't get GPU #0, error code 999
(...)
ERROR: Can't start miner, cuda exception in [check_if_cuda_really_exists, 215], CUDA_ERROR_INVALID_HANDLE
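For anyone triaging a longer log of the same shape, here is a rough Python sketch that tallies the CUDA error codes and the GPU indices mentioned on failing lines. The function name `summarize` and the two regular expressions are my own assumptions based only on the three lines quoted above; they are not part of T-Rex, and real logs may use other formats.

```python
import re
from collections import Counter

# Sample lines copied from the log excerpt above (the elided "(...)" part is left out).
LOG = r"""TREX: Can't find nonce with device [ID=1, GPU #1], cuda exception in [StreamContext<struct search_results,struct Ethash::KernelLaunchTag>::synchronize, 52], CUDA_ERROR_UNKNOWN
NVML: can't get GPU #0, error code 999
ERROR: Can't start miner, cuda exception in [check_if_cuda_really_exists, 215], CUDA_ERROR_INVALID_HANDLE"""

# Assumed patterns, inferred from the excerpt only.
CUDA_ERR = re.compile(r"\bCUDA_ERROR_[A-Z_]+\b")
GPU_ID = re.compile(r"GPU #(\d+)")


def summarize(log_text):
    """Count CUDA error codes and the GPU indices named on failing lines."""
    errors = Counter()
    gpus = Counter()
    for line in log_text.splitlines():
        codes = CUDA_ERR.findall(line)
        errors.update(codes)
        # Treat a line as failing if it carries a CUDA code or an NVML error code.
        if codes or "error code" in line:
            gpus.update(int(n) for n in GPU_ID.findall(line))
    return dict(errors), dict(gpus)


if __name__ == "__main__":
    error_counts, gpu_counts = summarize(LOG)
    print("CUDA errors:", error_counts)
    print("GPUs named on failing lines:", gpu_counts)
```

If one GPU index dominates the tally across many crashes, that points at a specific card or slot; if the indices are spread around, the board, risers or power delivery are more likely suspects.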
What I've already tried:
Still no joy. Up next on the list to try is MOSFET cooling, the Sabertooth X79 MOSFETs run pretty hot.
Finally, after a couple of days of debugging and hair-pulling, I managed to figure out the solution. It turns out my issue was heat accumulation in the MOSFETs: the board was running on top of its box, so the underside wasn't getting any breeze and all the heat was trapped. Blowing on it with a fan didn't help much either; I had to raise it so the bottom got airflow. Let that be a lesson: do not choke the bottom of your board, as it can slowly bake itself.
Also, sorry for the noise as in hindsight this issue doesn't seem to be related to T-Rex miner software at all.
Thank you for sharing your experience. I'm struggling with a 3060 crashing after I start the mining program. I've tried different miners and different coins, same issue. I will try your solution.
First time posting...been reading for the past 5hrs for help...hopefully here I can find it.
I have been running 4 RTX 3060 12GB (3 Zotac Twin Edge OC & 1 Asus Tuf Gaming). For the most part unless I push hard I can mine on unmineable without any issues.
I tried dual and single mining by switching to NiceHash (using T-Rex) and the GPUs are unstable, constantly crashing within 1 min. Mining with T-Rex algo kawpow also crashes immediately: I get a blue screen, then the PC crashes and restarts.
So I had to go back to mining with T-Rex algo ethash, and I'm only getting about 27MH/s per card.
Help on:
My system info: Win 10 64 bit
https://prnt.sc/129h3ok