trexminer / T-Rex

T-Rex NVIDIA GPU miner with web control monitoring page

WARN: NVML: can't get GPU #1, error code 15 error in trex miner as well as phoenixminer #642

Open Headsimon opened 3 years ago

Headsimon commented 3 years ago

Guys, I have two GPUs, a 3060 Ti Founders Edition and a Gigabyte 3060 Ti. The Gigabyte is the older card; I mined on it for 3 months with no issues. However, after mounting it on a riser, an error comes up at random times, sometimes after 2 hours, sometimes after 2 days. I was using PhoenixMiner. The error is "hwmc GPU2: unable to get fan speed - GPU is lost (15)", and then it restarts the PC. The riser I use is "https://www.amazon.in/PiPlusTM-VER009S-Powered-Adapter-Extension/dp/B08YXHPZXL/ref=sr_1_4?dchild=1&keywords=piplus+riser&qid=1632508986&sr=8-4". Please check. I am a newbie, so I don't know much about the technical side. I have powered the riser and the GPU from separate PCIe cables from the PSU. I will try increasing the fan speed, but the issue is that it runs nonstop for a day or two and then suddenly I get this error every 2 hours and the miner reboots. Today it restarted the PC at 6:57, 8:57, 10:57 and so on. Here is one log file for reference:

2021.09.24:22:57:08.377: eths Eth: New job #efefc125 from ssl://asia1.ethermine.org:5555; diff: 4295MH
2021.09.24:22:57:10.022: hwmc GPU2: unable to get fan speed - GPU is lost (15)
2021.09.24:22:57:10.065: main GPU1: 68C 72% 140W, GPU2: 67C GPUs power: 139.9 W; 428 kH/J
2021.09.24:22:57:11.320: main Eth speed: 59.923 MH/s, shares: 184/1/0, time: 1:58
2021.09.24:22:57:11.320: main GPUs: 1: 59.923 MH/s (90) 2: 0.000 MH/s (95)

2021.09.24:22:57:15.871: main 1:58 9/24 22:57 **
2021.09.24:22:57:15.871: main Eth: Mining ETH on ssl://asia1.ethermine.org:5555 for 1:16
2021.09.24:22:57:15.871: main Eth: Accepted shares 184 (1 stales), rejected shares 1 (0 stales)
2021.09.24:22:57:15.871: main Eth: Incorrect shares 0 (0.00%), est. stales percentage 0.54%
2021.09.24:22:57:15.871: main Eth: Maximum difficulty of found share: 2716.9 GH (!)
2021.09.24:22:57:15.871: main Eth: Average speed (5 min): 112.418 MH/s
2021.09.24:22:57:15.871: main Eth: Effective speed: 111.47 MH/s; at pool: 110.87 MH/s
2021.09.24:22:57:15.871: main
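To check whether the "GPU is lost" error really recurs on a fixed schedule (such as every 2 hours), the log file can be scanned for those lines. Below is a minimal Python sketch, assuming the log file has one entry per line in the timestamp format shown in the excerpt above; the function names `find_gpu_lost_events` and `gaps_between` are made up for illustration and are not part of any miner.

```python
import re
from datetime import datetime

# Matches lines like (format assumed from the excerpt above):
# 2021.09.24:22:57:10.022: hwmc GPU2: unable to get fan speed - GPU is lost (15)
LOST_RE = re.compile(
    r"^(?P<ts>\d{4}\.\d{2}\.\d{2}:\d{2}:\d{2}:\d{2}\.\d{3}): "
    r"hwmc GPU(?P<gpu>\d+): .*GPU is lost \((?P<code>\d+)\)"
)


def find_gpu_lost_events(log_text):
    """Return a list of (timestamp, gpu_index, error_code) for each 'GPU is lost' line."""
    events = []
    for line in log_text.splitlines():
        match = LOST_RE.match(line.strip())
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y.%m.%d:%H:%M:%S.%f")
            events.append((ts, int(match.group("gpu")), int(match.group("code"))))
    return events


def gaps_between(events):
    """Return the time differences between consecutive events."""
    times = [event[0] for event in events]
    return [later - earlier for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    # Two lines copied from the log excerpt in this post.
    sample = (
        "2021.09.24:22:57:08.377: eths Eth: New job #efefc125 from "
        "ssl://asia1.ethermine.org:5555; diff: 4295MH\n"
        "2021.09.24:22:57:10.022: hwmc GPU2: unable to get fan speed - GPU is lost (15)\n"
    )
    for ts, gpu, code in find_gpu_lost_events(sample):
        print(f"{ts} GPU{gpu} lost, error code {code}")
```

Running `gaps_between` over the events collected from several log files would show whether the failures are evenly spaced, which is worth knowing before blaming the riser alone.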

After this, mining continues for about 30 seconds and then the PC restarts with a Windows failure saying NO CARD DETECTED. Is the problem with the riser? I tried swapping the positions of the two GPUs and the problem still persists. I have now switched to T-Rex miner. My OC settings are: power limit 70; core clock, tried a range of 0, -200, -300, -502; mem clock 900, 950, 1000, 1100, 1150, 1200; fan speed set so that the temperature stays around 67 C. After swapping the GPUs, I think the problem is isolated to the riser and its power cables. I use an 850 W Antec HCG Gold PSU. The power cables for the riser and the GPU are separate.

I think it's a defective riser, but I don't understand why the problem vanishes after a restart. In fact, I ordered this riser yesterday: https://www.amazon.in/Tapia-V009S-Plus-Indicator-Extension/dp/B097R24JFB/ref=sr_1_5?dchild=1&keywords=riser&qid=1632540537&sr=8-5. Should I cancel it and reorder? This is another riser, a 010S, that I could find: https://www.vedantcomputers.com/riser-extender-ver0010s-plus-8-capacitor-ultra-stable-extra-led-16-into-to%20-1-into-power-pcie-80cm-usb-3point0-cable?search=riser.

Andr65cmd commented 2 years ago

Did you find any solution?

Headsimon commented 2 years ago

> Did you find any solution?

Yes, I changed the riser and it works flawlessly. However, for your reference, when I switched to T-Rex miner and added the following command-line options, it ran with far fewer errors.

--gpu-init-mode 1 --keep-gpu-busy --no-nvml
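For anyone who launches the miner from a script, here is a small Python sketch of how these three options might be appended to an existing T-Rex argument list without duplicating them. The helper name `with_workaround` and the base command in the example are placeholders for illustration; the pool, wallet and any other options of a real setup are left out.

```python
def with_workaround(base_args):
    """Return a copy of base_args with the three options from this thread appended.

    Options already present in base_args are not added a second time.
    """
    args = list(base_args)
    if "--gpu-init-mode" not in args:
        args += ["--gpu-init-mode", "1"]
    for flag in ("--keep-gpu-busy", "--no-nvml"):
        if flag not in args:
            args.append(flag)
    return args


if __name__ == "__main__":
    # Placeholder base command; a real one also needs pool and wallet options.
    base = ["t-rex.exe", "-a", "ethash"]
    print(" ".join(with_workaround(base)))
```

In my case this only reduced the errors; replacing the riser is what actually fixed the problem.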