Open mikoal1 opened 2 years ago
I'm having this same issue on my Hive rig. Previous uptime was 28 days with a different miner. I'm not experiencing any errors, the rig just locks up completely and cannot be remotely accessed to attempt restarting. GPU fans are still spinning but the miner is not running. I've tried lowering overclocks and using previous values that I know have been stable with no change in stability.
Yeah, that's pretty much the same problem.
I've been watching the miner when it freezes.
The logs don't produce anything too useful.
Same problem here. When will this be fixed?
Hi there, I have the same issue with one of my rigs. I have three rigs with Nvidia 3070 cards from different brands:
1) 6x 3070 Palit Gamerock
2) 6x 3070 EVGA FTW3
3) 6x 3070 bulk Nvidia cards without a brand (they look like Gainward GPUs without the Gainward name; this is my latest purchase)
Rigs 1 and 2 run T-Rex 0.25.12 without issue and have been mining for 3 days.
The 3rd rig has the issue. T-Rex just freezes for no apparent reason, as @BlaqkAugust describes, while running T-Rex 0.25.12. I'm now testing this rig with T-Rex 0.25.9 and it works well.
Maybe GPU quality is what makes T-Rex 0.25.12 freeze?
All GPUs are Non LHR
Show your OC settings.
I'm encountering the same problem on one of my rigs, while another rig works just fine. The rig with the problem has cards from two different manufacturers (Zotac and Gainward); all cards are 3070 Ti. Whenever code 999 appears, Windows gives me a BSOD with one of several different error codes, such as PAGE_FAULT_IN_NONPAGED_AREA, VIDEO_TDR_FAILURE, UNEXPECTED_KERNEL_MODE_TRAP, etc.
What I tried that didn't work.
Hello! I had the same problems too: frequent crashes and unexpected behaviour when using the 470.103 drivers on Linux. Updating to the latest 510.60.02 solved all problems. Rock solid for 48 hours since then.
Did you mean HiveOS or just a regular Linux OS such as Ubuntu or CentOS? I will try moving to HiveOS...
Just regular Linux, although I guess it'll be the same problem under HiveOS...
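For anyone comparing driver versions between a stable rig and a crashing one, here is a minimal Python sketch. It assumes `nvidia-smi` is on the PATH and uses its `--query-gpu=driver_version` option. The helper names are made up for illustration, and `510.60.02` is just the version mentioned above, not a verified fix.

```python
import subprocess


def parse_driver_version(text):
    """Turn a driver string such as '510.60.02' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(installed, minimum):
    """Compare two dotted version strings numerically, part by part."""
    return parse_driver_version(installed) >= parse_driver_version(minimum)


def installed_driver_version():
    """Ask nvidia-smi for the driver version (first GPU line is enough)."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()[0].strip()
```

On a rig, `meets_minimum(installed_driver_version(), "510.60.02")` would report whether the installed driver is at least that version. As the later posts in this thread show, a newer driver alone did not stop the crashes for everyone.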
Update! After 50 hours or so, it crashed the same way :(
20220424 17:36:05 TREX: Can't find nonce with device [ID=9, GPU #9], cuda exception: CUDA_ERROR_LAUNCH_FAILED, try to reduce overclock to stabilize GPU state
20220424 17:36:05 WARN: Miner is going to shutdown...
Where ID 9 is a 3080 with great cooling: [T:45/68C, P:234W]
Back to gminer, I guess. No dual mining, but rock solid from the past :(
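If you want to see whether it is always the same card that throws the CUDA exception, a small log parser can help. This is only a sketch: the regular expression is derived from the single log line quoted above, so other T-Rex log formats may not match, and the function names are hypothetical.

```python
import re
from collections import Counter
from datetime import datetime

# Pattern based only on the log line quoted in this thread (an assumption).
CRASH_RE = re.compile(
    r"^(?P<ts>\d{8} \d{2}:\d{2}:\d{2}) .*"
    r"\[ID=(?P<dev>\d+), GPU #(?P<gpu>\d+)\], "
    r"cuda exception: (?P<err>\w+)"
)


def parse_crash_line(line):
    """Return a dict describing a CUDA exception line, or None if no match."""
    match = CRASH_RE.match(line)
    if match is None:
        return None
    return {
        "time": datetime.strptime(match["ts"], "%Y%m%d %H:%M:%S"),
        "device_id": int(match["dev"]),
        "gpu": int(match["gpu"]),
        "error": match["err"],
    }


def count_crashes_by_gpu(lines):
    """Count CUDA exception lines per GPU number across many log lines."""
    hits = (parse_crash_line(line) for line in lines)
    return Counter(hit["gpu"] for hit in hits if hit is not None)
```

Feeding the lines of several log files into `count_crashes_by_gpu` gives a per-GPU tally, which makes it easier to tell a single unstable card apart from a miner-wide problem.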
Hi guys, updating.
When I ran the 3rd rig on T-Rex 0.25.9 it was working fine until I installed the new Hive OS upgrade. The Linux kernel changed from 5.4.0 to 5.10.xx (I don't remember the last number) and it was a nightmare: crashing and freezing again for no apparent reason, with the rig showing as OFFLINE on the web.
- I decided to clean up all partitions on my SSD, delete the volume and reformat the SSD
- Flashed and reinstalled the Hive OS image with Balena Etcher, but with Linux kernel 5.4.0
- Updated the Nvidia driver to 510.60.02
- Reconnected all hardware (risers, GPUs, cables)
- Updated to 0.6-214@220407
Now I'm mining again without crashing or freezing, at least for now. It's been one hour.
In my case:
- T-Rex 0.25.12 with Linux 5.4.0 and 5.10.xx: crashing and freezing.
- T-Rex 0.25.9 with Linux 5.10.xx: crashing and freezing too.
- T-Rex 0.25.9 with Linux 5.4.0: 1 hour and 15 minutes so far with no crashing or freezing.
I hope it helps.
Maybe I found the answer.
DON'T USE T-REX 0.25.12 on GPUs from low-quality brands (Zotac, Gainward, etc.). It crashes and freezes regardless of the Linux kernel (5.4.0 or 5.10.xx) or the Hive OS image.
Use another miner such as Gminer, LoLminer or NBminer and it will be OK and mine again without crashing.
I have Asus and EVGA along with MSI and it still freezes, so I don't think it's the brand.
I'm back on 0.25.9 and it runs fine. I think version 12 crashes my machine in a few areas:
1) The LHR keeps getting detected but does not lower, so I increased the interval to 240 minutes. I see that in the version 13 test they did the same, increasing the interval to 120 minutes.
2) Even after I changed the interval to 240 minutes on version 12, it would still crash. For some reason it cannot accept the same OCs as the previous versions. I think that either lowering the OC or starting at a low LHR value (77% or something) may help, but I'll have to run it for longer than 24 hours to know. As of right now, I'm just going back to version 9. Much less headache.
They released a test version trying to fix the Error 999 and crashes
Version 0.25.13 (TEST)
Bug fixes:
IMPORTANT: required drivers are 512.xx on Windows, and 510.xx on Linux. The unlocker will not work with older drivers.
Windows 512.15 driver download: https://www.nvidia.com/Download/driverResults.aspx/187304/en-us
HiveOS driver update command: nvidia-driver-update
Known issues: error 999 is still present, we'll be working on it in the next version.
Looking forward to your feedback in test-discussion. Thanks for testing!
Download:
Linux: https://trex-miner.com/download/test/t-rex-0.25.13-linux.tar.gz [SHA-256 checksum: 2c423d23c43c81939fce7357d0c64874f44f7ce2069815d0d6985c8784fa4a64]
Windows: https://trex-miner.com/download/test/t-rex-0.25.13-win.zip [SHA-256 checksum: 131a9ec2d4d2294587348427285fbca10cae2f58da500cc41711385b61c95792]
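Since the release notes publish SHA-256 checksums, it is worth checking a downloaded archive before running it. A minimal sketch using Python's standard `hashlib` follows. The function names are illustrative, and the local file path is whatever you saved the download as.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """True if the file's SHA-256 equals the published hex checksum."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For the Linux archive, that would be something like `matches_checksum("t-rex-0.25.13-linux.tar.gz", "2c423d23c43c81939fce7357d0c64874f44f7ce2069815d0d6985c8784fa4a64")`, assuming the file sits in the current directory under that name.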
That explains my problem then, since I have a rig with Colorful-brand cards (all the same) that works absolutely fine with 0.25.12, but I have big trouble with another rig that consists of Zotac and Gainward cards...
I don't have LHR cards in my rigs. Maybe it's different between non-LHR and LHR cards.
The rig with Palit Gamerock cards running T-Rex 0.25.12 is OK! The EVGA FTW3 rig running 0.25.12 is OK too!
Both are still mining; today is day 5.
The last rig's cards have no brand, but judging by the serial number they are Gainward GPUs. If I run T-Rex 0.25.12 on this one, the T-Rex miner software freezes in less than an hour and the rig goes offline on the Hive web. Maybe it only crashes/freezes on GPUs like Gainward or Zotac, which is where I found the issue. With other GPUs I don't have trouble of any kind.
Have you tried running just one or two GPUs? Mine works just fine whether it's one or four GPUs of the same brand (Gainward/Zotac), but whenever I mix in others, a BSOD suddenly appears 🤪🤪🤪
Yeah, the fewer GPUs, the more stable. I've since switched to version 9 and it's been stable with all my cards.
After the following steps the miner seems to have stabilized and has now been running fine for over 24 hours. Version 13 is surprisingly stable; once tuned, it doesn't trigger LHR.
After the above, it is night and day compared to versions 10-12; it is comparable to 9's stability. I will have to leave it running for a week to see how stable it is, but as of now it is running fine. Version 13 did the trick.
I'll leave this open for another week and close it if there are no more issues.
Just an update:
Rig 1 = Had error 999, switched over to T-Rex test version 14.
Rig 2 = Miner froze up, needed to kill the process, also swapped to version 14.
Version 13 is more stable than 10-12, but 9 is better so far. Will update on Version 14 after a few days (unless it crashes).
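For the case where the miner process hangs but the machine is still responsive (as on Rig 2), one common approach is to watch how long it has been since the miner last wrote to its log file. The sketch below assumes the miner is configured to write a log file and that a silence of more than a few minutes means a hang. The path, the 300-second threshold and the function names are all placeholders, not T-Rex defaults, and this does nothing for a full system lockup.

```python
import os
import time


def seconds_since_last_write(log_path, now=None):
    """Seconds elapsed since the log file was last modified."""
    current = time.time() if now is None else now
    return current - os.path.getmtime(log_path)


def looks_stalled(log_path, max_silence_s=300, now=None):
    """Heuristic: the miner is considered hung if its log has gone quiet."""
    return seconds_since_last_write(log_path, now) > max_silence_s
```

A cron job or scheduled task could call `looks_stalled(...)` every few minutes and restart the miner when it returns `True`, which at least saves killing the process by hand.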
I also have had crashing issues with my single ASUS ROG STRIX RTX 3090, since version T-Rex 0.25.9 I think. Definitely crashes after a few hours with T-Rex 0.25.12 and T-Rex 0.25.15.
Types of crashes/freezes; these are my findings:
- BSOD: usually from an unstable OC; try lowering it.
- LHR loop: resolved in V14 and V15.
- Error 999: T-Rex's bug. I tried disabling hardware acceleration, but that didn't work. Increasing power to max in the Nvidia control panel still doesn't resolve it.
- Soft lock: you can still move the mouse and use the keyboard, but Trex.exe is bogging everything down. After some searching I found that monitoring software such as HWMonitor may be causing this. When it soft-locks, -1% appears as the GPU usage. After turning off HWMonitor and uninstalling it, the only error that appears is error 999.
- Freeze: black screen, no power to the USB ports, completely frozen. I'm not sure what causes this, perhaps an OC setting. But other OC freezes I've seen would just restart MSI or crash and restart the miner, not cause a complete freeze that requires a power cycle. Reducing the OC seems to work. It's not the RAM, motherboard or PSU, since another miner runs fine.
Another thing I've done is kill every type of update, hoping to improve stability.
After nearly a month of troubleshooting restarts/freezes since V 10 this is the best I can get it.
With the release of NBminer, I swapped over; no issues so far. If the OC is too high, the miner restarts and resets my MSI settings, so I know which one to lower. So far it has been running 18h without any issues, but I won't know until 48h+ whether this is a hardware issue or a T-Rex issue.
I'm starting to think that the way T-Rex accelerates and decelerates the hashes is what's causing my hardware to misbehave.
I'll provide one last update in 48h and close this out.
2 of my 3 rigs end up crashing or freezing when running 0.25.12, usually near the 24 hour mark.
I've tried lowering overclocks, core clocks, etc., but it still freezes. I cooled the room down to get temps to 35/78C, so temperature doesn't seem to be an issue.
I've installed the 511.xx, 512.xx and 472.xx drivers. It still crashes.
However, the miner runs smoothly with 0.25.9.
Errors I've received
When the computer freezes, the fans are still spinning, but the miner isn't running and remote login doesn't work.
I noticed the task manager was showing GPU 0 at 100% when it crashed. Has anyone encountered this issue? What could be causing this?