ethereum-mining / ethminer

Ethereum miner with OpenCL, CUDA and stratum support
GNU General Public License v3.0

ethminer is not stable using 5700xt * 3 gpu #2043

Open blockchainapper opened 3 years ago

blockchainapper commented 3 years ago

Environment

Describe the bug

At startup everything works fine. The log shows:

i 02:58:18 Job: 06d680d5... cn.sparkpool.com [203.107.33.230:3333]
i 02:58:19 Job: be5e36b1... cn.sparkpool.com [203.107.33.230:3333]
i 02:58:19 Job: 2a7b6deb... cn.sparkpool.com [203.107.33.230:3333]
i 02:58:19 Job: 0d98da89... cn.sparkpool.com [203.107.33.230:3333]
m 02:58:19 1:54 A111:R1 148.32 Mh - cl0 49.44 0C 0% A37:R1, cl1 49.44 0C 0% A33, cl2 49.44 0C 0% A41

All three GPU fans are running normally. But after a while, about 30 minutes, the log shows:

i 03:24:50 Job: 7f20e2db... cn.sparkpool.com [203.107.33.230:3333]
i 03:24:50 Job: 3c1c8cec... cn.sparkpool.com [203.107.33.230:3333]
m 03:24:50 2:20 A145:R3 174.78 Mh - cl0 142.14 0C 0% A53:R2, cl1 16.32 0C 0% A39:R1, cl2 16.32 0C 0% A53

The reported hashrate has gone up from 148.32 Mh to 174.78 Mh. A few seconds later, the log shows:

i 03:24:54 Job: 8e8aa851... cn.sparkpool.com [203.107.33.230:3333]
i 03:24:55 Job: 792c572a... cn.sparkpool.com [203.107.33.230:3333]
m 03:24:55 2:20 A145:R3 240.86 Mh - cl0 142.14 0C 0% A53:R2, cl1 49.36 0C 0% A39:R1, cl2 49.36 0C 0% A53

The reported hashrate has gone up again, from 174.78 Mh to 240.86 Mh. A few seconds later, the log shows:

i 03:24:58 Job: 74c697ab... cn.sparkpool.com [203.107.33.230:3333]
cl 03:24:58 cl-2 Job: 74c697ab... Sol: 0x0a6301e0f915abeb
i 03:24:58 **Accepted 44 ms. cn.sparkpool.com [203.107.33.230:3333]
i 03:24:59 Job: 9e1471f9... cn.sparkpool.com [203.107.33.230:3333]
m 03:25:00 2:20 A146:R3 98.93 Mh - cl0 0.00 0C 0% A53:R2, cl1 49.46 0C 0% A39:R1, cl2 49.46 0C 0% A54

The hashrate has dropped from 240.86 Mh to 98.93 Mh. One GPU (cl0) is no longer working. About an hour later, the log shows:

i 04:36:51 Job: ed264874... cn.sparkpool.com [203.107.33.230:3333]
0x00007FF8391E9E3A0 x00007FF8B91E9E36 (0x00000A717CB6BA28 0x000001705F4085CD 0x0000000000000000 0x0000000000000000) (0x000001672BA7CBA8 0x00000105867CDE80 0x0000000000000000 0x0000000000000000), aclWriteToMem() + 0x14E67FA bytes(s)
0x00007F38AEEF8907, aclWriteToMem() + 0x14E6876 bytes(s)
(0x0000CDC5F2610D19 0x0000000FF7A5EE30 0x000000FF7B95EE30 0x000000585F17CD40), clGetPipeInfo() + 0x2022E bytes(s)
0x00007FF0548907D6 (0x00000000000C2700 0x00000185D7CD0570 0x000000E30F95F7BE 0x00000D0AF15907C0), clGetPipeInfo() + 0xB7C86 bytes(s)

ethminer then shuts down. Why does this happen, and how can I resolve it? When I use another miner, e.g. qskg, it is stable. Thanks!
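
For anyone who wants to catch this state automatically, here is a minimal Python sketch that reads the per-device hashrates out of an "m" stats line and flags devices that stray from an expected baseline. The field layout is inferred only from the log lines quoted in this report, it assumes each log entry arrives on its own line without color codes, and the function names, the baseline and the tolerance are illustrative assumptions, not part of ethminer.

```python
import re

# One per-device entry, e.g. "cl0 49.44 0C 0% A37:R1".
# Layout inferred from the log lines in this report (assumption).
DEVICE_RE = re.compile(r"\b(cl\d+)\s+(\d+(?:\.\d+)?)\s")

# The total, e.g. "148.32 Mh".
TOTAL_RE = re.compile(r"(\d+(?:\.\d+)?)\s+Mh\b")


def parse_stats_line(line):
    """Return (total_mh, {device: mh}) for an 'm' stats line, else None."""
    if not line.lstrip().startswith("m "):
        return None
    total = TOTAL_RE.search(line)
    if total is None:
        return None
    # Per-device entries follow the " - " separator.
    _, _, tail = line.partition(" - ")
    rates = {name: float(mh) for name, mh in DEVICE_RE.findall(tail)}
    return float(total.group(1)), rates


def suspicious_devices(rates, baseline_mh, tolerance=0.2):
    """Devices whose rate is outside baseline_mh +/- tolerance (a fraction)."""
    low = baseline_mh * (1 - tolerance)
    high = baseline_mh * (1 + tolerance)
    return sorted(name for name, mh in rates.items() if not low <= mh <= high)
```

With a baseline of about 49.4 Mh per card, this would flag cl0 both when it reports 142.14 and when it reports 0.00, and cl1 and cl2 when they report 16.32.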

qinapps commented 3 years ago

I reported this a long time ago; see "ISA for RX 5700 RDNA Kernel #2004". RDNA and RDNA2 cards need a new kernel to run stably and efficiently.

blockchainapper commented 3 years ago

Thanks for your reply, but I don't see a solution there. Have you resolved it?

lss4 commented 3 years ago

I have this issue with my RX 5700 (50th Anniversary Edition) as well. This is on the 5.8.8 kernel.

After a prolonged while (usually over 2-3 days), the hashrate would go down to about 1.5MH/s from 49MH/s. Restarting ethminer would fix it. I only use a single video card, though.

I don't think temperature is the issue as I'm currently managing a custom fan curve with radeon-profile. While mining, the fan will spin at max speed (really loud and noisy) with a stable temperature of 72 celsius.

Not sure if any other GPU-utilizing work may cause the miner to enter such a state. Even while mining the system is still being used for my usual web browsing, and the system is mostly responsive. From my usual experience, even some relatively GPU-intensive operations (like video playback) can be reliably performed while mining, and hashrate is only slightly impacted.

EDIT (probably off-topic): Not sure what those "kernels" are meant for. Currently there are no kernels for the most recent video cards (Radeon VII gfx906 and RX 5700 gfx1010), so these cards will just use OpenCL.

blockchainapper commented 3 years ago

Thank you for your reply!

blockchainapper commented 3 years ago

Can anyone help us?

BoriZka67 commented 3 years ago

I would also love a new kernel. I'm ready to act as a tester.

blockchainapper commented 3 years ago

OK, maybe that's the fastest solution. ethminer is open source and safe, and open source is a good choice.

lss4 commented 3 years ago

I think one may try using the amdgpu-pro stack. For Manjaro, you need the 19.30 version of PKGBUILD, as 20.x currently can't boot properly in my case.

With amdgpu-pro instead of the open source stack, the DAG buffer gets filled within seconds. You may test a bit further, provided the system is stable overall.

EDIT: It doesn't seem to fix the issue... the video card suddenly dropped to 1.5MH/s again... I think ethminer really should incorporate a way to properly recover itself from various issues as anything can happen even in a supposedly stable environment...
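
Until something like that exists in the miner itself, an external supervisor is one possible workaround, since restarting ethminer clears the low-hashrate state. Below is a rough Python sketch under several assumptions: the miner prints "m" stats lines in the format shown earlier in this thread, output carries no color codes, and the names `DropDetector` and `supervise`, the threshold, the patience count and the timeouts are all hypothetical choices for illustration.

```python
import re
import subprocess
import time

RATE_RE = re.compile(r"(\d+(?:\.\d+)?)\s+Mh\b")


def total_hashrate(line):
    """Total Mh from an 'm' stats line, or None for any other line."""
    if not line.lstrip().startswith("m "):
        return None
    match = RATE_RE.search(line)
    return float(match.group(1)) if match else None


class DropDetector:
    """Signals a restart after `patience` consecutive low readings."""

    def __init__(self, min_mh, patience=3):
        self.min_mh = min_mh
        self.patience = patience
        self.low_count = 0

    def update(self, mh):
        # A single good reading resets the counter.
        self.low_count = self.low_count + 1 if mh < self.min_mh else 0
        return self.low_count >= self.patience


def supervise(command, min_mh, patience=3, pause_s=10):
    """Run `command` forever; restart on exit or sustained low hashrate."""
    while True:
        detector = DropDetector(min_mh, patience)
        # Merge stderr into stdout so the log is seen on either stream.
        proc = subprocess.Popen(command, stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT, text=True)
        for line in proc.stdout:
            mh = total_hashrate(line)
            if mh is not None and detector.update(mh):
                proc.terminate()
                break
        try:
            proc.wait(timeout=30)
        except subprocess.TimeoutExpired:
            proc.kill()  # the process ignored the polite request
            proc.wait()
        time.sleep(pause_s)
```

It would be called as `supervise(miner_command, min_mh=40.0)`, where `miner_command` is the list of arguments already used to start the miner. Requiring several consecutive low readings avoids restarting on a single bad sample; it does not address whatever causes the drop in the first place.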