Bendr0id / xmrigCC

RandomX, CryptoNight, Argon2 and GhostRider CPU/GPU miner with Command&Control (CC) Server and Monitoring
GNU General Public License v3.0

[2.2.0] RandomX severe performance degradation #285

Closed electroape closed 1 year ago

electroape commented 4 years ago

Just got two occurrences of this bug in a row. The first time, the miner had been running for a day or two and this happened just an hour ago; after restarting it, it happened again after 10 minutes. There were no such occurrences before, and I have been mining this algo since it was introduced (about a month ago, I guess).

Algo: RandomARQ. The hashrate drops severely (from about 34 kH/s to about 600 H/s), with an intermediate step half-way, and the CPU load shifts from user space to kernel space; see the Task Manager screenshot: https://imgur.com/a/RXbWn32

Nothing notable in the log

[2019-11-30 07:14:22.531] speed 10s/60s/15m 33915.0 33848.1 33950.2 H/s max 34390.9 H/s
[2019-11-30 07:14:33.976] accepted (2356/0) diff 2042304 (102 ms)
[2019-11-30 07:15:02.354] new job from arqma.herominers.com:10640 diff 2042304 algo rx/arq height 318554
[2019-11-30 07:15:22.578] speed 10s/60s/15m 33985.0 33892.6 33942.5 H/s max 34390.9 H/s
[2019-11-30 07:16:12.647] accepted (2357/0) diff 2042304 (102 ms)
[2019-11-30 07:16:22.627] speed 10s/60s/15m 34034.1 34012.2 33949.7 H/s max 34390.9 H/s
[2019-11-30 07:17:22.679] speed 10s/60s/15m 34006.0 34022.8 33957.7 H/s max 34390.9 H/s
[2019-11-30 07:18:22.708] speed 10s/60s/15m 28708.1 29081.1 33638.4 H/s max 34390.9 H/s
[2019-11-30 07:18:44.330] new job from arqma.herominers.com:10640 diff 1467863 algo rx/arq height 318554
[2019-11-30 07:19:22.736] speed 10s/60s/15m 27913.1 28004.6 33233.2 H/s max 34390.9 H/s
[2019-11-30 07:19:44.329] new job from arqma.herominers.com:10640 diff 733931 algo rx/arq height 318554
[2019-11-30 07:20:22.765] speed 10s/60s/15m 612.8 9344.4 31597.2 H/s max 34390.9 H/s
[2019-11-30 07:20:44.332] new job from arqma.herominers.com:10640 diff 366965 algo rx/arq height 318554
[2019-11-30 07:21:22.794] speed 10s/60s/15m 591.8 615.3 29369.6 H/s max 34390.9 H/s
[2019-11-30 07:21:44.328] new job from arqma.herominers.com:10640 diff 183475 algo rx/arq height 318554
[2019-11-30 07:21:44.999] new job from arqma.herominers.com:10640 diff 183475 algo rx/arq height 318555
[2019-11-30 07:22:14.591] new job from arqma.herominers.com:10640 diff 183475 algo rx/arq height 318556
[2019-11-30 07:22:22.824] speed 10s/60s/15m 629.0 612.2 27140.5 H/s max 34390.9 H/s
[2019-11-30 07:22:44.327] new job from arqma.herominers.com:10640 diff 91739 algo rx/arq height 318556
[2019-11-30 07:23:22.850] speed 10s/60s/15m 606.0 613.3 24915.0 H/s max 34390.9 H/s
[2019-11-30 07:23:44.329] new job from arqma.herominers.com:10640 diff 45869 algo rx/arq height 318556
[2019-11-30 07:23:55.327] accepted (2358/0) diff 45869 (101 ms)
[2019-11-30 07:24:22.874] speed 10s/60s/15m 667.2 626.1 22686.4 H/s max 34390.9 H/s
[2019-11-30 07:24:37.854] accepted (2359/0) diff 45869 (106 ms)
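
For anyone trying to pin down when a drop like this starts in their own log, here is a minimal Python sketch. It is not part of xmrigCC or XMRig. It assumes the `speed 10s/60s/15m ... H/s max ... H/s` line format shown in the log above; the function name `find_drops` and the 50% threshold are hypothetical choices made for illustration.

```python
import re

# Assumed line format, taken from the log excerpt above:
# [2019-11-30 07:20:22.765] speed 10s/60s/15m 612.8 9344.4 31597.2 H/s max 34390.9 H/s
SPEED_RE = re.compile(
    r"\[(?P<ts>[^\]]+)\] speed 10s/60s/15m "
    r"(?P<s10>[\d.]+) (?P<s60>[\d.]+) (?P<s15m>[\d.]+) H/s "
    r"max (?P<peak>[\d.]+) H/s"
)


def find_drops(lines, threshold=0.5):
    """Return (timestamp, 10s rate, max rate) for every speed line
    whose 10s hashrate is below threshold * max hashrate.

    Lines that are not speed reports, or whose values are not plain
    numbers, do not match the pattern and are skipped.
    """
    drops = []
    for line in lines:
        m = SPEED_RE.search(line)
        if m is None:
            continue
        s10 = float(m.group("s10"))
        peak = float(m.group("peak"))
        if s10 < threshold * peak:
            drops.append((m.group("ts"), s10, peak))
    return drops


# Three speed lines and one non-speed line copied from the log above.
sample = [
    "[2019-11-30 07:17:22.679] speed 10s/60s/15m 34006.0 34022.8 33957.7 H/s max 34390.9 H/s",
    "[2019-11-30 07:18:22.708] speed 10s/60s/15m 28708.1 29081.1 33638.4 H/s max 34390.9 H/s",
    "[2019-11-30 07:18:44.330] new job from arqma.herominers.com:10640 diff 1467863 algo rx/arq height 318554",
    "[2019-11-30 07:20:22.765] speed 10s/60s/15m 612.8 9344.4 31597.2 H/s max 34390.9 H/s",
]

for ts, s10, peak in find_drops(sample):
    print(f"{ts}: 10s rate {s10} H/s is below half of max {peak} H/s")
```

With the default threshold only the collapse to about 600 H/s is flagged; passing a higher value such as `threshold=0.9` would also catch the intermediate step at 07:18:22.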

electroape commented 4 years ago

Forgot to mention that the system also becomes very unresponsive.

electroape commented 4 years ago

Windows 10 x64 1909, AMD Ryzen 7 3700X

electroape commented 4 years ago

This system is OC'd though, both CPU and RAM, but the last changes were a week or so ago and it's running fine otherwise. If it happens again I'll try reverting the OC, but I doubt it has anything to do with it.

electroape commented 4 years ago

No antivirus software either; Windows Defender is disabled.

Bendr0id commented 4 years ago

Uuuh this looks strange. Good finding.

Currently I can't make sense of it, but it looks like either throttling is taking place or the cache is becoming very slow...

Is the system recovering after a while?

electroape commented 4 years ago

Not sure if it recovers, but it was running like that for about 2 hours the first time I noticed. And no, reverting the OC doesn't help; it has happened twice again. I'm switching to CN-Extremelite to check if it acts like that there too. Maybe you can provide a debug build that would shed more light on this?

electroape commented 4 years ago

I've also checked whether Windows updated last night (since it was fine beforehand), and no, the last updates were on the 27th for some Samsung drivers (I don't have any Samsung hardware, that's strange).

electroape commented 4 years ago

It has been running for 9 hours on Extremelite now without issues, so it appears to depend on RandomX algos, or RandomARQ specifically.

electroape commented 4 years ago

I'll try to switch to RandomWOW now.

electroape commented 4 years ago

This still happens regularly. Any thoughts?

electroape commented 4 years ago

I'm pretty sure it's only on my machine tho, so maybe it's related to AMD.

electroape commented 4 years ago

Lol, twice in the last 30 minutes, after a week or so of normal operation. I have no idea what's wrong, it's completely random.

Bendr0id commented 4 years ago

AMD.. hmm did you try to disable "Opcache" in BIOS?

electroape commented 4 years ago

Strangely enough, I don't have that option in my board's BIOS.

electroape commented 4 years ago

Reporting in, still happens on 2.2.2, still apparently only on my machine lol.

electroape commented 4 years ago

Grrr, three times today. I'm switching to upstream to see if there is any difference.

electroape commented 4 years ago

All fine so far, and strangely enough it runs cooler with a better hashrate. There seem to be some improvements for AMD platforms judging by the changelog. Any ETA on integrating them? Maybe that'll fix this issue too.

Bendr0id commented 4 years ago

I'm currently looking into the MSR performance optimizations.

Which version did you test? The unified miner?

electroape commented 4 years ago

Well, the 5.4.0 GCC release from the xmrig GitHub. OpenCL/GPU mining is disabled.

electroape commented 4 years ago

Same issue on upstream, first time in a week tho.

electroape commented 4 years ago

Second time, i don't get it ...

ariadarkkkis commented 4 years ago

@uz-spark have you tried xmrig to see if you have the same issue there as well? If the problem exists with the latest xmrig 5.5.0, then there is an issue with your configuration.

electroape commented 4 years ago

Third time now. Yes, I'm on XMRig now; it was working fine for a week until now. My configuration is pretty standard.

ariadarkkkis commented 4 years ago

Then it's a configuration issue. Maybe Windows is not configured well. Try these:

  1. Manually set huge pages size. (Link here).
  2. Reboot system.
  3. Check for any running process that causes CPU usage or RAM usage.
  4. Disable antivirus, or exclude and trust the miner executable.

Maybe there is a hardware issue, such as a RAM issue, a BIOS issue or a CPU failure. Check the BIOS settings as well. Try restoring the BIOS to default settings and see if the problem is fixed.

Try booting a live Linux and test the miner on Linux. If the problem still exists on Linux, there is definitely something wrong either with the BIOS settings or with the hardware somewhere.

Lastly, if none of the above works, try reinstalling Windows.

electroape commented 4 years ago

  1. It's set through group policy; the miner confirms that at init.
  2. It's not related to system uptime.
  3. No other CPU-heavy processes.
  4. No antivirus either; Windows Defender is disabled.

Tried flashing the BIOS with a newer version. Tried disabling OC settings; doesn't seem to affect this. This is a home PC and I'm not really a Linux guy, so going full Linux isn't an option...

ariadarkkkis commented 4 years ago

Then run a memory test and see if your RAM has any issues.

Bendr0id commented 3 years ago

Is this issue still present?