Bendr0id / xmrigCC

RandomX, CryptoNight, Argon2 and GhostRider CPU/GPU miner with Command&Control (CC) Server and Monitoring
GNU General Public License v3.0

[2.9.3] Incorrect CPU config "threads" option handling in some cases #351

Closed electroape closed 3 years ago

electroape commented 3 years ago

Case #1:
2x Xeon Gold 6248R, 2x24C48T, 2x35.75MB L3
Bare-metal Windows Server 2019 DC x64
NUMA: https://pastebin.com/TjWwyMX0
Algo: the issue is algorithm-independent

If I just let the miner figure out the auto config, I get about 60kH/s and it applies an all-96-thread config. But if I define the CPU config manually, just putting ...

"*": {
    "threads": 96
}

... in the "cpu" section, I get 20kH/s, and looking at the process monitor, the miner only hugs the first processor, all 48 threads of it, completely ignoring the second processor. I can't reproduce this behavior on the other multi-socket systems I have, so it's probably related to this machine's NUMA config.

Case #2:
AMD EPYC 7452, 32C64T, 128MB L3
Windows 2019 x64 guest on VMWare ESXi 7.0.2 hypervisor
NUMA: https://pastebin.com/hZ1s96aS
Algo: CN-Heavy/XHV

I get ~2kH/s with auto-config, which uses 32 threads, but if I define the very same setting manually by, again, putting ...

"*": {
    "threads": 32
}

... in the "cpu" section, I get about 1.6-1.7kH/s. Looking at the process monitor, it still assigns threads mostly according to the NUMA config, but apparently it uses one more thread than needed, or some misc thread has increased utilization. Screenshot with auto config: https://imgur.com/oFizaeo, and with manual config: https://imgur.com/RANz9rn. This thread behavior is consistent; it's not a measurement fluke.

electroape commented 3 years ago

Forgot to add that upstream (6.12.1) behaves as expected in case 1 (no hashrate drop), but in case 2 it behaves the same as XMRigCC (the hashrate drop is present), and I see in the console printout that the core affinities have changed to -1.

Bendr0id commented 3 years ago

If you define manual selection (threads) and don't specify more options like affinity, it will use -1 which is let the is decide.. I think xmrig stock does the same..

Are you sure that the hashrate drop isn't a coincidence? The hashrate may vary a lot between restarts of the miner...

It's hard to debug these things..

electroape commented 3 years ago

I understand that if I do not define "affinity" then it figures out what to do by itself. It's just that I'd expect the behavior to match that of the all-auto config, i.e. assigning the best affinity for each thread, not the '"affinity": -1' behavior. I know that upstream does so too, it's just not very logical. As for case 1, yes, the hashrate drop is consistent, because, as I said, it just doesn't use the second CPU at all, apparently assigning all 96 threads across the 48 logical threads of the first CPU. I have three other multi-socket systems and none of them shows such behavior. It's not that important though, I can still define the threads config successfully with a different syntax.

Bendr0id commented 3 years ago

There was a typo in my last comment.

-1 does not mean "let the miner decide". It means "let the OS decide", or in other words, "no affinity is set".

Recognizing all possible threads but then pinning them only to the first socket, without using the 2nd socket, would indeed be an issue if it happened with full autoconfig.

Actually, the miner supports either full autoconfig OR full manual config. There is no mixture implemented that lets you define the number of threads but get auto affinity.

So if you want to define the threads manually, you also need to define the affinity manually.
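
A minimal sketch of what a fully manual config could look like, assuming the short array format documented for upstream XMRig (one entry per thread, each value being the logical CPU index that thread is pinned to) behaves the same way here. The "rx" key and the CPU indices are placeholders for illustration only and are not taken from either system in this issue:

```json
{
    "cpu": {
        "rx": [0, 1, 2, 3]
    }
}
```

In this format the thread count is simply the length of the list, so threads and affinity are always defined together. Using -1 in place of an index would leave that thread unpinned.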

The logic to find out which thread you mean on which socket and pin a thread to it is quite complicated. Try playing with the max-threads-hint feature to find the number of threads you want, and then let the miner auto-define the right affinity for them.
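
A rough sketch of the hint-based approach, assuming the option is spelled "max-threads-hint", lives in the "cpu" section, and is a percentage of the available hardware threads, as described in upstream XMRig's documentation. The value 50 is purely illustrative:

```json
{
    "cpu": {
        "max-threads-hint": 50
    }
}
```

Under that assumption, 50 on a 64-thread CPU would hint at up to 32 threads, with autoconfig still choosing the affinities. Any manual thread entries for the algorithm would presumably have to be removed first so that autoconfig runs again.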

electroape commented 3 years ago

Yeah, "max-threads-hint" works too. Is there anything I can do to debug the first issue?

Bendr0id commented 3 years ago

I think the root cause is the same: broken affinity. The question is why the utilisation works better on upstream. I don't see a reason for that 🤔

Bendr0id commented 3 years ago

works as designed