fireice-uk / xmr-stak

Free Monero RandomX Miner and unified CryptoNight miner
GNU General Public License v3.0

Does XMR-Stak support AVX on Opteron 4300/6300 processors? #2170

Open nutsnax opened 5 years ago

nutsnax commented 5 years ago

CPU: 2x Opteron 4365EE RAM: 4GB ECC DDR3 (2x2GB) Other info: HT Assist disabled and full 16MB L3 cache reported

Does XMR-Stak support AVX on Opteron 4300/6300 series processors? Looking in the code, I only see a comment referencing Ryzen 1xxx/2xxx series CPUs having AVX initialized (amd_avx), but I don't see any reference to Opterons or older FX CPUs. Can anyone clarify?

Spudz76 commented 5 years ago

If it doesn't crash with it enabled, then it supports it, similar to AES or SSSE3. It is optimized for (R&D tested on) Ryzen 1xxx. I have even used amd_avx on some Intels and gotten reasonable results. It also isn't necessarily the best choice on all Ryzens (newer ones don't act quite the same as older ones). You may get better performance from a locally compiled version with asm:off, because then the compiler handles what it thinks your CPU would like best (and that can change per compiler type and version). But sometimes compilers don't know certain cache specifics or tricks like rolling two 64-bit actions into a 128-bit access so one function call consumes two chunks per round, etc. Those are precisely the things that got optimized in the amd_avx kernel.

AMD likes to switch up how its internal components interact very often. Things the miner needs to do are very sensitive to being able to lock itself in cache and operate in registers so it doesn't need to hit the memory bus at all. How the cores share the cache has a lot to do with which maneuvers the assembly code should try or avoid, but that is very highly CPU dependent. The developer only had whichever Ryzen to work with, so it works really well on whichever one that was.

It definitely really sucks on anything Hammer or previous (possibly also FX), due to a weak integer arithmetic unit more than cache issues. Those were faster than Intel for gaming due to better floating-point units, but mining uses integers and actively avoids floating-point math. Even the new sqrt operation is a low-precision integer version.
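
One way to check ahead of time whether the CPU at least advertises the instruction sets mentioned above (AVX, AES, SSSE3) is to read the flags line of /proc/cpuinfo on Linux. Below is a minimal Python sketch under that assumption. The function names and the sample text are made up for illustration, and this is not how XMR-Stak itself does its detection; it only tells you what the kernel reports.

```python
# Illustrative /proc/cpuinfo excerpt (not taken from a real machine).
SAMPLE_CPUINFO = """processor\t: 0
model name\t: Example CPU
flags\t\t: fpu sse2 ssse3 aes avx
"""


def cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line as a set."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_features(cpuinfo_text, wanted=("avx", "aes", "ssse3")):
    """Return the wanted flags that the CPU does not report, in order."""
    flags = cpu_flags(cpuinfo_text)
    return [name for name in wanted if name not in flags]
```

On a real Linux box you would pass in the contents of /proc/cpuinfo instead of the sample string. An empty list means all of the requested flags are reported; it says nothing about how fast a given kernel will actually run.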

nutsnax commented 5 years ago

Hey, thanks for the clarification, I appreciate it. Whether asm is off or set to amd_avx doesn't seem to matter. I guess this means that stak is automatically "finding" AVX on my CPU and utilizing it?

Spudz76 commented 5 years ago

With asm:auto, it will say that it selected amd_avx (or intel_avx) when it does.

If it says nothing about it, then it is using asm:off.

You can force a kernel by setting it by name (this disables detection). Without manually setting asm:amd_avx, it probably won't use it. I generally just try all three and pick the best benchmark.
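
The "try all three and pick the best benchmark" approach can be written down as a tiny helper once you have noted the hashrate for each setting. This is only a sketch: the function name is hypothetical, and the numbers in the usage note are placeholders, not measurements from any real CPU.

```python
def best_asm_setting(hashrates):
    """Return the asm setting with the highest measured hashrate.

    hashrates maps a setting name (for example "off", "amd_avx",
    "intel_avx") to the hashrate in H/s that you measured yourself.
    """
    if not hashrates:
        raise ValueError("no benchmark results given")
    # max() over the dict keys, ranked by their measured hashrate.
    return max(hashrates, key=hashrates.get)
```

For example, best_asm_setting({"off": 100.0, "amd_avx": 120.0, "intel_avx": 110.0}) picks "amd_avx" for those placeholder figures. Run each setting long enough for the hashrate to settle before comparing.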