fireice-uk / xmr-stak

Free Monero RandomX Miner and unified CryptoNight miner
GNU General Public License v3.0

Low hashrate on Xeon E5-2620 #1768

Open jmusac opened 6 years ago

jmusac commented 6 years ago

Hi,

I have a system with 2 x Xeon E5-2620, and I'm mining Monero. On that system I created a virtual machine and assigned 22 processor cores to it. I'm getting a low hashrate on all processor cores: about 10 H/s. According to the monerobenchmark site, it should be about 30-ish H/s per core, or about 450 H/s in total for this processor.

The same thing happens on a Windows virtual machine and on a Debian virtual machine. I attached a picture of the hashrate below.

Spudz76 commented 6 years ago

I bet it has zero to do with being on VIRTUAL MACHINES? Try running it on the hypervisor side; if it's Proxmox you're already on Debian at least... and it works.

Spudz76 commented 6 years ago

You also have faaaaaaar too many threads configured. What did the autoconfig give you? Don't run threads on the hyperthreading cores: that kicks the 'real' core in the knees. HT cores are 'fake' as far as hard computing goes, and they share (or, more accurately, fight for) the cache, which this miner uses heavily. If you have a v1 (as it would seem), then it has 6 cores and 15MB of cache, so you run 6 threads and use 12MB of the cache. That makes 12 threads total for the two CPUs.
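The arithmetic above can be sketched in a few lines of Python. This is only an illustration, not anything xmr-stak ships: the function names are made up, and the 2 MB of cache per thread is an assumption inferred from the "6 threads to use 12MB" figure in this comment.

```python
# Assumption: each mining thread wants about 2 MB of L3 cache,
# inferred from "6 threads to use 12MB" in the comment above.
CACHE_PER_THREAD_MB = 2


def max_threads_per_cpu(physical_cores: int, l3_cache_mb: int) -> int:
    """Threads per socket, capped by real cores and by available L3 cache."""
    return min(physical_cores, l3_cache_mb // CACHE_PER_THREAD_MB)


def total_threads(sockets: int, physical_cores: int, l3_cache_mb: int) -> int:
    """Thread count for a multi-socket box with identical CPUs."""
    return sockets * max_threads_per_cpu(physical_cores, l3_cache_mb)


# Dual-socket box, 6 real cores and 15 MB L3 per CPU (the v1 figures above).
print(total_threads(sockets=2, physical_cores=6, l3_cache_mb=15))
```

With the numbers from this comment, the cap comes from the 6 real cores rather than from the cache, which is why hyperthreaded "cores" do not add to the count.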

Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz 20MB cache, Proxmox based on Jessie 8.11

This is the autogenerated cpu.txt, but I get a better total hashrate by commenting out some cores. It only uses 12MB of the 20MB cache on each CPU; this host runs about 13 VMs, so that probably has something to do with it. But I am running on the hypervisor side (the main OS), not in a VM. A VM will always be slower because of the emulation/sharing layers, unless you spend a lot of time making sure NUMA mapping works all the way from the VM to the real cores. Without NUMA, qemu emulates much of what mining needs (pinned cache per core, pinned memory with hugepages...). You also need to set the VM to pass through the exact CPU model of the hypervisor, not the "QEMU CPU" it advertises by default.

Even so, I don't think you'd hit the same speeds as without any virtualization layer, and that's a lot of tuning just to have mining in sandboxes (I'm not sure why you need VMs "in the way"). You can run a bunch of VMs that do other normal things and just mine on the main OS, so the miner has full control of whichever CPUs it camps on. It works great.
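Since hugepages come up here, one quick sanity check inside a Linux guest (or on the host) is to look at the hugepage counters in /proc/meminfo. A minimal sketch, assuming a Linux system; the helper name is hypothetical and the sample text is invented for illustration, not taken from any machine in this thread:

```python
def hugepage_info(meminfo_text: str) -> dict:
    """Extract the HugePages_* counters and Hugepagesize from /proc/meminfo text."""
    info = {}
    for line in meminfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.startswith("HugePages_") or key == "Hugepagesize":
            # The first token is the number; Hugepagesize also carries a "kB" unit.
            info[key] = int(value.split()[0])
    return info


# Invented sample; on a real box read the text from /proc/meminfo instead.
SAMPLE = """MemTotal:       16384000 kB
HugePages_Total:     128
HugePages_Free:      100
Hugepagesize:       2048 kB
"""

info = hugepage_info(SAMPLE)
print(info)
if info.get("HugePages_Total", 0) == 0:
    print("no hugepages reserved")
```

If HugePages_Total is 0 there is nothing reserved for the miner to pin, whatever else is configured.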

"cpu_threads_conf" :
[
//    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 0 },
//    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 1 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 2 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 3 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 4 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 5 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 6 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 7 },
//    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 16 },
//    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 17 },

//    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 8 },
//    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 9 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 10 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 11 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 12 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 13 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 14 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 15 },
//    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 24 },
//    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 25 },

],
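If you want to try different combinations of enabled and commented-out cores without hand-editing, a small script can print a block in the same shape as the one above. This is a hypothetical helper, not part of xmr-stak; the only thing it assumes is the line format shown in this cpu.txt.

```python
def cpu_threads_conf(cpu_ids, disabled=()):
    """Render a cpu_threads_conf block; CPUs listed in `disabled` are commented out."""
    lines = ['"cpu_threads_conf" :', "["]
    for cpu in cpu_ids:
        # Commented-out entries keep the same layout as the config above.
        prefix = "//    " if cpu in disabled else "    "
        lines.append(
            f'{prefix}{{ "low_power_mode" : false, "no_prefetch" : true, '
            f'"affine_to_cpu" : {cpu} }},'
        )
    lines.append("],")
    return "\n".join(lines)


# First socket from the config above: CPUs 0-7 plus 16-17, with 0, 1, 16, 17 disabled.
print(cpu_threads_conf([0, 1, 2, 3, 4, 5, 6, 7, 16, 17], disabled={0, 1, 16, 17}))
```

Paste the output into cpu.txt in place of the existing block and compare the hashrate report after each change.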

Result sample:

HASHRATE REPORT - CPU
| ID |    10s |    60s |    15m | ID |    10s |    60s |    15m |
|  0 |   50.4 |   50.6 |   50.5 |  1 |   50.7 |   50.6 |   50.5 |
|  2 |   50.1 |   50.1 |   50.1 |  3 |   44.7 |   44.6 |   44.4 |
|  4 |   45.4 |   45.3 |   45.2 |  5 |   45.1 |   45.4 |   45.4 |
|  6 |   47.2 |   47.1 |   46.9 |  7 |   46.8 |   46.3 |   46.3 |
|  8 |   46.7 |   46.3 |   46.2 |  9 |   47.2 |   46.9 |   46.7 |
| 10 |   47.5 |   47.2 |   47.1 | 11 |   47.9 |   47.5 |   47.5 |
Totals (CPU):   569.6  568.0  566.6 H/s
-----------------------------------------------------------------
Totals (ALL):    569.6  568.0  566.6 H/s
Highest:   588.9 H/s
-----------------------------------------------------------------
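To compare per-thread numbers against the roughly 10 H/s reported at the top of this thread, the report can be parsed and averaged. A rough sketch: the regular expression assumes the two-column table layout shown above and nothing more, and the function name is made up.

```python
import re

REPORT = """\
| ID |    10s |    60s |    15m | ID |    10s |    60s |    15m |
|  0 |   50.4 |   50.6 |   50.5 |  1 |   50.7 |   50.6 |   50.5 |
|  2 |   50.1 |   50.1 |   50.1 |  3 |   44.7 |   44.6 |   44.4 |
|  4 |   45.4 |   45.3 |   45.2 |  5 |   45.1 |   45.4 |   45.4 |
|  6 |   47.2 |   47.1 |   46.9 |  7 |   46.8 |   46.3 |   46.3 |
|  8 |   46.7 |   46.3 |   46.2 |  9 |   47.2 |   46.9 |   46.7 |
| 10 |   47.5 |   47.2 |   47.1 | 11 |   47.9 |   47.5 |   47.5 |
"""

# One table cell group: thread ID followed by the 10s, 60s and 15m rates.
CELLS = re.compile(
    r"\|[ \t]*(\d+)[ \t]*\|[ \t]*([\d.]+)[ \t]*\|[ \t]*([\d.]+)[ \t]*\|[ \t]*([\d.]+)[ \t]*"
)


def rates_10s(report: str) -> dict:
    """Map thread ID to its 10-second hashrate; the header row never matches."""
    rates = {}
    for line in report.splitlines():
        for match in CELLS.finditer(line):
            rates[int(match.group(1))] = float(match.group(2))
    return rates


rates = rates_10s(REPORT)
total = sum(rates.values())
print(f"{len(rates)} threads, {total:.1f} H/s total, {total / len(rates):.1f} H/s per thread")
```

The summed 10s column lands within rounding of the reported total, and the per-thread average is what to compare against a VM's per-thread figure.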