fireice-uk / xmr-stak

Free Monero RandomX Miner and unified CryptoNight miner
GNU General Public License v3.0

Hashrate of my GPUs dropped by 60% with v8 version #2021

Open EroTwenty opened 5 years ago

EroTwenty commented 5 years ago

Sorry, reposting with proper issue styling. Hello, I have a lot of machines mining XMR with quite old cards. They used to mine at around 115 H/s each, but with the v8 update they now mine at about 50 H/s each. Is there a reason? Or should I change something in the config files to get back what I used to have? (I saw that almost everyone's hashrate dropped a bit, but mine has suffered a really massive drop.)

Thank you for your help,

Basic information

Compile issues

Issue with the execution

Stability issue

psychocrypt commented 5 years ago

You can play around with the configs (start from the auto-generated cfg). But since your GPUs are very old, do not expect v7 hash rates.

Eroide notifications@github.com wrote on Fri, 26 Oct 2018, 16:41:

Hello, I have a lot of machines mining XMR with quite old cards. They used to mine at around 130 H/s each, but with the v8 update they now mine at about 50 H/s each. Is there a reason? Or should I change something in the config files to get back what I used to have? (I saw that almost everyone's hashrate dropped a bit, but mine has suffered a drop of 61%.)

Thank you for your help

Basic information

  • Intel Core i5 2500
  • AMD Radeon HD 8570 x 3
  • Graphics driver : radeon-crimson-15.12-win10-64bit (tried with latest and w/ may 2018 drivers : same issue or worse)

Compile issues

  • Windows 10 x64

add all commands you used and the full compile output here

run cmake -LA . in the build folder and add the output here

Issue with the execution

  • Did you compile the miner on your own? Nope, I didn't

run ./xmr-stak --version-long and add the output here

Stability issue

  • Is the CPU or GPU overclocked? No overclocking
  • Is the main memory of the CPU or GPU undervolted? Nope!


Spudz76 commented 5 years ago

Fermi (the NVIDIA equivalent in age) lost 50% and then gained 25% back, so it only reaches 75% of v7 speeds (max). At 40% you may have some room to tweak, but it will always be slower than v7. What does the generated amd.txt look like, out of curiosity?
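
For reference, the percentages being discussed can be checked with a few lines of Python. This is only a sketch using the per-card figures reported in this thread (130 H/s or 115 H/s before, about 50 H/s after); the function names are made up for illustration and are not part of xmr-stak.

```python
def remaining_fraction(old_rate, new_rate):
    """Fraction of the old hash rate that the card still reaches."""
    return new_rate / old_rate


def percent_drop(old_rate, new_rate):
    """Relative loss in percent when going from old_rate to new_rate."""
    return (1.0 - remaining_fraction(old_rate, new_rate)) * 100.0


# Per-card figures reported in this thread, in H/s.
for old_rate in (130, 115):
    kept = remaining_fraction(old_rate, 50) * 100.0
    lost = percent_drop(old_rate, 50)
    print(f"{old_rate} -> 50 H/s: keeps {kept:.1f}%, loses {lost:.1f}%")
```

With 130 H/s as the baseline the cards keep a bit under 40% of their v7 speed, which matches the "at 40%" estimate above.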

EroTwenty commented 5 years ago

OK! Thank you for your replies, guys. Any ideas where I could tweak a bit to gain a little more hashpower?

Here's my amd.txt

 // generated by xmr-stak/2.5.0/90125127/master/win/nvidia-amd-cpu/20

/*
 * GPU configuration. You should play around with intensity and worksize as the fastest settings will vary.
 * index         - GPU index number usually starts from 0
 * intensity     - Number of parallel GPU threads (nothing to do with CPU threads)
 * worksize      - Number of local GPU threads (nothing to do with CPU threads)
 * affine_to_cpu - This will affine the thread to a CPU. This can make a GPU miner play along nicer with a CPU miner.
 * strided_index - switch memory pattern used for the scratch pad memory
 *                 2 = chunked memory, chunk size is controlled by 'mem_chunk'
 *                     required: intensity must be a multiple of worksize
 *                 1 or true  = use 16byte contiguous memory per thread, the next memory block has offset of intensity blocks
 *                             (for cryptonight_v8 and monero it is equal to strided_index = 0)
 *                 0 or false = use a contiguous block of memory per thread
 * mem_chunk     - range 0 to 18: set the number of elements (16byte) per chunk
 *                 this value is only used if 'strided_index' == 2
 *                 element count is computed with the equation: 2 to the power of 'mem_chunk' e.g. 4 means a chunk of 16 elements(256byte)
 * unroll        - allow to control how often the POW main loop is unrolled; valid range [1;128) - for most OpenCL implementations it must be a power of two.
 * comp_mode     - Compatibility enable/disable the automatic guard around compute kernel which allows
 *                 to use a intensity which is not the multiple of the worksize.
 *                 If you set false and the intensity is not multiple of the worksize the miner can crash:
 *                 in this case set the intensity to a multiple of the worksize or activate comp_mode.
 * "gpu_threads_conf" :
 * [
 *  { "index" : 0, "intensity" : 1000, "worksize" : 8, "affine_to_cpu" : true,
 *    "strided_index" : true, "mem_chunk" : 2, "unroll" : 8, "comp_mode" : true },
 * ],
 * If you do not wish to mine with your AMD GPU(s) then use:
 * "gpu_threads_conf" :
 * null,
 */

"gpu_threads_conf" : [
  // gpu: Oland memory:640
  // compute units: 6
  { "index" : 0,
    "intensity" : 288, "worksize" : 8,
    "affine_to_cpu" : false, "strided_index" : 2, "mem_chunk" : 2,
    "unroll" : 8, "comp_mode" : true
  },
  // gpu: Oland memory:640
  // compute units: 6
  { "index" : 1,
    "intensity" : 288, "worksize" : 8,
    "affine_to_cpu" : false, "strided_index" : 2, "mem_chunk" : 2,
    "unroll" : 8, "comp_mode" : true
  },
  // gpu: Oland memory:640
  // compute units: 6
  { "index" : 2,
    "intensity" : 288, "worksize" : 8,
    "affine_to_cpu" : false, "strided_index" : 2, "mem_chunk" : 2,
    "unroll" : 8, "comp_mode" : true
  },

],

/*
 * Platform index. This will be 0 unless you have different OpenCL platform - eg. AMD and Intel.
 */
"platform_index" : 0,

Simaex commented 5 years ago

@Eroide please note that the miner detects only 640M of memory. Oland GPUs have no less than 2G of memory. The same situation with the HD6990 results in a huge performance impact and the inability to use a proper intensity. Check which driver you are using. The HD6990 retains maximum performance with the 14.4 driver, not the latest 15.7. If you find a solution, please share.

Spudz76 commented 5 years ago

Yes, old GCN cards have memory detection issues on drivers that are too new. Keep downgrading until the memory shows up correctly, and then it will start working better. I do not know the Windows drivers; I mine AMD on Linux (and the fglrx 15.3 or so driver is where I had to be for Redwood aka HD5770 to work). But Windows and Linux driver versions don't line up exactly with AMD the way NVIDIA drivers do. You may need 14.x as others suggest.

EroTwenty commented 5 years ago

Thanks again for your advice. I tried downgrading to older versions of the driver, but it didn't change anything in the autogenerated amd.txt: same low 60 H/s (the same amount of memory is still displayed, which looks normal because I use OEM Dell cards, which are 1 GB).

BUT

With a bit of tweaking in amd.txt I was able to get back to a solid 100 H/s per card by increasing the worksize to 16 with intensity at 336. I tried a shitload of values and these are the best I could get (which is very nice, up from 55 H/s).

worksize : 16 intensity : 336

= 100H/s on Dell OEM HD 8570 w/ radeon-crimson-15.12-win10-64bit drivers
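
For anyone repeating this kind of manual search, here is a rough Python sketch of the two rules stated in the amd.txt comments: with strided_index = 2 the intensity must be a multiple of the worksize, and the chunk size is 2 to the power of mem_chunk elements of 16 bytes each. The helper names and the search range are assumptions for illustration only; nothing here is part of xmr-stak, and each candidate still has to be benchmarked by hand in the miner.

```python
ELEMENT_BYTES = 16  # size of one scratchpad element, per the amd.txt comments


def chunk_bytes(mem_chunk):
    """Chunk size in bytes for strided_index = 2 (2**mem_chunk elements)."""
    if not 0 <= mem_chunk <= 18:
        raise ValueError("mem_chunk must be in the range 0 to 18")
    return (2 ** mem_chunk) * ELEMENT_BYTES


def is_valid(worksize, intensity):
    """True if intensity is a multiple of worksize."""
    return intensity % worksize == 0


def candidate_settings(worksizes, low, high):
    """Yield (worksize, intensity) pairs in [low, high] that satisfy is_valid."""
    for worksize in worksizes:
        first = -(-low // worksize) * worksize  # round low up to a multiple
        for intensity in range(first, high + 1, worksize):
            yield worksize, intensity


# Hypothetical search window around the values that worked in this thread.
for worksize, intensity in candidate_settings([8, 16], 320, 352):
    print(f'"worksize" : {worksize}, "intensity" : {intensity}')
```

Both the autogenerated pair (worksize 8, intensity 288) and the tuned pair (worksize 16, intensity 336) pass the multiple-of-worksize check, and mem_chunk = 2 corresponds to 64-byte chunks.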