fireice-uk / xmr-stak

Free Monero RandomX Miner and unified CryptoNight miner
GNU General Public License v3.0

Miner stalls on first run of freshly installed Vega FE card #1133

Closed jam3972 closed 6 years ago

jam3972 commented 6 years ago

Basic information

Reproduction steps

- DDU drivers
- Install Adrenalin drivers using the AMD installer
- Install Blockchain drivers using the inf update method (this is the generally accepted process to install drivers for Vega FE cards to run at full hash)
- Run miner

Issue Example

Here is the xmr-stak output from build 7b850646 when this issue happens on a system with 2 Vega FE cards. (The same thing happens with the new PR #1131: it stalls after the "Device X" line and before reaching the cache load step on each new card.)

Run 1, (stalls, have to manually close)

[2018-02-22 12:28:05] : Start mining: MONERO
[2018-02-22 12:28:05] : Compiling code and initializing GPUs. This will take a while...
[2018-02-22 12:28:05] : Device 0 work size 8 / 32.

Run 2, (stalls, have to manually close)

[2018-02-22 12:28:05] : Start mining: MONERO
[2018-02-22 12:28:05] : Compiling code and initializing GPUs. This will take a while...
[2018-02-22 12:28:05] : Device 0 work size 8 / 32.
[2018-02-22 12:28:23] : Device 0 work size 8 / 32.
[2018-02-22 12:28:38] : Device 1 work size 8 / 32.

Run 3, (Successful run. also any subsequent run until drivers are re-installed)

[2018-02-22 12:28:05] : Start mining: MONERO
[2018-02-22 12:28:05] : Compiling code and initializing GPUs. This will take a while...
[2018-02-22 12:28:05] : Device 0 work size 8 / 32.
[2018-02-22 12:28:23] : Device 0 work size 8 / 32.
[2018-02-22 12:28:38] : Device 1 work size 8 / 32.
[2018-02-22 12:28:52] : Device 1 work size 8 / 32.
[2018-02-22 12:29:05] : Starting AMD GPU thread 0, no affinity.
[2018-02-22 12:29:05] : Starting AMD GPU thread 1, no affinity.
[2018-02-22 12:29:05] : Starting AMD GPU thread 2, no affinity.
[2018-02-22 12:29:05] : Starting AMD GPU thread 3, no affinity.

Notes

- Same thing happens with Cast-XMR
- Same issue as #1099, but attempting to re-open following the template
- Not sure if this happens with RX Vega 56/64 cards

psychocrypt commented 6 years ago

It looks like a driver issue. The problem is that I have no Vega GPUs, so we cannot reproduce the issue. Are you able to reproduce the issue each time you reinstall the driver? If so, I can create a branch for you and add some debug output to the initialization process to locate the place where the miner freezes. Please let me know if I should create a debug branch and whether you are willing to reinstall your driver to test it.

jam3972 commented 6 years ago

Yes the issue is 100% reproducible. I would happily test a debug branch to figure it out.

psychocrypt commented 6 years ago

Please compile and run https://github.com/psychocrypt/xmr-stak/archive/topic-openclFreezDebug.zip with a freshly installed driver. I added a lot of debug output to detect the position in the code where the miner hangs. This will be the first step to debug this issue. Please post the full output of the miner for the first and second start after the fresh driver install.
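The idea behind such a debug branch can be sketched as follows: a numbered marker is printed and flushed before each initialization step, so the last marker shown identifies the call that never returns. This is only an illustration in Python, not the code from the branch (xmr-stak itself is C++); the `checkpoint` and `run_steps` helpers and the step names are hypothetical.

```python
import sys
import time

def checkpoint(n, name, out=sys.stdout):
    """Print a numbered marker and flush at once, so it survives a later hang."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    out.write(f"[{stamp}] : checkpoint {n} ({name})\n")
    out.flush()

def run_steps(steps, out=sys.stdout):
    """Run each (name, callable) step, printing a marker before it starts.

    If a step blocks forever, the last marker printed identifies it.
    Returns the number of steps that completed.
    """
    done = 0
    for n, (name, step) in enumerate(steps, start=1):
        checkpoint(n, name, out)
        step()
        done += 1
    return done

if __name__ == "__main__":
    # Hypothetical stand-ins for the real initialization calls.
    run_steps([
        ("query platform", lambda: None),
        ("create context", lambda: None),
        ("build program", lambda: time.sleep(0.1)),
    ])
```

Flushing after every marker matters here: with buffered output, markers printed just before the hang could be lost and the stall would appear to sit earlier than it does.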

jam3972 commented 6 years ago

On my 2x Vega FE system, running 2 threads per card:

1st run (stall 1):

[2018-03-04 02:06:15] : Start mining: MONERO
[2018-03-04 02:06:15] : Compiling code and initializing GPUs. This will take a while...
[2018-03-04 02:06:15] : Yes we can 1
[2018-03-04 02:06:15] : Yes we can 2
[2018-03-04 02:06:15] : Yes we can 3
[2018-03-04 02:06:15] : Yes we can 4
[2018-03-04 02:06:15] : Yes we can 5
[2018-03-04 02:06:15] : Yes we can 6
[2018-03-04 02:06:15] : Yes we can 7
[2018-03-04 02:06:15] : Yes we can 8
[2018-03-04 02:06:15] : Yes we can 9
[2018-03-04 02:06:15] : Yes we can 9
[2018-03-04 02:06:15] : Yes we can 10
[2018-03-04 02:06:15] : Yes we can 11
[2018-03-04 02:06:15] : Yes we can 12
[2018-03-04 02:06:15] : Yes we can 13
[2018-03-04 02:06:15] : Yes we can 14
[2018-03-04 02:06:15] : Yes we can 16
[2018-03-04 02:06:15] : Yes we can 17
[2018-03-04 02:06:15] : Device 0 work size 8 / 32.

2nd run (stall 2):

[2018-03-04 02:08:17] : Start mining: MONERO
[2018-03-04 02:08:17] : Compiling code and initializing GPUs. This will take a while...
[2018-03-04 02:08:17] : Yes we can 1
[2018-03-04 02:08:17] : Yes we can 2
[2018-03-04 02:08:17] : Yes we can 3
[2018-03-04 02:08:18] : Yes we can 4
[2018-03-04 02:08:18] : Yes we can 5
[2018-03-04 02:08:18] : Yes we can 6
[2018-03-04 02:08:18] : Yes we can 7
[2018-03-04 02:08:18] : Yes we can 8
[2018-03-04 02:08:18] : Yes we can 9
[2018-03-04 02:08:18] : Yes we can 9
[2018-03-04 02:08:18] : Yes we can 10
[2018-03-04 02:08:18] : Yes we can 11
[2018-03-04 02:08:18] : Yes we can 12
[2018-03-04 02:08:18] : Yes we can 13
[2018-03-04 02:08:18] : Yes we can 14
[2018-03-04 02:08:18] : Yes we can 16
[2018-03-04 02:08:18] : Yes we can 17
[2018-03-04 02:08:18] : Device 0 work size 8 / 32.
[2018-03-04 02:08:18] : Yes we can 18
[2018-03-04 02:08:18] : Yes we can 18
[2018-03-04 02:08:18] : Yes we can 19
[2018-03-04 02:08:18] : Yes we can 20
[2018-03-04 02:08:18] : Yes we can 21
[2018-03-04 02:08:18] : Yes we can 22
[2018-03-04 02:08:18] : Yes we can 23
[2018-03-04 02:08:18] : Yes we can 24
[2018-03-04 02:08:18] : Yes we can 25
[2018-03-04 02:08:18] : Yes we can 26
[2018-03-04 02:08:18] : Yes we can 27
[2018-03-04 02:08:30] : Yes we can 28
[2018-03-04 02:08:46] : Yes we can 29
[2018-03-04 02:08:46] : Yes we can 30
[2018-03-04 02:08:46] : Yes we can 31
[2018-03-04 02:08:47] : Yes we can 32
[2018-03-04 02:08:47] : Yes we can 33
[2018-03-04 02:08:47] : Yes we can 34
[2018-03-04 02:08:47] : Yes we can 33
[2018-03-04 02:08:47] : Yes we can 34
[2018-03-04 02:08:47] : Yes we can 33
[2018-03-04 02:08:47] : Yes we can 34
[2018-03-04 02:08:47] : Yes we can 33
[2018-03-04 02:08:47] : Yes we can 34
[2018-03-04 02:08:47] : Yes we can 33
[2018-03-04 02:08:47] : Yes we can 34
[2018-03-04 02:08:47] : Yes we can 33
[2018-03-04 02:08:47] : Yes we can 34
[2018-03-04 02:08:47] : Yes we can 33
[2018-03-04 02:08:47] : Yes we can 34
[2018-03-04 02:08:47] : Yes we can 35
[2018-03-04 02:08:47] : Yes we can 15
[2018-03-04 02:08:47] : Yes we can 14
[2018-03-04 02:08:47] : Yes we can 16
[2018-03-04 02:08:47] : Yes we can 17
[2018-03-04 02:08:47] : Device 0 work size 8 / 32.
[2018-03-04 02:08:47] : Yes we can 18
[2018-03-04 02:08:47] : Yes we can 18
[2018-03-04 02:08:47] : Yes we can 19
[2018-03-04 02:08:47] : Yes we can 20
[2018-03-04 02:08:47] : Yes we can 21
[2018-03-04 02:08:47] : Yes we can 22
[2018-03-04 02:08:47] : Yes we can 23
[2018-03-04 02:08:47] : Yes we can 24
[2018-03-04 02:08:47] : Yes we can 25
[2018-03-04 02:08:47] : Yes we can 26
[2018-03-04 02:08:47] : Yes we can 27
[2018-03-04 02:09:00] : Yes we can 28
[2018-03-04 02:09:00] : Yes we can 29
[2018-03-04 02:09:00] : Yes we can 30
[2018-03-04 02:09:00] : Yes we can 31
[2018-03-04 02:09:01] : Yes we can 32
[2018-03-04 02:09:01] : Yes we can 33
[2018-03-04 02:09:01] : Yes we can 34
[2018-03-04 02:09:01] : Yes we can 33
[2018-03-04 02:09:01] : Yes we can 34
[2018-03-04 02:09:01] : Yes we can 33
[2018-03-04 02:09:01] : Yes we can 34
[2018-03-04 02:09:01] : Yes we can 33
[2018-03-04 02:09:01] : Yes we can 34
[2018-03-04 02:09:01] : Yes we can 33
[2018-03-04 02:09:01] : Yes we can 34
[2018-03-04 02:09:01] : Yes we can 33
[2018-03-04 02:09:01] : Yes we can 34
[2018-03-04 02:09:01] : Yes we can 33
[2018-03-04 02:09:01] : Yes we can 34
[2018-03-04 02:09:01] : Yes we can 35
[2018-03-04 02:09:01] : Yes we can 15
[2018-03-04 02:09:01] : Yes we can 14
[2018-03-04 02:09:01] : Yes we can 16
[2018-03-04 02:09:01] : Yes we can 17
[2018-03-04 02:09:01] : Device 1 work size 8 / 32.

3rd run (success):

[2018-03-04 02:10:34] : Start mining: MONERO
[2018-03-04 02:10:34] : Compiling code and initializing GPUs. This will take a while...
[2018-03-04 02:10:34] : Yes we can 1
[2018-03-04 02:10:34] : Yes we can 2
[2018-03-04 02:10:34] : Yes we can 3
[2018-03-04 02:10:34] : Yes we can 4
[2018-03-04 02:10:34] : Yes we can 5
[2018-03-04 02:10:34] : Yes we can 6
[2018-03-04 02:10:34] : Yes we can 7
[2018-03-04 02:10:34] : Yes we can 8
[2018-03-04 02:10:34] : Yes we can 9
[2018-03-04 02:10:34] : Yes we can 9
[2018-03-04 02:10:34] : Yes we can 10
[2018-03-04 02:10:34] : Yes we can 11
[2018-03-04 02:10:34] : Yes we can 12
[2018-03-04 02:10:34] : Yes we can 13
[2018-03-04 02:10:34] : Yes we can 14
[2018-03-04 02:10:34] : Yes we can 16
[2018-03-04 02:10:34] : Yes we can 17
[2018-03-04 02:10:34] : Device 0 work size 8 / 32.
[2018-03-04 02:10:34] : Yes we can 18
[2018-03-04 02:10:34] : Yes we can 18
[2018-03-04 02:10:34] : Yes we can 19
[2018-03-04 02:10:34] : Yes we can 20
[2018-03-04 02:10:34] : Yes we can 21
[2018-03-04 02:10:34] : Yes we can 22
[2018-03-04 02:10:34] : Yes we can 23
[2018-03-04 02:10:34] : Yes we can 24
[2018-03-04 02:10:34] : Yes we can 25
[2018-03-04 02:10:34] : Yes we can 26
[2018-03-04 02:10:34] : Yes we can 27
[2018-03-04 02:10:47] : Yes we can 28
[2018-03-04 02:10:47] : Yes we can 29
[2018-03-04 02:10:47] : Yes we can 30
[2018-03-04 02:10:47] : Yes we can 31
[2018-03-04 02:10:48] : Yes we can 32
[2018-03-04 02:10:48] : Yes we can 33
[2018-03-04 02:10:48] : Yes we can 34
[2018-03-04 02:10:48] : Yes we can 33
[2018-03-04 02:10:48] : Yes we can 34
[2018-03-04 02:10:48] : Yes we can 33
[2018-03-04 02:10:48] : Yes we can 34
[2018-03-04 02:10:48] : Yes we can 33
[2018-03-04 02:10:48] : Yes we can 34
[2018-03-04 02:10:48] : Yes we can 33
[2018-03-04 02:10:48] : Yes we can 34
[2018-03-04 02:10:48] : Yes we can 33
[2018-03-04 02:10:48] : Yes we can 34
[2018-03-04 02:10:48] : Yes we can 33
[2018-03-04 02:10:48] : Yes we can 34
[2018-03-04 02:10:48] : Yes we can 35
[2018-03-04 02:10:48] : Yes we can 15
[2018-03-04 02:10:48] : Yes we can 14
[2018-03-04 02:10:48] : Yes we can 16
[2018-03-04 02:10:48] : Yes we can 17
[2018-03-04 02:10:48] : Device 0 work size 8 / 32.
[2018-03-04 02:10:48] : Yes we can 18
[2018-03-04 02:10:48] : Yes we can 18
[2018-03-04 02:10:48] : Yes we can 19
[2018-03-04 02:10:48] : Yes we can 20
[2018-03-04 02:10:48] : Yes we can 21
[2018-03-04 02:10:48] : Yes we can 22
[2018-03-04 02:10:48] : Yes we can 23
[2018-03-04 02:10:48] : Yes we can 24
[2018-03-04 02:10:48] : Yes we can 25
[2018-03-04 02:10:48] : Yes we can 26
[2018-03-04 02:10:48] : Yes we can 27
[2018-03-04 02:11:00] : Yes we can 28
[2018-03-04 02:11:00] : Yes we can 29
[2018-03-04 02:11:00] : Yes we can 30
[2018-03-04 02:11:00] : Yes we can 31
[2018-03-04 02:11:01] : Yes we can 32
[2018-03-04 02:11:01] : Yes we can 33
[2018-03-04 02:11:01] : Yes we can 34
[2018-03-04 02:11:01] : Yes we can 33
[2018-03-04 02:11:01] : Yes we can 34
[2018-03-04 02:11:01] : Yes we can 33
[2018-03-04 02:11:01] : Yes we can 34
[2018-03-04 02:11:01] : Yes we can 33
[2018-03-04 02:11:01] : Yes we can 34
[2018-03-04 02:11:01] : Yes we can 33
[2018-03-04 02:11:01] : Yes we can 34
[2018-03-04 02:11:01] : Yes we can 33
[2018-03-04 02:11:01] : Yes we can 34
[2018-03-04 02:11:01] : Yes we can 33
[2018-03-04 02:11:01] : Yes we can 34
[2018-03-04 02:11:01] : Yes we can 35
[2018-03-04 02:11:01] : Yes we can 15
[2018-03-04 02:11:01] : Yes we can 14
[2018-03-04 02:11:01] : Yes we can 16
[2018-03-04 02:11:01] : Yes we can 17
[2018-03-04 02:11:01] : Device 1 work size 8 / 32.
[2018-03-04 02:11:01] : Yes we can 18
[2018-03-04 02:11:01] : Yes we can 18
[2018-03-04 02:11:01] : Yes we can 19
[2018-03-04 02:11:01] : Yes we can 20
[2018-03-04 02:11:01] : Yes we can 21
[2018-03-04 02:11:01] : Yes we can 22
[2018-03-04 02:11:01] : Yes we can 23
[2018-03-04 02:11:01] : Yes we can 24
[2018-03-04 02:11:01] : Yes we can 25
[2018-03-04 02:11:01] : Yes we can 26
[2018-03-04 02:11:01] : Yes we can 27
[2018-03-04 02:11:14] : Yes we can 28
[2018-03-04 02:11:14] : Yes we can 29
[2018-03-04 02:11:14] : Yes we can 30
[2018-03-04 02:11:14] : Yes we can 31
[2018-03-04 02:11:15] : Yes we can 32
[2018-03-04 02:11:15] : Yes we can 33
[2018-03-04 02:11:15] : Yes we can 34
[2018-03-04 02:11:15] : Yes we can 33
[2018-03-04 02:11:15] : Yes we can 34
[2018-03-04 02:11:15] : Yes we can 33
[2018-03-04 02:11:15] : Yes we can 34
[2018-03-04 02:11:15] : Yes we can 33
[2018-03-04 02:11:15] : Yes we can 34
[2018-03-04 02:11:15] : Yes we can 33
[2018-03-04 02:11:15] : Yes we can 34
[2018-03-04 02:11:15] : Yes we can 33
[2018-03-04 02:11:15] : Yes we can 34
[2018-03-04 02:11:15] : Yes we can 33
[2018-03-04 02:11:15] : Yes we can 34
[2018-03-04 02:11:15] : Yes we can 35
[2018-03-04 02:11:15] : Yes we can 15
[2018-03-04 02:11:15] : Yes we can 14
[2018-03-04 02:11:15] : Yes we can 16
[2018-03-04 02:11:15] : Yes we can 17
[2018-03-04 02:11:15] : Device 1 work size 8 / 32.
[2018-03-04 02:11:15] : Yes we can 18
[2018-03-04 02:11:15] : Yes we can 18
[2018-03-04 02:11:15] : Yes we can 19
[2018-03-04 02:11:15] : Yes we can 20
[2018-03-04 02:11:15] : Yes we can 21
[2018-03-04 02:11:15] : Yes we can 22
[2018-03-04 02:11:15] : Yes we can 23
[2018-03-04 02:11:15] : Yes we can 24
[2018-03-04 02:11:15] : Yes we can 25
[2018-03-04 02:11:15] : Yes we can 26
[2018-03-04 02:11:15] : Yes we can 27
[2018-03-04 02:11:27] : Yes we can 28
[2018-03-04 02:11:27] : Yes we can 29
[2018-03-04 02:11:27] : Yes we can 30
[2018-03-04 02:11:27] : Yes we can 31
[2018-03-04 02:11:28] : Yes we can 32
[2018-03-04 02:11:28] : Yes we can 33
[2018-03-04 02:11:28] : Yes we can 34
[2018-03-04 02:11:28] : Yes we can 33
[2018-03-04 02:11:28] : Yes we can 34
[2018-03-04 02:11:28] : Yes we can 33
[2018-03-04 02:11:28] : Yes we can 34
[2018-03-04 02:11:28] : Yes we can 33
[2018-03-04 02:11:28] : Yes we can 34
[2018-03-04 02:11:28] : Yes we can 33
[2018-03-04 02:11:28] : Yes we can 34
[2018-03-04 02:11:28] : Yes we can 33
[2018-03-04 02:11:28] : Yes we can 34
[2018-03-04 02:11:28] : Yes we can 33
[2018-03-04 02:11:28] : Yes we can 34
[2018-03-04 02:11:28] : Yes we can 35
[2018-03-04 02:11:28] : Yes we can 15
[2018-03-04 02:11:28] : Starting AMD GPU thread 0, no affinity.
[2018-03-04 02:11:28] : Starting AMD GPU thread 1, no affinity.
[2018-03-04 02:11:28] : Starting AMD GPU thread 2, no affinity.
[2018-03-04 02:11:28] : Starting AMD GPU thread 3, no affinity.

my amd.txt for reference:


/*
 * GPU configuration. You should play around with intensity and worksize as the fastest settings will vary.
 * index         - GPU index number usually starts from 0
 * intensity     - Number of parallel GPU threads (nothing to do with CPU threads)
 * worksize      - Number of local GPU threads (nothing to do with CPU threads)
 * affine_to_cpu - This will affine the thread to a CPU. This can make a GPU miner play along nicer with a CPU miner.
 * strided_index - switch memory pattern used for the scratch pad memory
 *                 2 = chunked memory, chunk size is controlled by 'mem_chunk'
 *                     required: intensity must be a multiple of worksize
 *                 1 or true  = use 16byte contiguous memory per thread, the next memory block has offset of intensity blocks
 *                 0 or false = use a contiguous block of memory per thread
 * mem_chunk     - range 0 to 18: set the number of elements (16byte) per chunk
 *                 this value is only used if 'strided_index' == 2
 *                 element count is computed with the equation: 2 to the power of 'mem_chunk' e.g. 4 means a chunk of 16 elements(256byte)
 * comp_mode     - Compatibility enable/disable the automatic guard around compute kernel which allows
 *                 to use a intensity which is not the multiple of the worksize.
 *                 If you set false and the intensity is not multiple of the worksize the miner can crash:
 *                 in this case set the intensity to a multiple of the worksize or activate comp_mode.
 * "gpu_threads_conf" :
 * [
 *  { "index" : 0, "intensity" : 1000, "worksize" : 8, "affine_to_cpu" : false, "strided_index" : true, "mem_chunk" : 2, "comp_mode" : true },
 * ],
 * If you do not wish to mine with your AMD GPU(s) then use:
 * "gpu_threads_conf" :
 * null,
 */

"gpu_threads_conf" : [
  // gpu: gfx901 memory:15822
  // compute units: 64
  { "index" : 0,
    "intensity" : 2016, "worksize" : 8,
    "affine_to_cpu" : false, "strided_index" : 1, "mem_chunk" : 2,
    "comp_mode" : true
  },
  { "index" : 0,
    "intensity" : 2016, "worksize" : 8,
    "affine_to_cpu" : false, "strided_index" : 1, "mem_chunk" : 2,
    "comp_mode" : true
  },
  // gpu: gfx901 memory:15822
  // compute units: 64
  { "index" : 1,
    "intensity" : 2016, "worksize" : 8,
    "affine_to_cpu" : false, "strided_index" : 1, "mem_chunk" : 2,
    "comp_mode" : true
  },
  { "index" : 1,
    "intensity" : 2016, "worksize" : 8,
    "affine_to_cpu" : false, "strided_index" : 1, "mem_chunk" : 2,
    "comp_mode" : true
  },

],

/*
 * Platform index. This will be 0 unless you have different OpenCL platform - eg. AMD and Intel.
 */
"platform_index" : 2,

psychocrypt commented 6 years ago

Thanks for the tests. I located the command which hangs, but I am not sure why it does. I pushed a change to the same test branch and hope this little change helps. Could you please test again whether the miner still freezes after a fresh driver install?
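For reference, the two rules stated in the comment header of the amd.txt posted above can be worked through in a few lines. This is a minimal sketch based only on that comment text; the function names are hypothetical and nothing here is taken from the xmr-stak source. Note that, per the same comment, `mem_chunk` is only used when `strided_index` is 2, while the posted configuration uses 1.

```python
def chunk_bytes(mem_chunk, element_bytes=16):
    """Chunk size in bytes: 2**mem_chunk elements of 16 bytes each."""
    if not 0 <= mem_chunk <= 18:
        raise ValueError("mem_chunk must be in the range 0 to 18")
    return (2 ** mem_chunk) * element_bytes

def needs_comp_mode(intensity, worksize):
    """True when intensity is not a multiple of worksize."""
    return intensity % worksize != 0

if __name__ == "__main__":
    print(chunk_bytes(4))            # 256, the example from the config comment
    print(chunk_bytes(2))            # 64, for "mem_chunk" : 2
    print(needs_comp_mode(2016, 8))  # False, since 2016 = 252 * 8
```

With intensity 2016 and worksize 8, the intensity is already a multiple of the worksize, so the guard enabled by `comp_mode` is not strictly needed for these values.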

jam3972 commented 6 years ago

Sure, same download link, right? Looks like the same problem; no difference that I can notice.

Run 1

xmr-stak 2.2.0 94853cc0

Brought to you by fireice_uk and psychocrypt under GPLv3.
Based on CPU mining code by wolf9466 (heavily optimized by fireice_uk).
Based on NVIDIA mining code by KlausT and psychocrypt.
Based on OpenCL mining code by wolf9466.

Configurable dev donation level is set to 2.0%

You can use following keys to display reports:
'h' - hashrate
'r' - results
'c' - connection
-------------------------------------------------------------------
[2018-03-04 09:43:27] : Start mining: MONERO
[2018-03-04 09:43:27] : Compiling code and initializing GPUs. This will take a while...
[2018-03-04 09:43:27] : Yes we can 1
[2018-03-04 09:43:27] : Yes we can 2
[2018-03-04 09:43:27] : Yes we can 3
[2018-03-04 09:43:28] : Yes we can 4
[2018-03-04 09:43:28] : Yes we can 5
[2018-03-04 09:43:28] : Yes we can 6
[2018-03-04 09:43:28] : Yes we can 7
[2018-03-04 09:43:28] : Yes we can 8
[2018-03-04 09:43:28] : Yes we can 9
[2018-03-04 09:43:28] : Yes we can 9
[2018-03-04 09:43:28] : Yes we can 10
[2018-03-04 09:43:28] : Yes we can 11
[2018-03-04 09:43:28] : Yes we can 12
[2018-03-04 09:43:28] : Yes we can 13
[2018-03-04 09:43:28] : Yes we can 14
[2018-03-04 09:43:28] : Yes we can 16
[2018-03-04 09:43:28] : Yes we can 17
[2018-03-04 09:43:28] : Device 0 work size 8 / 32.

Run 2

[2018-03-04 09:45:58] : Start mining: MONERO
[2018-03-04 09:45:58] : Compiling code and initializing GPUs. This will take a while...
[2018-03-04 09:45:58] : Yes we can 1
[2018-03-04 09:45:58] : Yes we can 2
[2018-03-04 09:45:58] : Yes we can 3
[2018-03-04 09:45:59] : Yes we can 4
[2018-03-04 09:45:59] : Yes we can 5
[2018-03-04 09:45:59] : Yes we can 6
[2018-03-04 09:45:59] : Yes we can 7
[2018-03-04 09:45:59] : Yes we can 8
[2018-03-04 09:45:59] : Yes we can 9
[2018-03-04 09:45:59] : Yes we can 9
[2018-03-04 09:45:59] : Yes we can 10
[2018-03-04 09:45:59] : Yes we can 11
[2018-03-04 09:45:59] : Yes we can 12
[2018-03-04 09:45:59] : Yes we can 13
[2018-03-04 09:45:59] : Yes we can 14
[2018-03-04 09:45:59] : Yes we can 16
[2018-03-04 09:45:59] : Yes we can 17
[2018-03-04 09:45:59] : Device 0 work size 8 / 32.
[2018-03-04 09:45:59] : Yes we can 18
[2018-03-04 09:45:59] : Yes we can 18
[2018-03-04 09:45:59] : Yes we can 19
[2018-03-04 09:45:59] : Yes we can 20
[2018-03-04 09:45:59] : Yes we can 21
[2018-03-04 09:45:59] : Yes we can 22
[2018-03-04 09:45:59] : Yes we can 23
[2018-03-04 09:45:59] : Yes we can 24
[2018-03-04 09:45:59] : Yes we can 25
[2018-03-04 09:45:59] : Yes we can 26
[2018-03-04 09:45:59] : Yes we can 27
[2018-03-04 09:46:12] : Yes we can 28
[2018-03-04 09:46:12] : Yes we can 29
[2018-03-04 09:46:12] : Yes we can 30
[2018-03-04 09:46:12] : Yes we can 31
[2018-03-04 09:46:13] : Yes we can 32
[2018-03-04 09:46:13] : Yes we can 33
[2018-03-04 09:46:13] : Yes we can 34
[2018-03-04 09:46:13] : Yes we can 33
[2018-03-04 09:46:13] : Yes we can 34
[2018-03-04 09:46:13] : Yes we can 33
[2018-03-04 09:46:13] : Yes we can 34
[2018-03-04 09:46:13] : Yes we can 33
[2018-03-04 09:46:13] : Yes we can 34
[2018-03-04 09:46:13] : Yes we can 33
[2018-03-04 09:46:13] : Yes we can 34
[2018-03-04 09:46:13] : Yes we can 33
[2018-03-04 09:46:13] : Yes we can 34
[2018-03-04 09:46:13] : Yes we can 33
[2018-03-04 09:46:13] : Yes we can 34
[2018-03-04 09:46:13] : Yes we can 35
[2018-03-04 09:46:13] : Yes we can 15
[2018-03-04 09:46:13] : Yes we can 14
[2018-03-04 09:46:13] : Yes we can 16
[2018-03-04 09:46:13] : Yes we can 17
[2018-03-04 09:46:13] : Device 0 work size 8 / 32.
[2018-03-04 09:46:13] : Yes we can 18
[2018-03-04 09:46:13] : Yes we can 18
[2018-03-04 09:46:13] : Yes we can 19
[2018-03-04 09:46:13] : Yes we can 20
[2018-03-04 09:46:13] : Yes we can 21
[2018-03-04 09:46:13] : Yes we can 22
[2018-03-04 09:46:13] : Yes we can 23
[2018-03-04 09:46:13] : Yes we can 24
[2018-03-04 09:46:13] : Yes we can 25
[2018-03-04 09:46:13] : Yes we can 26
[2018-03-04 09:46:13] : Yes we can 27
[2018-03-04 09:46:25] : Yes we can 28
[2018-03-04 09:46:25] : Yes we can 29
[2018-03-04 09:46:25] : Yes we can 30
[2018-03-04 09:46:25] : Yes we can 31
[2018-03-04 09:46:26] : Yes we can 32
[2018-03-04 09:46:26] : Yes we can 33
[2018-03-04 09:46:26] : Yes we can 34
[2018-03-04 09:46:26] : Yes we can 33
[2018-03-04 09:46:26] : Yes we can 34
[2018-03-04 09:46:26] : Yes we can 33
[2018-03-04 09:46:26] : Yes we can 34
[2018-03-04 09:46:26] : Yes we can 33
[2018-03-04 09:46:26] : Yes we can 34
[2018-03-04 09:46:26] : Yes we can 33
[2018-03-04 09:46:26] : Yes we can 34
[2018-03-04 09:46:26] : Yes we can 33
[2018-03-04 09:46:26] : Yes we can 34
[2018-03-04 09:46:26] : Yes we can 33
[2018-03-04 09:46:26] : Yes we can 34
[2018-03-04 09:46:26] : Yes we can 35
[2018-03-04 09:46:26] : Yes we can 15
[2018-03-04 09:46:26] : Yes we can 14
[2018-03-04 09:46:26] : Yes we can 16
[2018-03-04 09:46:26] : Yes we can 17
[2018-03-04 09:46:27] : Device 1 work size 8 / 32.

Run 3

[2018-03-04 09:47:29] : Start mining: MONERO
[2018-03-04 09:47:29] : Compiling code and initializing GPUs. This will take a while...
[2018-03-04 09:47:29] : Yes we can 1
[2018-03-04 09:47:29] : Yes we can 2
[2018-03-04 09:47:29] : Yes we can 3
[2018-03-04 09:47:29] : Yes we can 4
[2018-03-04 09:47:29] : Yes we can 5
[2018-03-04 09:47:29] : Yes we can 6
[2018-03-04 09:47:29] : Yes we can 7
[2018-03-04 09:47:29] : Yes we can 8
[2018-03-04 09:47:29] : Yes we can 9
[2018-03-04 09:47:29] : Yes we can 9
[2018-03-04 09:47:29] : Yes we can 10
[2018-03-04 09:47:29] : Yes we can 11
[2018-03-04 09:47:29] : Yes we can 12
[2018-03-04 09:47:29] : Yes we can 13
[2018-03-04 09:47:29] : Yes we can 14
[2018-03-04 09:47:29] : Yes we can 16
[2018-03-04 09:47:29] : Yes we can 17
[2018-03-04 09:47:29] : Device 0 work size 8 / 32.
[2018-03-04 09:47:29] : Yes we can 18
[2018-03-04 09:47:29] : Yes we can 18
[2018-03-04 09:47:29] : Yes we can 19
[2018-03-04 09:47:29] : Yes we can 20
[2018-03-04 09:47:29] : Yes we can 21
[2018-03-04 09:47:29] : Yes we can 22
[2018-03-04 09:47:29] : Yes we can 23
[2018-03-04 09:47:29] : Yes we can 24
[2018-03-04 09:47:29] : Yes we can 25
[2018-03-04 09:47:29] : Yes we can 26
[2018-03-04 09:47:29] : Yes we can 27
[2018-03-04 09:47:42] : Yes we can 28
[2018-03-04 09:47:42] : Yes we can 29
[2018-03-04 09:47:42] : Yes we can 30
[2018-03-04 09:47:42] : Yes we can 31
[2018-03-04 09:47:43] : Yes we can 32
[2018-03-04 09:47:43] : Yes we can 33
[2018-03-04 09:47:43] : Yes we can 34
[2018-03-04 09:47:43] : Yes we can 33
[2018-03-04 09:47:43] : Yes we can 34
[2018-03-04 09:47:43] : Yes we can 33
[2018-03-04 09:47:43] : Yes we can 34
[2018-03-04 09:47:43] : Yes we can 33
[2018-03-04 09:47:43] : Yes we can 34
[2018-03-04 09:47:43] : Yes we can 33
[2018-03-04 09:47:43] : Yes we can 34
[2018-03-04 09:47:43] : Yes we can 33
[2018-03-04 09:47:43] : Yes we can 34
[2018-03-04 09:47:43] : Yes we can 33
[2018-03-04 09:47:43] : Yes we can 34
[2018-03-04 09:47:43] : Yes we can 35
[2018-03-04 09:47:43] : Yes we can 15
[2018-03-04 09:47:43] : Yes we can 14
[2018-03-04 09:47:43] : Yes we can 16
[2018-03-04 09:47:43] : Yes we can 17
[2018-03-04 09:47:43] : Device 0 work size 8 / 32.
[2018-03-04 09:47:43] : Yes we can 18
[2018-03-04 09:47:43] : Yes we can 18
[2018-03-04 09:47:43] : Yes we can 19
[2018-03-04 09:47:43] : Yes we can 20
[2018-03-04 09:47:43] : Yes we can 21
[2018-03-04 09:47:43] : Yes we can 22
[2018-03-04 09:47:43] : Yes we can 23
[2018-03-04 09:47:43] : Yes we can 24
[2018-03-04 09:47:43] : Yes we can 25
[2018-03-04 09:47:43] : Yes we can 26
[2018-03-04 09:47:43] : Yes we can 27
[2018-03-04 09:47:56] : Yes we can 28
[2018-03-04 09:47:56] : Yes we can 29
[2018-03-04 09:47:56] : Yes we can 30
[2018-03-04 09:47:56] : Yes we can 31
[2018-03-04 09:47:57] : Yes we can 32
[2018-03-04 09:47:57] : Yes we can 33
[2018-03-04 09:47:57] : Yes we can 34
[2018-03-04 09:47:57] : Yes we can 33
[2018-03-04 09:47:57] : Yes we can 34
[2018-03-04 09:47:57] : Yes we can 33
[2018-03-04 09:47:57] : Yes we can 34
[2018-03-04 09:47:57] : Yes we can 33
[2018-03-04 09:47:57] : Yes we can 34
[2018-03-04 09:47:57] : Yes we can 33
[2018-03-04 09:47:57] : Yes we can 34
[2018-03-04 09:47:57] : Yes we can 33
[2018-03-04 09:47:57] : Yes we can 34
[2018-03-04 09:47:57] : Yes we can 33
[2018-03-04 09:47:57] : Yes we can 34
[2018-03-04 09:47:57] : Yes we can 35
[2018-03-04 09:47:57] : Yes we can 15
[2018-03-04 09:47:57] : Yes we can 14
[2018-03-04 09:47:57] : Yes we can 16
[2018-03-04 09:47:57] : Yes we can 17
[2018-03-04 09:47:57] : Device 1 work size 8 / 32.
[2018-03-04 09:47:57] : Yes we can 18
[2018-03-04 09:47:57] : Yes we can 18
[2018-03-04 09:47:57] : Yes we can 19
[2018-03-04 09:47:57] : Yes we can 20
[2018-03-04 09:47:57] : Yes we can 21
[2018-03-04 09:47:57] : Yes we can 22
[2018-03-04 09:47:57] : Yes we can 23
[2018-03-04 09:47:57] : Yes we can 24
[2018-03-04 09:47:57] : Yes we can 25
[2018-03-04 09:47:57] : Yes we can 26
[2018-03-04 09:47:57] : Yes we can 27
[2018-03-04 09:48:10] : Yes we can 28
[2018-03-04 09:48:10] : Yes we can 29
[2018-03-04 09:48:10] : Yes we can 30
[2018-03-04 09:48:10] : Yes we can 31
[2018-03-04 09:48:11] : Yes we can 32
[2018-03-04 09:48:11] : Yes we can 33
[2018-03-04 09:48:11] : Yes we can 34
[2018-03-04 09:48:11] : Yes we can 33
[2018-03-04 09:48:11] : Yes we can 34
[2018-03-04 09:48:11] : Yes we can 33
[2018-03-04 09:48:11] : Yes we can 34
[2018-03-04 09:48:11] : Yes we can 33
[2018-03-04 09:48:11] : Yes we can 34
[2018-03-04 09:48:11] : Yes we can 33
[2018-03-04 09:48:11] : Yes we can 34
[2018-03-04 09:48:11] : Yes we can 33
[2018-03-04 09:48:11] : Yes we can 34
[2018-03-04 09:48:11] : Yes we can 33
[2018-03-04 09:48:11] : Yes we can 34
[2018-03-04 09:48:11] : Yes we can 35
[2018-03-04 09:48:11] : Yes we can 15
[2018-03-04 09:48:11] : Yes we can 14
[2018-03-04 09:48:11] : Yes we can 16
[2018-03-04 09:48:11] : Yes we can 17
[2018-03-04 09:48:11] : Device 1 work size 8 / 32.
[2018-03-04 09:48:11] : Yes we can 18
[2018-03-04 09:48:11] : Yes we can 18
[2018-03-04 09:48:11] : Yes we can 19
[2018-03-04 09:48:11] : Yes we can 20
[2018-03-04 09:48:11] : Yes we can 21
[2018-03-04 09:48:11] : Yes we can 22
[2018-03-04 09:48:11] : Yes we can 23
[2018-03-04 09:48:11] : Yes we can 24
[2018-03-04 09:48:11] : Yes we can 25
[2018-03-04 09:48:11] : Yes we can 26
[2018-03-04 09:48:11] : Yes we can 27
[2018-03-04 09:48:24] : Yes we can 28
[2018-03-04 09:48:24] : Yes we can 29
[2018-03-04 09:48:24] : Yes we can 30
[2018-03-04 09:48:24] : Yes we can 31
[2018-03-04 09:48:25] : Yes we can 32
[2018-03-04 09:48:25] : Yes we can 33
[2018-03-04 09:48:25] : Yes we can 34
[2018-03-04 09:48:25] : Yes we can 33
[2018-03-04 09:48:25] : Yes we can 34
[2018-03-04 09:48:25] : Yes we can 33
[2018-03-04 09:48:25] : Yes we can 34
[2018-03-04 09:48:25] : Yes we can 33
[2018-03-04 09:48:25] : Yes we can 34
[2018-03-04 09:48:25] : Yes we can 33
[2018-03-04 09:48:25] : Yes we can 34
[2018-03-04 09:48:25] : Yes we can 33
[2018-03-04 09:48:25] : Yes we can 34
[2018-03-04 09:48:25] : Yes we can 33
[2018-03-04 09:48:25] : Yes we can 34
[2018-03-04 09:48:25] : Yes we can 35
[2018-03-04 09:48:25] : Yes we can 15
[2018-03-04 09:48:25] : Starting AMD GPU thread 0, no affinity.
[2018-03-04 09:48:25] : Starting AMD GPU thread 1, no affinity.
[2018-03-04 09:48:25] : Starting AMD GPU thread 2, no affinity.
[2018-03-04 09:48:25] : Starting AMD GPU thread 3, no affinity.

nuigru commented 6 years ago

I confirm the issue. I tried to set up a rig with 8 FE cards, and this is very annoying because I have to start and kill the miner 8 times.

psychocrypt commented 6 years ago

I located the broken OpenCL command, but it looks like a real driver issue that cannot be worked around. One of the API calls hangs and never returns. I checked the OpenCL documentation many times to see whether we use the command incorrectly, but this is not the case.
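The traces posted earlier in this thread can be read mechanically: in every stalled run the last line printed is `Device N work size 8 / 32.`, and in the successful runs the longest pause falls between the markers 27 and 28. A minimal sketch for extracting both facts, assuming only the `[timestamp] : message` log format visible above (the function names are hypothetical):

```python
import re
from datetime import datetime

# Matches lines such as "[2018-03-04 02:06:15] : Yes we can 17".
LINE = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] : (.*)$")

def parse(log_text):
    """Return a list of (timestamp, message) for lines in the miner's log format."""
    entries = []
    for line in log_text.splitlines():
        m = LINE.match(line.strip())
        if m:
            ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
            entries.append((ts, m.group(2)))
    return entries

def last_message(entries):
    """Message of the final log line, i.e. the last step reached before a stall."""
    return entries[-1][1] if entries else None

def slowest_step(entries):
    """Return (seconds, message_before, message_after) for the largest time gap."""
    best = None
    for (t0, m0), (t1, m1) in zip(entries, entries[1:]):
        gap = (t1 - t0).total_seconds()
        if best is None or gap > best[0]:
            best = (gap, m0, m1)
    return best
```

Applied to the first stalled run, `last_message` gives `Device 0 work size 8 / 32.`; applied to the three lines around marker 28 in the successful run starting at 02:10:34, `slowest_step` reports a 13-second gap between `Yes we can 27` and `Yes we can 28`.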

pulsarcmc commented 6 years ago

@psychocrypt Which API is causing the hang?

psychocrypt commented 6 years ago

This command https://github.com/psychocrypt/xmr-stak/blob/94853cc0390ac590a4a49d73415dfa807663aa13/xmrstak/backend/amd/amd_gpu/gpu.cpp#L242

nuigru commented 6 years ago

Could you add a watchdog that restarts the miner automatically, or kills the API call and executes it a second time? At least when an FE is detected. This would be much better than having to restart the miner N times. Moreover, I noticed that if the GPU is reset, the miner gets stuck again at the next execution, so this is difficult to manage with an external script.

Thanks
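An external watchdog of the kind requested here could look roughly like the sketch below. It is not part of xmr-stak: it starts a command, waits for a "ready" line in its output, and kills and restarts the process if that line does not appear in time. The helper names, the timeout and the attempt count are assumptions; the marker text `Starting AMD GPU thread` is taken from the successful runs in this thread. The sketch also assumes the miner flushes its log lines when its output is a pipe, and that a stalled miner can be killed from outside, as the manual closes described above suggest.

```python
import subprocess
import sys
import threading

def _pump(stream, ready_marker, ready):
    """Echo the child's output and flag the moment the ready marker appears."""
    for line in stream:
        sys.stdout.write(line)
        if ready_marker in line:
            ready.set()

def start_until_ready(cmd, ready_marker, timeout_s, max_attempts):
    """Start cmd; if ready_marker is not printed within timeout_s, kill and retry.

    Returns the running process once the marker is seen, or None if every
    attempt stalled or exited without printing it.
    """
    for attempt in range(1, max_attempts + 1):
        proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT, text=True)
        ready = threading.Event()
        threading.Thread(target=_pump, args=(proc.stdout, ready_marker, ready),
                         daemon=True).start()
        if ready.wait(timeout_s):
            return proc  # initialization finished; output keeps being echoed
        proc.kill()
        proc.wait()
        print(f"attempt {attempt}: no '{ready_marker}' within {timeout_s}s, restarting")
    return None
```

A call such as `start_until_ready(["xmr-stak.exe"], "Starting AMD GPU thread", timeout_s=300, max_attempts=9)` would cover one stall per card on an 8-card rig plus the final successful start; the executable name and the 300-second timeout are placeholders and would need to be longer than a normal initialization on the rig in question.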

jam3972 commented 6 years ago

Hi nuigru, I just want to clarify: in my experience, after I successfully clear the stall issue by running N times, I can reset my card via Device Manager disable/enable without bringing the stall back. The stall only comes back when I re-install the drivers for the card (which I am forced to do after a reboot, thanks AMD). Are you saying you get a new stall even after disable/enable, or only after re-installing the driver?

nuigru commented 6 years ago

Yes, I confirm this. But having to kill xmr-stak 8 times is really bad. Sometimes the computer crashes during this and I have to redo the annoying driver reinstall twice. With 8 FE cards, the blockchain driver bug that prevents disabling and enabling the cards is really annoying. But without the blockchain driver and the disable/re-enable trick, you can get a hash rate lower than a normal 56... really bad. Is it possible that there is no other solution to this?

jam3972 commented 6 years ago

FYI, when using the newest Adrenalin driver, 18.3.4, this issue no longer happens, and you can still maintain the same hash rates through reboots (though you have to use soft PowerPlay tables instead of OverdriveNtool). So this 100% driver issue has been resolved.