fireice-uk / xmr-stak-amd

Monero AMD miner
GNU General Public License v3.0

Miner hangs in current dev branch on multiple cards #80

Open lhilton opened 7 years ago

lhilton commented 7 years ago

ASRock H81 Pro BTC R2.0 with Intel Celeron G1840 and PCIe risers

Fresh install of Ubuntu 16.04.3, followed by an update and dist-upgrade. I installed the AMD 17.30 drivers, then followed the build instructions. I am using the dev branch. It mines on a single card but not on multiple cards; it hangs with this as the only output:

leezilla@miner-01:~/src/xmr-stak-amd-dev-20170825/bin$ ./mine.sh 
[2017-08-25 20:57:40] : Compiling code and initializing GPUs. This will take a while...
[2017-08-25 20:57:40] : Device 0 work size 10 / 256.
[2017-08-25 20:57:45] : Device 1 work size 10 / 256.

No errors in dmesg. I cannot CTRL-C out. I cannot kill the PID. Shutdown/reboot hangs while waiting the obligatory 90 seconds for the thing to die (it never does) before forcing it down.

I have tried matching and non-matching cards of the following (up to 6x of matching for each):

R9 270, R9 290, R9 370, R9 380

My current test, the one this specific copy/paste of output and config comes from, uses slots 0 and 1 of the motherboard with an R9 370 and an R9 380 (in that order).

My config.txt is as follows:

"gpu_thread_num" : 2,

"gpu_threads_conf" : [
  { "index" : 0, "intensity" : 1000, "worksize" : 10, "affine_to_cpu" : false },
  { "index" : 1, "intensity" : 1000, "worksize" : 10, "affine_to_cpu" : false }
],

"platform_index" : 0,

"use_tls" : false,
"tls_secure_algo" : true,
"tls_fingerprint" : "",

"pool_address" : "monerohash.com:3333",
"wallet_address" : "4BFL692QZ3Wbwe5JKKKGc1KAH7MpKytKi3x3PKsZXm8hJ4azfT78zUbZq5mGVSK8XUdABkA5JzQRYaxeru5A6wneDx1XbQo",
"pool_password" : "x",

"call_timeout" : 10,
"retry_time" : 10,
"giveup_limit" : 0,

"verbose_level" : 4,

"h_print_time" : 60,

"daemon_mode" : true,

"output_file" : "/home/leezilla/logs/xmr-stak-amd.log",

"httpd_port" : 16001,

"prefer_ipv4" : true,
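One simple thing to rule out with a config like this is a mismatch between `gpu_thread_num` and the number of entries in `gpu_threads_conf`. The following is a minimal Python sketch, assuming the file is a brace-less JSON body with a trailing comma and no comments, as pasted here. The helper names `load_config_text` and `check_gpu_threads` are made up for illustration and are not part of xmr-stak-amd.

```python
import json

def load_config_text(text):
    """Parse a brace-less config body (as pasted in this thread) into a dict."""
    # Assumption: no comments in the file, and the last entry ends with a comma.
    body = text.strip().rstrip(",")
    return json.loads("{" + body + "}")

def check_gpu_threads(cfg):
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    threads = cfg.get("gpu_threads_conf", [])
    if cfg.get("gpu_thread_num") != len(threads):
        problems.append(
            "gpu_thread_num is %r but gpu_threads_conf has %d entries"
            % (cfg.get("gpu_thread_num"), len(threads))
        )
    for pos, thread in enumerate(threads):
        for key in ("index", "intensity", "worksize"):
            if key not in thread:
                problems.append("entry %d is missing %r" % (pos, key))
    return problems

sample = '''
"gpu_thread_num" : 2,
"gpu_threads_conf" : [
  { "index" : 0, "intensity" : 1000, "worksize" : 10, "affine_to_cpu" : false },
  { "index" : 1, "intensity" : 1000, "worksize" : 10, "affine_to_cpu" : false }
],
"platform_index" : 0,
'''
```

Calling `check_gpu_threads(load_config_text(sample))` on the GPU section above returns an empty list, so the thread count and the per-thread keys are consistent in this config.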

And I am kicking this all off with mine.sh as follows:

#!/bin/bash
export GPU_FORCE_64BIT_PTR=1
export GPU_USE_SYNC_OBJECTS=1
export GPU_MAX_ALLOC_PERCENT=100
export GPU_SINGLE_ALLOC_PERCENT=100
export GPU_MAX_HEAP_SIZE=100
./xmr-stak-amd

I am not sure how to diagnose further. Other miners have worked with multiple cards on this machine.
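One possible next step: a process that survives `kill -9` and blocks shutdown is often sitting in uninterruptible sleep (state `D`), typically waiting inside a kernel driver call. That is an assumption about this hang, not something the output above confirms. The Python sketch below reads the state letter from `/proc/<pid>/stat` on Linux; the function names `parse_proc_stat` and `read_state` are hypothetical.

```python
def parse_proc_stat(stat_line):
    """Return (comm, state) from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the last ')' instead of on whitespace.
    left, right = stat_line.rsplit(")", 1)
    comm = left.split("(", 1)[1]
    state = right.split()[0]
    return comm, state

def read_state(pid):
    """Read and parse /proc/<pid>/stat for a running process (Linux only)."""
    with open("/proc/%d/stat" % pid) as f:
        return parse_proc_stat(f.read())
```

If `read_state(pid)` reports `D` for the hung miner, the contents of `/proc/<pid>/wchan` may hint at which kernel function it is blocked in, which would be useful to include in a report against the GPU driver.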

psychocrypt commented 7 years ago

Please reduce the intensity to 756 and the worksize to 8. Does this solve the issue?

lhilton commented 7 years ago

Same issue, no change in the results or messages. For giggles I also tried intensity 500 with worksize 8 and 4, and intensity 256 with worksize 8 and 4; same results there.

fractalyse commented 7 years ago

Hi, same issue here, but under Gentoo. I can't get it working, even if I use only one card. The software gets stuck on device 0.

i7 7820X, 2x RX 580, no risers. It works under Windows with xmr-stak-amd.

Nathilion commented 7 years ago

I have recently been experiencing a similar issue, both with xmr-stak-amd version 1.0.0 and with the most recent version. I am running this on a Windows 7 miner (I had AMD driver issues on Ubuntu) with an old driver (v16.12), an R9 280X and an HD 7770. I have been using this system without a snag since January this year.

When the program hangs it still reacts to Ctrl+C, so I can terminate it. My batch job then restarts the program, but every time the 280X fails to start properly (it works for less than 10 seconds and then nothing) while the 7770 generates hashes as normal. It's not until after a reboot that everything works as it should again.

As this is a very recent issue (it only started last week), I figured it had to do with something changing in Windows itself. Poring over the Windows logs, I noticed that the hangs coincided with the following event: "A request to disable the Desktop Window Manager was made by process (4)". As far as I can tell, this is usually caused by either DWM or the video driver crashing, so that a reboot is required.

On Windows 7, DWM is used (mostly) by the Aero theme for the fancy graphics. I had turned Aero off to free up mining resources back in January, so I didn't take note of the theme I was looking at when going over the logs. But guess what? The theme had been reset to the default Windows 7 theme! My guess is that it must have been one of the security patches from Windows Update. So I disabled Aero again and, on top of that, all visual effects and DWM itself.

Before realising this I had also updated my graphics driver, so if I don't experience any more freezes I won't be able to tell whether it's the driver or disabling Aero that did the trick. Oops, not very scientific of me.

Anyhow, the freezes happen frequently enough (1 or more times a day) that I should be able to give a status update soon.

Update:

It's been two days now since I disabled the DWM service and turned off all the fancy graphics and I haven't had any problems with the miner freezing. So, to all our fellow Windows miners I would recommend disabling every graphics feature the OS has.

Update 2:

It's been seven days now and the hangs have returned. The miner hangs once again after 6-12 hours of running. After this a reboot of the system is required to get everything back up and running.
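For hangs like this that only show up after hours of running, one way to notice them without watching the console is to check how recently the miner's log file was written. The Python sketch below assumes the miner appends to its `output_file` at least once per `h_print_time` seconds while it is hashing; that behaviour is an assumption, not something confirmed in this thread. The function name `is_stale` and its `missed_reports` parameter are hypothetical.

```python
import os
import time

def is_stale(log_path, h_print_time=60, missed_reports=3, now=None):
    """True if the log was not modified for more than `missed_reports` intervals."""
    # Assumption: a healthy miner touches the log at least every h_print_time seconds.
    now = time.time() if now is None else now
    age = now - os.path.getmtime(log_path)
    return age > h_print_time * missed_reports
```

A scheduled task could call `is_stale` on the log path every few minutes and raise an alert when it returns `True`. Since a full reboot was needed here to recover, the sketch only detects the hang; it does not try to restart the miner.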

ITwrx commented 6 years ago

just as another data point, i was getting a segfault every time with ubuntu 16.04 and amdgpu pro 16.40. i upgraded to ubuntu 17.04 and amdgpu pro 17.30 and xmr-stak-amd starts up fine now. for some reason opencl now requires sudo on the machine in question, so xmr-stak-amd has to be run with sudo too so it can see the opencl platform and cards.