Open liam-b opened 6 years ago
Run as root, or add the user to the video group. Also check the permissions on /dev/dri to make sure the group is video and has read/write access:
# ls -l /dev/dri/
crw-rw---- 1 root video 226, 0 Oct 22 09:51 card0
I usually give up about five levels deep and run as root (with AMD)
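The permission check described above can also be done programmatically. This is a minimal sketch using POSIX stat() and getgrgid(); the helper names groupHasReadWrite and describeNode are made up for illustration and are not part of xmr-stak.

```cpp
#include <grp.h>
#include <sys/stat.h>
#include <string>

// True when the mode bits grant the owning group both read and write access,
// i.e. the "rw" in the middle of "crw-rw----".
bool groupHasReadWrite(mode_t mode) {
    return (mode & S_IRGRP) && (mode & S_IWGRP);
}

// Describes one device node: its owning group and whether that group has rw.
// Returns an empty string if the node cannot be stat()ed (missing, or a
// parent directory is not accessible).
std::string describeNode(const std::string& path) {
    struct stat st{};
    if (stat(path.c_str(), &st) != 0) {
        return "";
    }
    const struct group* gr = getgrgid(st.st_gid);
    // Fall back to the numeric GID if the group has no name entry.
    const std::string name = gr ? gr->gr_name : std::to_string(st.st_gid);
    return path + ": group=" + name +
           (groupHasReadWrite(st.st_mode) ? " rw" : " no-rw");
}
```

Calling describeNode("/dev/dri/card0") and describeNode("/dev/dri/renderD128") should report the video group with rw on a setup like the one listed above. It only covers plain file mode bits, not ACLs (the trailing + in ls output) or SELinux/AppArmor.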
Running as root has the same results but I will check /dev/dri and user groups when I get home.
$ sudo su
$ ./xmr-stak
-------------------------------------------------------------------
xmr-stak 2.5.1 4e72408
Brought to you by fireice_uk and psychocrypt under GPLv3.
Based on CPU mining code by wolf9466 (heavily optimized by fireice_uk).
Based on OpenCL mining code by wolf9466.
Configurable dev donation level is set to 1.0%
-------------------------------------------------------------------
You can use following keys to display reports:
'h' - hashrate
'r' - results
'c' - connection
-------------------------------------------------------------------
Upcoming xmr-stak-gui is sponsored by:
##### ______ ____
## ## | ___ \ / _ \
# _ #| |_/ /_ _ ___ | / \/ _ _ _ _ _ _ ___ _ __ ___ _ _
# |_| #| /| | | | / _ \ | | | | | || '_|| '_|/ _ \| '_ \ / __|| | | |
# #| |\ \| |_| || (_) || \_/\| |_| || | | | | __/| | | || (__ | |_| |
## ## \_| \_|\__, | \___/ \____/ \__,_||_| |_| \___||_| |_| \___| \__, |
##### __/ | __/ |
|___/ https://ryo-currency.com |___/
This currency is a way for us to implement the ideas that we were unable to in
Monero. See https://github.com/fireice-uk/cryptonote-speedup-demo for details.
-------------------------------------------------------------------
[2018-10-24 09:08:42] : Mining coin: monero
[2018-10-24 09:08:42] : WARNING: UNKNOWN_ERROR when calling clGetPlatformIDs for number of platforms.
[2018-10-24 09:08:42] : WARNING: No OpenCL platform found.
[2018-10-24 09:08:42] : WARNING: No AMD OpenCL platform found. Possible driver issues or wrong vendor driver.
[2018-10-24 09:08:42] : WARNING: backend AMD (OpenCL) disabled.
[2018-10-24 09:08:42] : Starting 1x thread, affinity: 0.
[2018-10-24 09:08:42] : hwloc: memory pinned
[2018-10-24 09:08:42] : Starting 1x thread, affinity: 1.
[2018-10-24 09:08:42] : hwloc: memory pinned
[2018-10-24 09:08:42] : Starting 1x thread, affinity: 2.
[2018-10-24 09:08:42] : hwloc: memory pinned
[2018-10-24 09:08:42] : Starting 1x thread, affinity: 3.
[2018-10-24 09:08:42] : hwloc: memory pinned
[2018-10-24 09:08:42] : Starting 1x thread, affinity: 4.
[2018-10-24 09:08:42] : hwloc: memory pinned
[2018-10-24 09:08:42] : Starting 1x thread, affinity: 5.
[2018-10-24 09:08:42] : hwloc: memory pinned
[2018-10-24 09:08:42] : Fast-connecting to pool.supportxmr.com:5555 pool ...
[2018-10-24 09:08:42] : Pool pool.supportxmr.com:5555 connected. Logging in...
[2018-10-24 09:08:42] : Difficulty changed. Now: 10000.
[2018-10-24 09:08:42] : Pool logged in.
[2018-10-24 09:08:42] : Switch to assembler version for 'intel_avx' cpu's
[2018-10-24 09:08:42] : Switch to assembler version for 'intel_avx' cpu's
[2018-10-24 09:08:42] : Switch to assembler version for 'intel_avx' cpu's
[2018-10-24 09:08:42] : Switch to assembler version for 'intel_avx' cpu's
[2018-10-24 09:08:42] : Switch to assembler version for 'intel_avx' cpu's
[2018-10-24 09:08:42] : Switch to assembler version for 'intel_avx' cpu's
clinfo should work as whatever user, if the permissions are right, but it is odd that clinfo is OK and OpenCL applications don't work. So, it could be getting denied by AppArmor or one of the other 7 layers of extra permissions bureaucracy CentOS has turned on by default. That is more complex than getting the device nodes to be workable (which used to be all that could block you). I know nothing of AppArmor other than that it gets shut off or I switch to a distro without it forced upon me.
Maybe something like drm.rnodes=1, which is fairly universal since it lives in the generic DRM module (not Intel specific).
Instead of using the kernel args, you should be able to make a config file in the /etc/modprobe* neighborhood with
options drm rnodes=1
in it. Either way you will want to reboot, because otherwise you have to get the whole stack to modprobe -r, which can end up nearly impossible depending on whether the console is on it. So set it up, reboot, and see if there are nodes like
crw-rw---- 1 root video 226, 128 Oct 22 09:51 renderD128
in /dev/dri/
Result of ls -l /dev/dri/ is:
drwxr-xr-x 2 root root 80 Oct 24 20:20 ./
drwxr-xr-x 20 root root 3600 Oct 24 20:20 ../
crw-rw----+ 1 root video 226, 0 Oct 25 01:26 card0
crw-rw----+ 1 root video 226, 128 Oct 24 20:20 renderD128
and I ran sudo usermod -a -G video nbrennan. And now clinfo works without running as root! Still the same error when running xmr-stak :/
Ok, so I decided to take a different approach to this problem and spent about 2 hours installing ROCm and guess what, exactly the same error when running xmr-stak :( In fact, as a bonus, clinfo now returns:
terminate called after throwing an instance of 'cl::Error'
what(): clGetPlatformIDs
Aborted
so I'm feeling a little lost. Also, can you explain what you mean by setting drm.rnodes=1?
For my AMD GPUs I always start xmr-stak via a start script, may be of help?
#!/bin/bash
export GPU_FORCE_64BIT_PTR=1
export GPU_USE_SYNC_OBJECTS=1
export GPU_MAX_ALLOC_PERCENT=100
export GPU_SINGLE_ALLOC_PERCENT=100
export GPU_MAX_HEAP_SIZE=100
./xmr-stak
Which ROCm version did you install? You need ROCm 1.9.
The rnodes module option is probably not needed as you already have the render node in /dev/dri
If you were not running Xorg then it may not have enabled the render (just the card entry) which I think breaks the shared memory usage
Still voting on some CentOS specific added protection layer, protecting you from yourself (video device access permissions besides the filesystem-level permissions to the dev nodes - apparmor probably - you can/have-to make an apparmor profile for the app with some of the apparmor profile tools to tell apparmor subsystem it's ok to not block it from the GPU)
Basically the same annoying junk Windows does except no popup and no simple "allow button"
AppArmor is the only thing that can also block root, is why I stand so hard on some of that junk (there are some other things besides apparmor, in the way, on CentOS by default, but I don't remember them all I always just go back to Debian where they haven't gone paranoid and I can turn ON the things I want, not wonder/research how many things I might have to turn OFF...)
@psychocrypt I just followed the instructions on https://rocm.github.io/ROCmInstall.html, so I'm pretty sure it is 1.9. Should I try different versions of ROCm? Or maybe different amd drivers? Or is that not likely to help.
@Spudz76 would I need to do something like this to disable AppArmor? Also could SELinux be breaking things as well and should I disable it?
That's on the right track, but Plesk is Debian based so they suggest commands with dpkg and such (you use yum... package names are different... etc)
This is more RedHat centric and covers both SELinux and AppArmor - SELinux was the other layer I couldn't remember the name of (probably from PTSD... dealing with it before). You can generally use any tutorials for RHEL, CentOS, Fedora they are all RedHat based and similar.
Ok, little update. I've uninstalled ROCm because I don't think it was helping and it broke clinfo. Now I'm back on the AMD drivers (amdgpu-pro-18.40-676022-rhel-7.4 specifically) and clinfo is working without root again, which is nice. Still no luck with xmr-stak, same old error. getenforce returns Disabled and /etc/selinux/config looks like:
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of three two values:
# targeted - Targeted processes are protected,
# minimum - Modification of targeted policy. Only selected processes are protected.
# mls - Multi Level Security protection.
SELINUXTYPE=targeted
(This was already like this, I didn't change anything SELinux related)
So basically no progress :(. Is it possible I could run xmr-stak in a VM/Docker container? Would that be a plausible way to get this thing working? Anyway, thanks for all the help so far @Spudz76, I really appreciate your effort.
I just noticed you had run cmake vs cmake3 at some point. That may have left your build folder in a strange state?
I generally ignore whatever CMake comes from the Linux Distro (always old) or even the devtoolset equivalents (also old even when 3.x) and grab the install script from CMake directly and install that. https://cmake.org/files/v3.12/cmake-3.12.3-Linux-x86_64.sh
wget https://cmake.org/files/v3.12/cmake-3.12.3-Linux-x86_64.sh
sh cmake-3.12.3-Linux-x86_64.sh --prefix=/usr
[answer n to install to /usr/]
cmake --version
Purge any and all cmake* from yum first
I did what you said for cmake with no luck, same error. (I ran sudo yum remove cmake and sudo yum remove cmake3 before installing)
But, what I did try was making a little cpp file to test clGetPlatformIDs:
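#include <CL/cl.h>
#include <iostream>
using namespace std;
int main() {
    cout << "hello, world" << endl;
    cl_platform_id platform = nullptr; // receives at most one platform ID
    cl_uint num_platforms = 0;         // initialized so a failed call still prints 0
    // Ask for up to one platform ID and for the total number of platforms.
    cl_int error = clGetPlatformIDs(1, &platform, &num_platforms);
    cout << error << endl;         // 0 means CL_SUCCESS
    cout << num_platforms << endl;
    return 0;
}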
What I got was 0 platforms found and the error code -1001, which if you look it up in the OpenCL errors is CL_PLATFORM_NOT_FOUND_KHR: No valid ICDs found. I tried sudo yum install ocl-icd but it didn't help, so I'm wondering what I need to get an ICD working.
Also clinfo (no sudo) is still working fine and returns:
$ clinfo
Number of platforms 1
Platform Name AMD Accelerated Parallel Processing
Platform Vendor Advanced Micro Devices, Inc.
Platform Version OpenCL 2.1 AMD-APP (2686.5)
... etc
so the driver is fine.
Perhaps it is the permissions to the ICD configs which should be in /etc/OpenCL/
Check if yours are all accessible+readable such as on this Debian Sid system.
# ls -lR /etc/OpenCL
/etc/OpenCL:
total 4
drwxr-xr-x 2 root root 4096 Sep 23 11:46 vendors
/etc/OpenCL/vendors:
total 4
-rw-r--r-- 1 root root 15 Jun 14 20:19 amdocl64.icd
# cat /etc/OpenCL/vendors/amdocl64.icd
libamdocl64.so
# updatedb && locate libamdocl64.so
/opt/amdgpu-pro/lib/x86_64-linux-gnu/libamdocl64.so
# ls -l /opt/amdgpu-pro/lib/x86_64-linux-gnu/libamdocl64.so
-rw-r--r-- 1 root root 67947032 Mar 5 2018 /opt/amdgpu-pro/lib/x86_64-linux-gnu/libamdocl64.so
All that should line up and be everyone-readable and directories everyone-executable.
Perhaps the /opt/amdgpu-pro/ tree is not everyone-open so it can't load the library file (once it makes it through all the other mapping).
Sometimes strace helps you see what library/file/permission is failing, if you are able to see through all the noise it outputs. But usually the failure is near the end.
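The "everyone-readable, directories everyone-executable" check can be sketched with C++17 std::filesystem. The function names here are hypothetical, and the sketch only looks at plain mode bits for "others", so it will not notice ACLs or SELinux/AppArmor denials.

```cpp
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// True when "others" may read a file with these permission bits.
bool othersCanRead(fs::perms p) {
    return (p & fs::perms::others_read) != fs::perms::none;
}

// True when "others" may traverse (search) a directory with these bits.
bool othersCanTraverse(fs::perms p) {
    return (p & fs::perms::others_exec) != fs::perms::none;
}

// Walks an absolute path from the root down and returns the first component
// that is missing, is a directory "others" cannot traverse, or is a final
// file "others" cannot read. Returns an empty path when nothing blocks.
fs::path firstBlockedComponent(const fs::path& target) {
    fs::path current;
    for (const fs::path& part : target) {
        current /= part;
        std::error_code ec;
        const fs::file_status st = fs::status(current, ec);
        if (ec || !fs::exists(st)) {
            return current;
        }
        const bool isFinalFile = (current == target) && !fs::is_directory(st);
        const bool open = isFinalFile ? othersCanRead(st.permissions())
                                      : othersCanTraverse(st.permissions());
        if (!open) {
            return current;
        }
    }
    return {};
}
```

Running firstBlockedComponent on /etc/OpenCL/vendors/amdocl64.icd and on the libamdocl64.so path from the listing above would point at the first directory or file an unprivileged process cannot get through, if there is one.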
Also, the error code is indeed not handled and would default out and show up as UNKNOWN_ERROR. CL_PLATFORM_NOT_FOUND_KHR (aka -1001) should probably be added, for a better error message.
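A rough sketch of what such a mapping could look like follows. It is not xmr-stak's actual error-to-string code: clErrorName is a made-up name, and the constant is defined locally (the value -1001 comes from the cl_khr_icd extension) so the example compiles without OpenCL headers.

```cpp
#include <string>

// CL_PLATFORM_NOT_FOUND_KHR as defined by the cl_khr_icd extension.
// Declared locally here so the sketch does not depend on CL/cl_ext.h.
constexpr int kPlatformNotFoundKhr = -1001;

// Maps a handful of OpenCL status codes to their symbolic names.
// Anything not listed falls through to UNKNOWN_ERROR, which is the
// behavior seen in the log for -1001 before it is added to the switch.
std::string clErrorName(int code) {
    switch (code) {
        case 0:   return "CL_SUCCESS";
        case -1:  return "CL_DEVICE_NOT_FOUND";
        case -30: return "CL_INVALID_VALUE";
        case -32: return "CL_INVALID_PLATFORM";
        case kPlatformNotFoundKhr: return "CL_PLATFORM_NOT_FOUND_KHR";
        default:  return "UNKNOWN_ERROR";
    }
}
```

With the extra case in place, the warning for a missing ICD would name CL_PLATFORM_NOT_FOUND_KHR instead of UNKNOWN_ERROR, which points straight at /etc/OpenCL/vendors rather than at the driver.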
Basic information
Type of the CPU
Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz
Type of the GPU
Gigabyte Radeon RX 580 8g
Compile issues
Which OS do you use?
CentOS 7.4.1708 and Linux version 3.10.0-693.5.2.el7.x86_64
Issue with the execution
Did you compile the miner on your own? Yes
AMD OpenCl issue
Driver version
amdgpu-pro-18.20-606296