ethereum-mining / ethminer

Ethereum miner with OpenCL, CUDA and stratum support
GNU General Public License v3.0

Nothing gets done with AMD and MESA #2234

Open hasezoey opened 3 years ago

hasezoey commented 3 years ago

Describe the bug A clear and concise description of what the bug is.

To Reproduce Steps to reproduce the behavior:

  1. Compile with provided options on provided system
  2. Start ethminer
  3. Nothing gets done

Expected behavior: ethminer should actually process and submit jobs.

Screenshots (Optional) (Note: I let it run for ~10 min after this was shown) [image: Screenshot from 2021-03-22 12-25-26]; radeontop: [image: Screenshot from 2021-03-22 12-32-30]

Environment (please complete the following information):

Additional context how i watched that nothing gets done:

PS: release 0.19 isn't an actual release here on GitHub, and any release before that just fails with error: implicit declaration of function 'amd_bitalign' is invalid in OpenCL, so I had to compile from source.

PPS: after trying to terminate the process with CTRL+C it says it's shutting down, but nothing happens there either, so I had to kill it with SIGKILL, and the VRAM doesn't get cleared.

Also, I don't know if this is related, but this is the only GPU in the system, so it is also used for display output.

DeeDeeRanged commented 3 years ago

Download the latest drivers from AMD, untar them into a directory, cd to that directory and run amdgpu-install --opencl=legacy --headless --no-dkms. This will install the amdgpu-pro OpenCL drivers, which are needed because Mesa OpenCL is not sufficient.

hasezoey commented 3 years ago

@DeeDeeRanged from what I could tell searching around (I haven't worked with OpenCL / CUDA before), as long as clinfo shows everything, OpenCL is correctly installed. And from what I could tell from this project's documentation / output, mesa / clover is enough to at least run it (even if not performant). Or is my GPU in combination with mesa / clover just not working? Which GPU would I need for mesa / clover to work (for future reference)?

hasezoey commented 3 years ago

I have now also tried amdgpu-pro-20.50-1234664-ubuntu-20.04 with the command amdgpu-install --opencl=legacy --headless --no-dkms and re-compiled ethminer, but got the same result. Packages:

$ apt list --installed "*amdgpu*"
amdgpu-core/unknown,now 20.50-1234664 all [installed,automatic]
amdgpu-pin/unknown,now 20.50-1234664 all [installed]
amdgpu-pro-core/unknown,now 20.50-1234664 all [installed,automatic]
clinfo-amdgpu-pro/unknown,now 20.50-1234664 amd64 [installed]
libdrm-amdgpu-amdgpu1/unknown,now 1:2.4.100-1234664 amd64 [installed,automatic]
libdrm-amdgpu-common/unknown,now 1.0.0-1234664 all [installed,automatic]
libdrm-amdgpu1/focal-updates,now 2.4.102-1ubuntu1~20.04.1 amd64 [installed]
libdrm-amdgpu1/focal-updates,now 2.4.102-1ubuntu1~20.04.1 i386 [installed,automatic]
libdrm2-amdgpu/unknown,now 1:2.4.100-1234664 amd64 [installed,automatic]
ocl-icd-libopencl1-amdgpu-pro/unknown,now 20.50-1234664 amd64 [installed,automatic]
opencl-orca-amdgpu-pro-icd/unknown,now 20.50-1234664 amd64 [installed]
xserver-xorg-video-amdgpu/focal,now 19.1.0-1 amd64 [installed]

clinfo:

$ clinfo
Number of platforms                               2
  Platform Name                                   Clover
  Platform Vendor                                 Mesa
  Platform Version                                OpenCL 1.1 Mesa 20.2.6
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd
  Platform Extensions function suffix             MESA

  Platform Name                                   AMD Accelerated Parallel Processing
  Platform Vendor                                 Advanced Micro Devices, Inc.
  Platform Version                                OpenCL 2.1 AMD-APP (3224.4)
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd cl_amd_event_callback cl_amd_offline_devices 
  Platform Host timer resolution                  1ns
  Platform Extensions function suffix             AMD

  Platform Name                                   Clover
Number of devices                                 1
  Device Name                                     Radeon RX Vega (VEGA10, DRM 3.35.0, 5.4.0-67-generic, LLVM 11.0.0)
  Device Vendor                                   AMD
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 1.1 Mesa 20.2.6
  Driver Version                                  20.2.6
  Device OpenCL C Version                         OpenCL C 1.1 
  Device Type                                     GPU
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Max compute units                               64
  Max clock frequency                             1630MHz
  Max work item dimensions                        3
  Max work item sizes                             256x256x256
  Max work group size                             256
  Preferred work group size multiple              64
  Preferred / native vector sizes                 
    char                                                16 / 16      
    short                                                8 / 8       
    int                                                  4 / 4       
    long                                                 2 / 2       
    half                                                 0 / 0        (n/a)
    float                                                4 / 4       
    double                                               2 / 2        (cl_khr_fp64)
  Half-precision Floating-point support           (n/a)
  Single-precision Floating-point support         (core)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Address bits                                    64, Little-Endian
  Global memory size                              8589934592 (8GiB)
  Error Correction support                        No
  Max memory allocation                           6871947673 (6.4GiB)
  Unified memory for Host and Device              No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       32768 bits (4096 bytes)
  Global Memory cache type                        None
  Image support                                   No
  Local memory type                               Local
  Local memory size                               32768 (32KiB)
  Max number of constant args                     16
  Max constant buffer size                        2147483392 (2GiB)
  Max size of kernel argument                     1024
  Queue properties                                
    Out-of-order execution                        No
    Profiling                                     Yes
  Profiling timer resolution                      0ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
  Device Extensions                               cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp64

  Platform Name                                   AMD Accelerated Parallel Processing
Number of devices                                 0

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  No platform
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   No platform
  clCreateContext(NULL, ...) [default]            No platform
  clCreateContext(NULL, ...) [other]              Success [MESA]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  Success (1)
    Platform Name                                 Clover
    Device Name                                   Radeon RX Vega (VEGA10, DRM 3.35.0, 5.4.0-67-generic, LLVM 11.0.0)
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  Success (1)
    Platform Name                                 Clover
    Device Name                                   Radeon RX Vega (VEGA10, DRM 3.35.0, 5.4.0-67-generic, LLVM 11.0.0)
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (1)
    Platform Name                                 Clover
    Device Name                                   Radeon RX Vega (VEGA10, DRM 3.35.0, 5.4.0-67-generic, LLVM 11.0.0)

PS: after uninstalling mesa-opencl-icd, ethminer does not detect any device and clinfo fails with dlerror: libMesaOpenCL.so.1: cannot open shared object file: No such file or directory and says Number of devices 0
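The clinfo output above can look healthy at first glance because two platforms are listed, but the AMD Accelerated Parallel Processing platform reports 0 devices, so only Clover is actually usable. A minimal Python sketch for spotting this, assuming clinfo prints each "Platform Name" line directly before its "Number of devices" line as it does above (the function name devices_per_platform and the shortened SAMPLE text are made up for illustration):

```python
import re


def devices_per_platform(clinfo_text):
    """Map each platform name to the device count that clinfo reports for it."""
    counts = {}
    current = None
    for line in clinfo_text.splitlines():
        # Matches both "Platform Name   X" and "Platform Name:   X" layouts.
        m = re.match(r"\s*Platform Name:?\s+(.*\S)", line)
        if m:
            current = m.group(1)
            continue
        m = re.match(r"\s*Number of devices:?\s+(\d+)", line)
        if m and current is not None:
            counts[current] = int(m.group(1))
    return counts


# Shortened excerpt of the output posted above (illustrative only).
SAMPLE = """\
Number of platforms                               2
  Platform Name                                   Clover
  Platform Vendor                                 Mesa

  Platform Name                                   AMD Accelerated Parallel Processing
  Platform Vendor                                 Advanced Micro Devices, Inc.

  Platform Name                                   Clover
Number of devices                                 1
  Device Name                                     Radeon RX Vega (VEGA10, DRM 3.35.0, 5.4.0-67-generic, LLVM 11.0.0)

  Platform Name                                   AMD Accelerated Parallel Processing
Number of devices                                 0
"""

for platform, count in devices_per_platform(SAMPLE).items():
    print(f"{platform}: {count} device(s)")
```

On a real system the text could come from running clinfo via subprocess.run(["clinfo"], capture_output=True, text=True).stdout instead of the hard-coded excerpt.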

hasezoey commented 3 years ago

I could now successfully install rocm (but not any of the amdgpu-pro variants), and /opt/rocm/opencl/bin/clinfo gives me this output for devices:

$ sudo /opt/rocm/opencl/bin/clinfo
[sudo] password for hasezoey:           
Number of platforms:                 2
  Platform Profile:              FULL_PROFILE
  Platform Version:              OpenCL 1.1 Mesa 20.2.6
  Platform Name:                 Clover
  Platform Vendor:               Mesa
  Platform Extensions:               cl_khr_icd
  Platform Profile:              FULL_PROFILE
  Platform Version:              OpenCL 2.0 AMD-APP (3241.0)
  Platform Name:                 AMD Accelerated Parallel Processing
  Platform Vendor:               Advanced Micro Devices, Inc.
  Platform Extensions:               cl_khr_icd cl_amd_event_callback 

  Platform Name:                 Clover
Number of devices:               1
  Device Type:                   CL_DEVICE_TYPE_GPU
  Vendor ID:                     1002h
  Max compute units:                 64
  Max work items dimensions:             3
    Max work items[0]:               256
    Max work items[1]:               256
    Max work items[2]:               256
  Max work group size:               256
  Preferred vector width char:           16
  Preferred vector width short:          8
  Preferred vector width int:            4
  Preferred vector width long:           2
  Preferred vector width float:          4
  Preferred vector width double:         2
  Native vector width char:          16
  Native vector width short:             8
  Native vector width int:           4
  Native vector width long:          2
  Native vector width float:             4
  Native vector width double:            2
  Max clock frequency:               1630Mhz
  Address bits:                  64
  Max memory allocation:             6871947673
  Image support:                 No
  Max size of kernel argument:           1024
  Alignment (bits) of base address:      32768
  Minimum alignment (bytes) for any datatype:    128
  Single precision floating point capability
    Denorms:                     No
    Quiet NaNs:                  Yes
    Round to nearest even:           Yes
    Round to zero:               No
    Round to +ve and infinity:           No
    IEEE754-2008 fused multiply-add:         No
  Cache type:                    None
  Cache line size:               0
  Cache size:                    0
  Global memory size:                8589934592
  Constant buffer size:              2147483392
  Max number of constant args:           16
  Local memory type:                 Scratchpad
  Local memory size:                 32768
  Kernel Preferred work group size multiple:     64
  Error correction support:          0
  Unified memory for Host and Device:        0
  Profiling timer resolution:            0
  Device endianess:              Little
  Available:                     Yes
  Compiler available:                Yes
  Execution capabilities:                
    Execute OpenCL kernels:          Yes
    Execute native function:             No
  Queue on Host properties:              
    Out-of-Order:                No
    Profiling :                  Yes
  Platform ID:                   0x7fcf8a786b60
  Name:                      Radeon RX Vega (VEGA10, DRM 3.35.0, 5.4.0-70-generic, LLVM 11.0.0)
  Vendor:                    AMD
  Device OpenCL C version:           OpenCL C 1.1 
  Driver version:                20.2.6
  Profile:                   FULL_PROFILE
  Version:                   OpenCL 1.1 Mesa 20.2.6
  Extensions:                    cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp64

  Platform Name:                 AMD Accelerated Parallel Processing
Number of devices:               1
  Device Type:                   CL_DEVICE_TYPE_GPU
  Vendor ID:                     1002h
  Board name:                    Vega 10 XL/XT [Radeon RX Vega 56/64]
  Device Topology:               PCI[ B#3, D#0, F#0 ]
  Max compute units:                 64
  Max work items dimensions:             3
    Max work items[0]:               1024
    Max work items[1]:               1024
    Max work items[2]:               1024
  Max work group size:               256
  Preferred vector width char:           4
  Preferred vector width short:          2
  Preferred vector width int:            1
  Preferred vector width long:           1
  Preferred vector width float:          1
  Preferred vector width double:         1
  Native vector width char:          4
  Native vector width short:             2
  Native vector width int:           1
  Native vector width long:          1
  Native vector width float:             1
  Native vector width double:            1
  Max clock frequency:               1630Mhz
  Address bits:                  64
  Max memory allocation:             7287183768
  Image support:                 Yes
  Max number of images read arguments:       128
  Max number of images write arguments:      8
  Max image 2D width:                16384
  Max image 2D height:               16384
  Max image 3D width:                16384
  Max image 3D height:               16384
  Max image 3D depth:                8192
  Max samplers within kernel:            26751
  Max size of kernel argument:           1024
  Alignment (bits) of base address:      1024
  Minimum alignment (bytes) for any datatype:    128
  Single precision floating point capability
    Denorms:                     Yes
    Quiet NaNs:                  Yes
    Round to nearest even:           Yes
    Round to zero:               Yes
    Round to +ve and infinity:           Yes
    IEEE754-2008 fused multiply-add:         Yes
  Cache type:                    Read/Write
  Cache line size:               64
  Cache size:                    16384
  Global memory size:                8573157376
  Constant buffer size:              7287183768
  Max number of constant args:           8
  Local memory type:                 Scratchpad
  Local memory size:                 65536
  Max pipe arguments:                16
  Max pipe active reservations:          16
  Max pipe packet size:              2992216472
  Max global variable size:          7287183768
  Max global variable preferred total size:  8573157376
  Max read/write image args:             64
  Max on device events:              1024
  Queue on device max size:          8388608
  Max on device queues:              1
  Queue on device preferred size:        262144
  SVM capabilities:              
    Coarse grain buffer:             Yes
    Fine grain buffer:               Yes
    Fine grain system:               No
    Atomics:                     No
  Preferred platform atomic alignment:       0
  Preferred global atomic alignment:         0
  Preferred local atomic alignment:      0
  Kernel Preferred work group size multiple:     64
  Error correction support:          0
  Unified memory for Host and Device:        0
  Profiling timer resolution:            1
  Device endianess:              Little
  Available:                     Yes
  Compiler available:                Yes
  Execution capabilities:                
    Execute OpenCL kernels:          Yes
    Execute native function:             No
  Queue on Host properties:              
    Out-of-Order:                No
    Profiling :                  Yes
  Queue on Device properties:                
    Out-of-Order:                Yes
    Profiling :                  Yes
  Platform ID:                   0x7fcf8224ed10
  Name:                      gfx900:xnack-
  Vendor:                    Advanced Micro Devices, Inc.
  Device OpenCL C version:           OpenCL C 2.0 
  Driver version:                3241.0 (HSA1.1,LC)
  Profile:                   FULL_PROFILE
  Version:                   OpenCL 2.0 
  Extensions:                    cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_depth_images cl_amd_copy_buffer_p2p cl_amd_assembly_program 

Note: it only shows OpenCL 2.0 when using sudo, not without it. So I tried running ethminer with and without sudo, but both times got the same result as before: Using Device : Radeon RX Vega (VEGA10, DRM 3.35.0, 5.4.0-70-generic, LLVM 11.0.0) OpenCL 1.1 Mesa 20.2.6 Memory : 8.00 GB (8589934592 B)

I also tried uninstalling mesa-opencl-icd; then sudo /opt/rocm/opencl/bin/clinfo only shows the OpenCL 2.0 device, but ethminer doesn't see any device (with or without sudo).
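One possible reading of the earlier "implicit declaration of function 'amd_bitalign'" error: amd_bitalign is provided by the cl_amd_media_ops OpenCL extension, which appears in the ROCm device's extension list above but not in Clover's. A small Python sketch of that comparison, using the two extension strings from the clinfo outputs in this thread (the helper name supports_extension is hypothetical):

```python
def supports_extension(extensions_line, name):
    """Exact-token check, so cl_amd_media_ops2 does not count as cl_amd_media_ops."""
    return name in extensions_line.split()


# Extension lists copied from the two clinfo device sections above.
CLOVER_EXT = (
    "cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics "
    "cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics "
    "cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics "
    "cl_khr_int64_extended_atomics cl_khr_fp64"
)
ROCM_EXT = (
    "cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics "
    "cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics "
    "cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes "
    "cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing "
    "cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 "
    "cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_depth_images "
    "cl_amd_copy_buffer_p2p cl_amd_assembly_program"
)

for label, ext in (("Clover", CLOVER_EXT), ("ROCm", ROCM_EXT)):
    print(label, "has cl_amd_media_ops:", supports_extension(ext, "cl_amd_media_ops"))
```

This only shows which platform advertises the extension; it does not by itself explain why the self-compiled build ran on Clover without producing work.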

DeeDeeRanged commented 3 years ago

The only thing I did to get my RX 580 working on Debian testing was download the AMD driver package and install it with amdgpu-install --opencl=legacy --headless --no-dkms, as my kernel is 5.10. I was not able to get ethminer to work with my RX 580, so I downloaded teamredminer (AMD only) and that works like a charm. You can also use gminer, which works great with both NVIDIA and AMD; I am using that one now, as it kicks off both my cards (NVIDIA and AMD).

hasezoey commented 3 years ago

@DeeDeeRanged I just tested teamredminer with the current rocm install, but with and without sudo it also doesn't detect any OpenCL device (mesa is reported as unsupported, and rocm is not detected).

DeeDeeRanged commented 3 years ago

As I am running kernel 5.10 I cannot install rocr, as it will try to install amdgpu-dkms, which only works with kernel 5.9 or lower; rocr was only introduced recently with the latest AMD drivers. I have the following installed:

$ sudo dpkg --get-selections *xserver-xorg-video-amdgpu
xserver-xorg-video-amdgpu install

$ sudo dpkg --get-selections firmware-amd-graphics
firmware-amd-graphics install

$ dpkg --get-selections mesa
glx-alternative-mesa install
libegl-mesa0:amd64 install
libegl-mesa0:i386 install
libgl1-mesa-dri:amd64 install
libgl1-mesa-dri:i386 install
libglapi-mesa:amd64 install
libglapi-mesa:i386 install
libglu1-mesa:amd64 install
libglu1-mesa:i386 install
libglx-mesa0:amd64 install
libglx-mesa0:i386 install
libosmesa6:amd64 install
libosmesa6:i386 install
mesa-opencl-icd:amd64 install
mesa-utils install
mesa-va-drivers:amd64 install
mesa-vdpau-drivers:amd64 install
mesa-vulkan-drivers:amd64 install
mesa-vulkan-drivers:i386 install

The following packages are installed with amdgpu-install --opencl=legacy --headless --no-dkms:
amdgpu-core
amdgpu-pin
amdgpu-pro-core
clinfo-amdgpu-pro
libdrm-amdgpu-amdgpu1
libdrm-amdgpu-common
libdrm2-amdgpu
ocl-icd-libopencl1-amdgpu-pro
opencl-orca-amdgpu-pro-icd

My line for teamred v0.81-linux is: teamredminer -a ethash -o stratum+ssl://eu1.ethermine.org:5555 -u --eth_dag_buf=S

hasezoey commented 3 years ago

For anyone interested: I completely uninstalled rocm again, installed only rocm-opencl4.1.0 and rocm-opencl-dev4.1.0, and then rebooted. At first there was no change, but after adding sudo and LD_LIBRARY_PATH=/opt/rocm/opencl/lib it finally worked without errors (at ~34Mh).

final command: sudo LD_LIBRARY_PATH=/opt/rocm/opencl/lib ./ethminer/ethminer -G -P "yournodeurl" (tested with ethminer 0.18.0 and ethminer 0.19.0+commit.ce52c740 [options: -DETHASHCL=ON -DETHASHCUDA=OFF -DETHASHCPU=OFF -DETHDBUS=OFF -DAPICORE=ON -DBINKERN=ON -DDEVBUILD=OFF -DUSE_SYS_OPENCL=OFF, with ~4Mh improvement when using 0.19+ over 0.18])
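For anyone who wants to script the launch, here is a rough Python sketch that builds the same command line and environment as the final command above. The function name build_launch is made up, and prepending to an existing LD_LIBRARY_PATH (rather than replacing it, as the shell command does) is an assumption on my part:

```python
import os
import shlex

# Library path taken from the command above.
ROCM_OPENCL_LIB = "/opt/rocm/opencl/lib"


def build_launch(pool_url, ethminer="./ethminer/ethminer", base_env=None):
    """Return (argv, env) for starting ethminer with the ROCm OpenCL library path."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_LIBRARY_PATH")
    # Assumption: keep any existing entries after the ROCm path.
    env["LD_LIBRARY_PATH"] = (
        ROCM_OPENCL_LIB if not existing else f"{ROCM_OPENCL_LIB}:{existing}"
    )
    # -G selects OpenCL and -P gives the pool/node URL, as in the command above.
    argv = [ethminer, "-G", "-P", pool_url]
    return argv, env


argv, env = build_launch("yournodeurl", base_env={})
print("LD_LIBRARY_PATH=" + env["LD_LIBRARY_PATH"], shlex.join(argv))
```

To actually start the miner, the pair could be passed to subprocess.run(argv, env=env); the sketch only prints the command so it is safe to run anywhere.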

Update: I could drop sudo by being in the render group; I forgot this group existed.
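A quick way to check the group membership mentioned in the update, as a minimal Python sketch assuming a Linux system (the grp module is Unix-only; the function name process_in_group is made up for illustration):

```python
import grp
import os


def process_in_group(group_name):
    """True if the current process runs with the given group (primary or supplementary)."""
    try:
        gid = grp.getgrnam(group_name).gr_gid
    except KeyError:
        # The group does not exist on this system.
        return False
    return gid == os.getegid() or gid in os.getgroups()


print("in render group:", process_in_group("render"))
```

Note that a newly added group membership normally only takes effect after logging in again, so this can print False right after the user was added to the group.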

DeeDeeRanged commented 3 years ago

Where did you find rocm-opencl4.1.0 and rocm-opencl-dev4.1.0 ? Cannot find it in the recent AMD driver package.

hasezoey commented 3 years ago

@DeeDeeRanged it may also work with the non-4.1.0 package, but I just wanted to have the latest version installed.

Official rocm repo: https://repo.radeon.com/rocm/apt/debian/ (for Debian and Debian-based systems (apt))

DeeDeeRanged commented 3 years ago

@hasezoey Thanks I found it, don't know if it will have any benefit for my RX 580. Did you get improvements in hashrate with your Vega 64? Also did you need to install the dev package?

hasezoey commented 3 years ago

> Did you get improvements in hashrate

rocm is the only OpenCL implementation I could get to work (aside from mesa), which improved the hashrate from 0h to ~34Mh (~30Mh on 0.18, ~34Mh on 0.19 (self-compiled)). I tested mesa, amdgpu-pro:legacy, amdgpu-pro:pal, amdgpu-pro:rocr, amdgpu-pro:rocm and rocm, and only rocm (non-amdgpu-pro) worked for me.
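To put the 0.18 vs 0.19 difference in relative terms, a tiny Python sketch using the approximate figures quoted above (both hashrates are rough readings, so the result is only indicative; the function name is made up):

```python
def relative_gain_percent(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100.0


# ~30Mh on 0.18 vs ~34Mh on 0.19, as reported above.
print(f"{relative_gain_percent(30.0, 34.0):.1f}%")
```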

> Also did you need to install the dev package?

What dev package? If you mean rocm-opencl-dev4.1.0, yes, as I said, I have that installed.

DeeDeeRanged commented 3 years ago

@hasezoey I will have a try with the rocm packages ;)