ROCm / ROCm-OpenCL-Runtime

ROCm OpenCL Runtime

Compiling cryptonight fails #29

Open tacchinotacchi opened 7 years ago

tacchinotacchi commented 7 years ago

Not sure what repo to put the issue in, but the compiler fails when trying to compile the cryptonight algorithm of sgminer ( https://github.com/genesismining/sgminer-gm ). Trace: `Target: amdgcn-amd-amdhsa-opencl Thread model: posix InstalledDir: /opt/rocm/opencl/bin/x86_64 [08:16:42] Error -11: Building Program (clBuildProgram)
[08:16:42] warning: argument unused during compilation: '-I .' warning: argument unused during compilation: '-I ./kernel' warning: argument unused during compilation: '-I .' warning: argument unused during compilation: '-I /usr/local/bin' error: unable to execute command: Segmentation fault error: clang frontend command failed due to signal (use -v to see invocation) note: diagnostic msg: PLEASE submit a bug report to http://llvm.org/bugs/ and include the crash backtrace, preprocessed source, and associated run script. note: diagnostic msg: Error generating preprocessed source(s) - no preprocessable inputs. /opt/rocm/opencl/bin/x86_64/clang[0x223cbca] /opt/rocm/opencl/bin/x86_64/clang[0x223af5e] /opt/rocm/opencl/bin/x86_64/clang[0x223b0b0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x110c0)[0x7facea6860c0] /opt/rocm/opencl/bin/x86_64/clang[0x1448e94] /opt/rocm/opencl/bin/x86_64/clang[0x1429b81] /opt/rocm/opencl/bin/x86_64/clang[0x17d2677] /opt/rocm/opencl/bin/x86_64/clang[0x218586a] /opt/rocm/opencl/bin/x86_64/clang[0x2185903] /opt/rocm/opencl/bin/x86_64/clang[0x21862ff] /opt/rocm/opencl/bin/x86_64/clang[0x58f356] /opt/rocm/opencl/bin/x86_64/clang[0x5917d3] /opt/rocm/opencl/bin/x86_64/clang[0x56da79] /opt/rocm/opencl/bin/x86_64/clang[0x90093e] /opt/rocm/opencl/bin/x86_64/clang[0x8d345d] /opt/rocm/opencl/bin/x86_64/clang[0x568e5d] /opt/rocm/opencl/bin/x86_64/clang[0x565dc8] /opt/rocm/opencl/bin/x86_64/clang[0x5189d9] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7facea2f62b1] /opt/rocm/opencl/bin/x86_64/clang[0x55fde1] Stack dump:

  1. Program arguments: /opt/rocm/opencl/bin/x86_64/clang -cc1 -triple amdgcn-amd-amdhsa-opencl -emit-obj -disable-free -disable-llvm-verifier -discard-value-names -main-file-name t_10859_43.bc -mrelocation-model static -mthread-model posix -mdisable-fp-elim -fmath-errno -masm-verbose -mconstructor-aliases -target-cpu fiji -dwarf-column-info -debugger-tuning=gdb -resource-dir /opt/rocm/opencl/bin/lib/clang/4.0 -O3 -fdebug-compilation-dir /home/alex/mine/sgminer-gm -ferror-limit 19 -fmessage-length 100 -cl-kernel-arg-info -fobjc-runtime=gcc -fdiagnostics-show-option -vectorize-loops -vectorize-slp -mllvm -amdgpu-internalize-symbols -mllvm -amdgpu-early-inline-all -o /tmp/t_10859_43-64ee72.o -x ir /tmp/AMD_10859_30/t_10859_43.bc
  2. Code generation
  3. Running pass 'Function Pass Manager' on module '/tmp/AMD_10859_30/t_10859_43.bc'.
  4. Running pass 'SI Fix SGPR copies' on function '@search'
Error: Creating the executable failed: Compiling LLVM IRs to executable `

Does this look like it's about ROCm specifically or LLVM in general?

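
For anyone reading the trace above: the `Error -11` that sgminer prints after `clBuildProgram` corresponds to `CL_BUILD_PROGRAM_FAILURE` in the standard OpenCL header (`CL/cl.h`), so it only says that the build failed, not why. The sketch below is a hypothetical helper (the name `explain` and the small lookup table are assumptions for illustration, not part of sgminer or ROCm) that maps such a log line to the symbolic name:

```python
import re

# Small subset of the error codes defined in the Khronos header CL/cl.h.
CL_ERROR_NAMES = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -3: "CL_COMPILER_NOT_AVAILABLE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
    -11: "CL_BUILD_PROGRAM_FAILURE",
    -30: "CL_INVALID_VALUE",
}

# Matches the "Error <number>:" form used in the sgminer log lines above.
ERROR_LINE = re.compile(r"Error (-?\d+):")


def explain(log_line):
    """Return the symbolic name for the 'Error N:' code in a log line, or None."""
    match = ERROR_LINE.search(log_line)
    if match is None:
        return None
    code = int(match.group(1))
    return CL_ERROR_NAMES.get(code, "unknown code %d" % code)


print(explain("[08:16:42] Error -11: Building Program (clBuildProgram)"))
# prints CL_BUILD_PROGRAM_FAILURE
```

The actual cause here is in the compiler output itself: the stack dump shows clang crashing during code generation in the 'SI Fix SGPR copies' pass.
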
acmeman925 commented 7 years ago

When using the driver from https://github.com/RadeonOpenCompute/ROCm#to-install-rocm-with-developer-preview-of-opencl, we were not able to reproduce the issue. Could you please try it and let us know if there are any issues?

calvintam236 commented 7 years ago

I can confirm this issue. I tried to run sgminer-gm with Vega 56 on Ubuntu 16.04.3 LTS.

Here is the log.

$ ./sgminer_ubuntu64 -k cryptonight -o xmr.poolmining.org:3033 -u USERNAME --gpu-platform 0 --intensity 31 --no-adl --log 60 --text-only --debug
[02:38:15] Started sgminer 5.5.5-gm-a
[02:38:15] * using Jansson 2.7
[02:38:16] CL Platform vendor: Advanced Micro Devices, Inc.
[02:38:16] CL Platform name: AMD Accelerated Parallel Processing
[02:38:16] CL Platform version: OpenCL 2.0 AMD-APP (2450.0)
[02:38:16] Platform devices: 1
[02:38:16]      0       gfx900
[02:38:16] GPU0: detected PCIe topology 0000:25:00.0
[02:38:16] Default Devices = all
[02:38:16] set_devices(all)
[02:38:16] Loading settings from default_profile for pool 0
[02:38:16] Pool 0 Algorithm set to "cryptonight"
[02:38:16] Pool 0 devices set to "all"
[02:38:16] Pool 0 lookup gap set to "(null)"
[02:38:16] Pool 0 Intensity set to "31"
[02:38:16] Pool 0 Thread Concurrency set to "(null)"
[02:38:16] Pool 0 GPU Clock set to "(null)"
[02:38:16] Pool 0 GPU Memory clock set to "(null)"
[02:38:16] Pool 0 GPU Threads set to "(null)"
[02:38:16] Pool 0 GPU Fan set to "(null)"
[02:38:16] Pool 0 GPU Powertune set to "(null)"
[02:38:16] Pool 0 GPU Vddc set to "(null)"
[02:38:16] Pool 0 Shaders set to "(null)"
[02:38:16] Pool 0 Worksize set to "(null)"
[02:38:16] WARNING: GPU_MAX_ALLOC_PERCENT is not specified!
[02:38:16] WARNING: GPU_USE_SYNC_OBJECTS is not specified!
[02:38:16] Trying to set current pool...
[02:38:16] Probing for an alive pool
[02:38:16] Testing xmr.poolmining.org
[02:38:16] Probing for GBT support
[02:38:17] HTTP request failed: Empty reply from server
[02:38:17] Failed to connect in json_rpc_call
[02:38:17] No GBT coinbase + append support found, using getwork protocol
[02:38:17] HTTP request failed: Empty reply from server
[02:38:17] Failed to connect in json_rpc_call
[02:38:18] Succeeded delayed connect
[02:38:18] Closing xmr.poolmining.org socket
[02:38:18] Succeeded delayed connect
[02:38:19] XMR AuthID: 0HL8I8BAL3FCJ
[02:38:19] parse_notify_cn()
[02:38:19] Stratum authorisation success for xmr.poolmining.org
[02:38:19] [THR0] gen_stratum_work_cn() - algorithm = cryptonight
[02:38:19] gen_stratum_work_cn() done.
[02:38:19] [THR0] Pushing work from xmr.poolmining.org to hash queue
[02:38:19] New block: 40f459e4ff6fdad68fbf12dbf9068c0b9dbb289e5f4d12c7e26eee8385cf05a2... diff 8.83T
[02:38:19] Trying to set current pool...
[02:38:19] xmr.poolmining.org alive
[02:38:19] Trying to set current pool...
[02:38:19] Startup GPU initialization... Using settings from pool xmr.poolmining.org.
[02:38:19] Startup Pool No = 0
[02:38:19] compare_pool_settings()
[02:38:19] set_devices(all)
[02:38:19] Switching to intensity: pool = 31, default = 31
[02:38:19] intensity -> 31
[02:38:19] Set GPU 0 to cryptonight
[02:38:19] Allocate new threads...
[02:38:19] Assign threads for device 0
[02:38:19] Thread 0 set pool = 0 (xmr.poolmining.org)
[02:38:19] Init GPU thread 0 GPU 0 virtual GPU 0
[02:38:19] CL Platform vendor: Advanced Micro Devices, Inc.
[02:38:19] CL Platform name: AMD Accelerated Parallel Processing
[02:38:19] CL Platform version: OpenCL 2.0 AMD-APP (2450.0)
[02:38:19] Platform devices: 1
[02:38:19]      0       gfx900
[02:38:19] GPU0: detected PCIe topology 0000:25:00.0
[02:38:19] List of devices:
[02:38:19]      0       gfx900
[02:38:19] Selected 0: gfx900
[02:38:19] Preferred vector width reported 1
[02:38:19] Max work group size reported 256
[02:38:19] Maximum work size for this GPU (0) is 256.
[02:38:19] Your GPU (#0) has 56 compute units, and all AMD cards in the 7 series or newer (GCN cards)     have 64 shaders per compute unit - this means it has 3584 shaders.
[02:38:19] Max mem alloc size is 6429868032
[02:38:19] Using source file cryptonight.cl
[02:38:19] GPU 0: selecting lookup gap of 2
[02:38:19] GPU 0: selecting thread concurrency of 48960
[02:38:19] Setting worksize to 256
[02:38:19] Using binary file cryptonightgfx900gw256l8.bin
[02:38:19] No binary found, generating from source
[02:38:19] Building binary cryptonightgfx900gw256l8.bin
[02:38:19] Trying to open /usr/local/bin/cryptonight.cl...
[02:38:19] Trying to open ./cryptonight.cl...
[02:38:19] Trying to open ./kernel/cryptonight.cl...
[02:38:19] Trying to open ./kernel/cryptonight.cl...
[02:38:19] Unable to open cryptonight.cl for reading!
[02:38:19] Failed to init GPU thread 0, disabling device 0
[02:38:19] Restarting the GPU from the menu will not fix this.
[02:38:19] Re-check your configuration and try restarting.
[02:38:19] thread_prepare failed for thread 0
[02:38:19] Starting device 0 mining thread 0...
Segmentation fault (core dumped)
$ dpkg -l | grep rocm
ii  rocm-dev                    1.6.148                               amd64    Radeon Open Compute (ROCm) Runtime software stack
ii  rocm-device-libs            0.0.1                                 amd64    Radeon Open Compute - device libraries
ii  rocm-opencl                 1.2.0-1430311                         amd64    OpenCL/ROCm
ii  rocm-profiler               5.1.6400                              amd64    The ROCm-Profiler is a command-line profiler for profiling applications running on the Radeon Open Compute platforms.
ii  rocm-smi                    1.0.0-25-gbdb99b4                     amd64    System Management Interface for ROCm
ii  rocm-utils                  1.0.0                                 amd64    AMD ROCm utilities - Utilities for ROCm platforms

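
A note on the log above: it appears to stop before the compiler is ever invoked. sgminer reports `Unable to open cryptonight.cl for reading!` after trying `/usr/local/bin`, `.` and `./kernel`, so the segmentation fault in this run may not be the same failure as the clang crash in the original report. A minimal sketch for checking the same locations by hand follows; the function name `find_kernel` is hypothetical, and the directory list is copied from the "Trying to open" lines rather than from the sgminer source:

```python
from pathlib import Path

# Directories taken from the "Trying to open ..." lines in the log above.
SEARCH_DIRS = ["/usr/local/bin", ".", "./kernel"]


def find_kernel(name, search_dirs=SEARCH_DIRS):
    """Return the first existing file path for `name`, or None if none exists."""
    for directory in search_dirs:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    found = find_kernel("cryptonight.cl")
    print(found if found is not None else "cryptonight.cl not found in any search directory")
```

If this finds nothing, running the miner from the directory that contains its `kernel/` folder is worth trying before looking at the driver.
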
gstoner commented 7 years ago

Can you update the driver to ROCm 1.6.4? It was posted last night at repo.radeon.com. Just follow the install instructions at https://rocm.github.io/ROCmInstall.html

tacchinotacchi commented 7 years ago

Cryptonight now compiles; however, equihash doesn't:

`[09:42:57] Building binary equihashgfx803gw256l8.bin
/tmp/AMD_13450_21/t_13450_23.cl:851:22: error: variables in the local address space can only be declared in the outermost scope of a kernel function __local uint dup_counter; ^ /tmp/AMD_13450_21/t_13450_23.cl:882:22: error: variables in the local address space can only be declared in the outermost scope of a kernel function __local uint sol_i; ^ 2 errors generated. [09:42:57] Error -11: Building Program (clBuildProgram)
[09:42:57] Error: Failed to compile opencl source (from CL to LLVM IR).

[09:42:57] Failed to init GPU thread 0, disabling device 0
[09:42:57] Restarting the GPU from the menu will not fix this.
[09:42:57] Re-check your configuration and try restarting.
[09:42:57] thread_prepare failed for thread 0
[09:42:57] Building binary equihashgfx803gw256l8.bin
/tmp/AMD_13450_25/t_13450_27.cl:851:22: error: variables in the local address space can only be declared in the outermost scope of a kernel function __local uint dup_counter; ^ /tmp/AMD_13450_25/t_13450_27.cl:882:22: error: variables in the local address space can only be declared in the outermost scope of a kernel function __local uint sol_i; ^ 2 errors generated. [09:42:57] Error -11: Building Program (clBuildProgram)
[09:42:57] Error: Failed to compile opencl source (from CL to LLVM IR).

[09:42:57] Failed to init GPU thread 1, disabling device 0
[09:42:57] thread_prepare failed for thread 1
Segmentation fault `

Not sure if it's a problem with the driver or with the program, but it compiles fine on amdgpu-pro. Also, I'm not sure if this driver is supposed to be faster on ethash (the blockchain bug? I don't have time to research it in detail), but it's actually slower than amdgpu-pro.
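
On the equihash error: OpenCL C requires variables in the `__local` address space to be declared at kernel function scope, and clang enforces that, which is what the two diagnostics for `dup_counter` and `sol_i` say. The usual fix on the kernel side is to hoist the declaration to the top of the kernel body. The sketch below shows the pattern with two made-up kernels (they are illustrations, not the sgminer equihash source) and a rough line-based check for nested `__local` declarations. The check is a brace-counting heuristic, not a parser: it ignores comments, strings and macros, and does not tell kernels apart from helper functions.

```python
import re

# Illustrative kernel: the __local declaration sits inside a nested block,
# which is the pattern clang rejects.
NESTED = """
__kernel void count_dups(__global uint *out)
{
    if (get_local_id(0) == 0) {
        __local uint dup_counter;
        dup_counter = 0;
    }
}
"""

# Same idea with the declaration hoisted to the kernel's outermost scope.
HOISTED = """
__kernel void count_dups(__global uint *out)
{
    __local uint dup_counter;
    if (get_local_id(0) == 0) {
        dup_counter = 0;
    }
    barrier(CLK_LOCAL_MEM_FENCE);
    out[get_global_id(0)] = dup_counter;
}
"""

# A line that starts with a local-address-space declaration.
LOCAL_DECL = re.compile(r"^\s*(?:__local|local)\s+\w+")


def nested_local_decls(source):
    """Return 1-based line numbers of __local declarations nested below depth 1."""
    hits = []
    depth = 0  # 0 = file scope, 1 = function body, 2+ = nested block
    for lineno, line in enumerate(source.splitlines(), start=1):
        if depth > 1 and LOCAL_DECL.match(line):
            hits.append(lineno)
        depth += line.count("{") - line.count("}")
    return hits


print(nested_local_decls(NESTED))   # [5]
print(nested_local_decls(HOISTED))  # []
```

Running something like this over the miner's equihash kernel should point at the same places as the two diagnostics above, the declarations of `dup_counter` and `sol_i`.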

gstoner commented 7 years ago

The only difference between the AMDGPUpro driver and the ROCm driver is that AMDGPUpro currently uses a proprietary compiler, while ROCm uses a fully open source compiler. Right now there are apps that run faster on the new open source compiler, and there are others that still need to be tuned, since we have never run them.

G