fireice-uk / xmr-stak

Free Monero RandomX Miner and unified CryptoNight miner
GNU General Public License v3.0
4.05k stars 1.79k forks

AMD Invalid Result GPU #2262

Open yoburtu opened 5 years ago

yoburtu commented 5 years ago

pools.txt xmr-stak.log

Please provide as much information as possible to reproduce the issue.

Basic information

Compile issues

Ubuntu 18.04

add all commands you used and the full compile output here

$ cmake -DOpenCL_LIBRARY=/opt/amdgpu-pro/lib/x86_64-linux-gnu/libOpenCL.so -DCUDA_ENABLE=OFF ..
$ make install

run cmake -LA . in the build folder and add the output here

CMake Error: The current CMakeCache.txt directory /home/users/Downloads/xmr-stak-2.8.3/build/CMakeCache.txt is different than the directory /home/user/Downloads/xmr-stak/build where CMakeCache.txt was created. This may result in binaries being created in the wrong place. If you are not sure, reedit the CMakeCache.txt
-- Cache values
CMAKE_AR:FILEPATH=/usr/bin/ar
CMAKE_ASM_COMPILER:FILEPATH=/usr/bin/cc
CMAKE_ASM_COMPILER_AR:FILEPATH=/usr/bin/gcc-ar
CMAKE_ASM_COMPILER_RANLIB:FILEPATH=/usr/bin/gcc-ranlib
CMAKE_ASM_FLAGS:STRING=
CMAKE_ASM_FLAGS_DEBUG:STRING=-g
CMAKE_ASM_FLAGS_MINSIZEREL:STRING=-Os -DNDEBUG
CMAKE_ASM_FLAGS_RELEASE:STRING=-O3 -DNDEBUG
CMAKE_ASM_FLAGS_RELWITHDEBINFO:STRING=-O2 -g -DNDEBUG
CMAKE_BUILD_TYPE:STRING=Release
CMAKE_COLOR_MAKEFILE:BOOL=ON
CMAKE_CXX_COMPILER:FILEPATH=/usr/bin/c++
CMAKE_CXX_COMPILER_AR:FILEPATH=/usr/bin/gcc-ar-7
CMAKE_CXX_COMPILER_RANLIB:FILEPATH=/usr/bin/gcc-ranlib-7
CMAKE_CXX_FLAGS:STRING=
CMAKE_CXX_FLAGS_DEBUG:STRING=-g
CMAKE_CXX_FLAGS_MINSIZEREL:STRING=-Os -DNDEBUG
CMAKE_CXX_FLAGS_RELEASE:STRING=-O3 -DNDEBUG
CMAKE_CXX_FLAGS_RELWITHDEBINFO:STRING=-O2 -g -DNDEBUG
CMAKE_C_COMPILER:FILEPATH=/usr/bin/cc
CMAKE_C_COMPILER_AR:FILEPATH=/usr/bin/gcc-ar-7
CMAKE_C_COMPILER_RANLIB:FILEPATH=/usr/bin/gcc-ranlib-7
CMAKE_C_FLAGS:STRING=
CMAKE_C_FLAGS_DEBUG:STRING=-g
CMAKE_C_FLAGS_MINSIZEREL:STRING=-Os -DNDEBUG
CMAKE_C_FLAGS_RELEASE:STRING=-O3 -DNDEBUG
CMAKE_C_FLAGS_RELWITHDEBINFO:STRING=-O2 -g -DNDEBUG
CMAKE_EXE_LINKER_FLAGS:STRING=
CMAKE_EXE_LINKER_FLAGS_DEBUG:STRING=
CMAKE_EXE_LINKER_FLAGS_MINSIZEREL:STRING=
CMAKE_EXE_LINKER_FLAGS_RELEASE:STRING=
CMAKE_EXE_LINKER_FLAGS_RELWITHDEBINFO:STRING=
CMAKE_EXPORT_COMPILE_COMMANDS:BOOL=OFF
CMAKE_INSTALL_PREFIX:PATH=/home/usuario/Descargas/xmr-stak/build
CMAKE_LINKER:FILEPATH=/usr/bin/ld
CMAKE_LINK_STATIC:BOOL=OFF
CMAKE_MAKE_PROGRAM:FILEPATH=/usr/bin/make
CMAKE_MODULE_LINKER_FLAGS:STRING=
CMAKE_MODULE_LINKER_FLAGS_DEBUG:STRING=
CMAKE_MODULE_LINKER_FLAGS_MINSIZEREL:STRING=
CMAKE_MODULE_LINKER_FLAGS_RELEASE:STRING=
CMAKE_MODULE_LINKER_FLAGS_RELWITHDEBINFO:STRING=
CMAKE_NM:FILEPATH=/usr/bin/nm
CMAKE_OBJCOPY:FILEPATH=/usr/bin/objcopy
CMAKE_OBJDUMP:FILEPATH=/usr/bin/objdump
CMAKE_RANLIB:FILEPATH=/usr/bin/ranlib
CMAKE_SHARED_LINKER_FLAGS:STRING=
CMAKE_SHARED_LINKER_FLAGS_DEBUG:STRING=
CMAKE_SHARED_LINKER_FLAGS_MINSIZEREL:STRING=
CMAKE_SHARED_LINKER_FLAGS_RELEASE:STRING=
CMAKE_SHARED_LINKER_FLAGS_RELWITHDEBINFO:STRING=
CMAKE_SKIP_INSTALL_RPATH:BOOL=NO
CMAKE_SKIP_RPATH:BOOL=NO
CMAKE_STATIC_LINKER_FLAGS:STRING=
CMAKE_STATIC_LINKER_FLAGS_DEBUG:STRING=
CMAKE_STATIC_LINKER_FLAGS_MINSIZEREL:STRING=
CMAKE_STATIC_LINKER_FLAGS_RELEASE:STRING=
CMAKE_STATIC_LINKER_FLAGS_RELWITHDEBINFO:STRING=
CMAKE_STRIP:FILEPATH=/usr/bin/strip
CMAKE_VERBOSE_MAKEFILE:BOOL=FALSE
CPU_ENABLE:BOOL=ON
CUDA_ENABLE:BOOL=OFF
EXECUTABLE_OUTPUT_PATH:STRING=bin
HWLOC:FILEPATH=/usr/lib/x86_64-linux-gnu/libhwloc.so
HWLOC_ENABLE:BOOL=ON
HWLOC_INCLUDE_DIR:PATH=/usr/include
LIBRARY_OUTPUT_PATH:STRING=bin
MHTD:FILEPATH=/usr/lib/x86_64-linux-gnu/libmicrohttpd.so
MICROHTTPD_ENABLE:BOOL=ON
MTHD_INCLUDE_DIR:PATH=/usr/include
OPENSSL_CRYPTO_LIBRARY:FILEPATH=/usr/lib/x86_64-linux-gnu/libcrypto.so
OPENSSL_INCLUDE_DIR:PATH=/usr/include
OPENSSL_SSL_LIBRARY:FILEPATH=/usr/lib/x86_64-linux-gnu/libssl.so
OpenCL_ENABLE:BOOL=ON
OpenCL_INCLUDE_DIR:PATH=/usr/include
OpenCL_LIBRARY:FILEPATH=/opt/amdgpu-pro/lib/x86_64-linux-gnu/libOpenCL.so
OpenSSL_ENABLE:BOOL=ON
PKG_CONFIG_EXECUTABLE:FILEPATH=PKG_CONFIG_EXECUTABLE-NOTFOUND
XMR-STAK_COMPILE:STRING=native

Issue with the execution

Version: xmr-stak/2.8.3/e785ca1/master/lin/amd-cpu/0

AMD OpenCl issue

AMD Invalid Result GPU

run clinfo and add the output here

**Number of platforms 1 Platform Name AMD Accelerated Parallel Processing Platform Vendor Advanced Micro Devices, Inc. Platform Version OpenCL 2.1 AMD-APP (2580.4) Platform Profile FULL_PROFILE Platform Extensions cl_khr_icd cl_amd_event_callback cl_amd_offline_devices Platform Host timer resolution 1ns Platform Extensions function suffix AMD

Platform Name AMD Accelerated Parallel Processing Number of devices 6 Device Name Ellesmere Device Vendor Advanced Micro Devices, Inc. Device Vendor ID 0x1002 Device Version OpenCL 1.2 AMD-APP (2580.4) Driver Version 2580.4 Device OpenCL C Version OpenCL C 1.2 Device Type GPU Device Board Name (AMD) Radeon (TM) RX 470 Graphics Device Topology (AMD) PCI-E, 01:00.0 Device Profile FULL_PROFILE Device Available Yes Compiler Available Yes Linker Available Yes Max compute units 32 SIMD per compute unit (AMD) 4 SIMD width (AMD) 16 SIMD instruction width (AMD) 1 Max clock frequency 1250MHz Graphics IP (AMD) 8.0 Device Partition (core) Max number of sub-devices 32 Supported partition types none specified Max work item dimensions 3 Max work item sizes 1024x1024x1024 Max work group size 256 Preferred work group size (AMD) 256 Max work group size (AMD) 1024 Preferred work group size multiple 64 Wavefront width (AMD) 64 Preferred / native vector sizes
char 4 / 4
short 2 / 2
int 1 / 1
long 1 / 1
half 1 / 1 (cl_khr_fp16) float 1 / 1
double 1 / 1 (cl_khr_fp64) Half-precision Floating-point support (cl_khr_fp16) Denormals No Infinity and NANs No Round to nearest No Round to zero No Round to infinity No IEEE754-2008 fused multiply-add No Support is emulated in software No Single-precision Floating-point support (core) Denormals No Infinity and NANs Yes Round to nearest Yes Round to zero Yes Round to infinity Yes IEEE754-2008 fused multiply-add Yes Support is emulated in software No Correctly-rounded divide and sqrt operations Yes Double-precision Floating-point support (cl_khr_fp64) Denormals Yes Infinity and NANs Yes Round to nearest Yes Round to zero Yes Round to infinity Yes IEEE754-2008 fused multiply-add Yes Support is emulated in software No Address bits 64, Little-Endian Global memory size 473677824 (451.7MiB) Global free memory (AMD) 442992 (432.6MiB) Global memory channels (AMD) 8 Global memory banks per channel (AMD) 16 Global memory bank width (AMD) 256 bytes Error Correction support No Max memory allocation 222905958 (212.6MiB) Unified memory for Host and Device No Minimum alignment for any data type 128 bytes Alignment of base address 2048 bits (256 bytes) Global Memory cache type Read/Write Global Memory cache size 16384 (16KiB) Global Memory cache line size 64 bytes Image support Yes Max number of samplers per kernel 16 Max size for 1D images from buffer 134217728 pixels Max 1D or 2D image array size 2048 images Base address alignment for 2D image buffers 256 bytes Pitch alignment for 2D image buffers 256 pixels Max 2D image size 16384x16384 pixels Max 3D image size 2048x2048x2048 pixels Max number of read image args 128 Max number of write image args 8 Local memory type Local Local memory size 32768 (32KiB) Local memory syze per CU (AMD) 65536 (64KiB) Local memory banks (AMD) 32 Max number of constant args 8 Max constant buffer size 222905958 (212.6MiB) Preferred constant buffer size (AMD) 16384 (16KiB) Max size of kernel argument 1024 Queue properties
Out-of-order execution No Profiling Yes Prefer user sync for interop Yes Profiling timer resolution 1ns Profiling timer offset since Epoch (AMD) 1551205368424207059ns (Tue Feb 26 19:22:48 2019) Execution capabilities
Run OpenCL kernels Yes Run native kernels No Thread trace supported (AMD) Yes Number of async queues (AMD) 2 Max real-time compute queues (AMD) 0 Max real-time compute units (AMD) 0 SPIR versions 1.2 printf() buffer size 4194304 (4MiB) Built-in kernels
Device Extensions cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_image2d_from_buffer cl_khr_spir cl_khr_gl_event

Device Name Ellesmere Device Vendor Advanced Micro Devices, Inc. Device Vendor ID 0x1002 Device Version OpenCL 1.2 AMD-APP (2580.4) Driver Version 2580.4 Device OpenCL C Version OpenCL C 1.2 Device Type GPU Device Board Name (AMD) Radeon (TM) RX 470 Graphics Device Topology (AMD) PCI-E, 02:00.0 Device Profile FULL_PROFILE Device Available Yes Compiler Available Yes Linker Available Yes Max compute units 32 SIMD per compute unit (AMD) 4 SIMD width (AMD) 16 SIMD instruction width (AMD) 1 Max clock frequency 1200MHz Graphics IP (AMD) 8.0 Device Partition (core) Max number of sub-devices 32 Supported partition types none specified Max work item dimensions 3 Max work item sizes 1024x1024x1024 Max work group size 256 Preferred work group size (AMD) 256 Max work group size (AMD) 1024 Preferred work group size multiple 64 Wavefront width (AMD) 64 Preferred / native vector sizes
char 4 / 4
short 2 / 2
int 1 / 1
long 1 / 1
half 1 / 1 (cl_khr_fp16) float 1 / 1
double 1 / 1 (cl_khr_fp64) Half-precision Floating-point support (cl_khr_fp16) Denormals No Infinity and NANs No Round to nearest No Round to zero No Round to infinity No IEEE754-2008 fused multiply-add No Support is emulated in software No Single-precision Floating-point support (core) Denormals No Infinity and NANs Yes Round to nearest Yes Round to zero Yes Round to infinity Yes IEEE754-2008 fused multiply-add Yes Support is emulated in software No Correctly-rounded divide and sqrt operations Yes Double-precision Floating-point support (cl_khr_fp64) Denormals Yes Infinity and NANs Yes Round to nearest Yes Round to zero Yes Round to infinity Yes IEEE754-2008 fused multiply-add Yes Support is emulated in software No Address bits 64, Little-Endian Global memory size 473677824 (451.7MiB) Global free memory (AMD) 442992 (432.6MiB) Global memory channels (AMD) 8 Global memory banks per channel (AMD) 16 Global memory bank width (AMD) 256 bytes Error Correction support No Max memory allocation 222905958 (212.6MiB) Unified memory for Host and Device No Minimum alignment for any data type 128 bytes Alignment of base address 2048 bits (256 bytes) Global Memory cache type Read/Write Global Memory cache size 16384 (16KiB) Global Memory cache line size 64 bytes Image support Yes Max number of samplers per kernel 16 Max size for 1D images from buffer 134217728 pixels Max 1D or 2D image array size 2048 images Base address alignment for 2D image buffers 256 bytes Pitch alignment for 2D image buffers 256 pixels Max 2D image size 16384x16384 pixels Max 3D image size 2048x2048x2048 pixels Max number of read image args 128 Max number of write image args 8 Local memory type Local Local memory size 32768 (32KiB) Local memory syze per CU (AMD) 65536 (64KiB) Local memory banks (AMD) 32 Max number of constant args 8 Max constant buffer size 222905958 (212.6MiB) Preferred constant buffer size (AMD) 16384 (16KiB) Max size of kernel argument 1024 Queue properties
Out-of-order execution No Profiling Yes Prefer user sync for interop Yes Profiling timer resolution 1ns Profiling timer offset since Epoch (AMD) 1551205368424207059ns (Tue Feb 26 19:22:48 2019) Execution capabilities
Run OpenCL kernels Yes Run native kernels No Thread trace supported (AMD) Yes Number of async queues (AMD) 2 Max real-time compute queues (AMD) 0 Max real-time compute units (AMD) 3487219488 SPIR versions 1.2 printf() buffer size 4194304 (4MiB) Built-in kernels
Device Extensions cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_image2d_from_buffer cl_khr_spir cl_khr_gl_event

Device Name Ellesmere Device Vendor Advanced Micro Devices, Inc. Device Vendor ID 0x1002 Device Version OpenCL 1.2 AMD-APP (2580.4) Driver Version 2580.4 Device OpenCL C Version OpenCL C 1.2 Device Type GPU Device Board Name (AMD) Radeon (TM) RX 470 Graphics Device Topology (AMD) PCI-E, 03:00.0 Device Profile FULL_PROFILE Device Available Yes Compiler Available Yes Linker Available Yes Max compute units 32 SIMD per compute unit (AMD) 4 SIMD width (AMD) 16 SIMD instruction width (AMD) 1 Max clock frequency 1150MHz Graphics IP (AMD) 8.0 Device Partition (core) Max number of sub-devices 32 Supported partition types none specified Max work item dimensions 3 Max work item sizes 1024x1024x1024 Max work group size 256 Preferred work group size (AMD) 256 Max work group size (AMD) 1024 Preferred work group size multiple 64 Wavefront width (AMD) 64 Preferred / native vector sizes
char 4 / 4
short 2 / 2
int 1 / 1
long 1 / 1
half 1 / 1 (cl_khr_fp16) float 1 / 1
double 1 / 1 (cl_khr_fp64) Half-precision Floating-point support (cl_khr_fp16) Denormals No Infinity and NANs No Round to nearest No Round to zero No Round to infinity No IEEE754-2008 fused multiply-add No Support is emulated in software No Single-precision Floating-point support (core) Denormals No Infinity and NANs Yes Round to nearest Yes Round to zero Yes Round to infinity Yes IEEE754-2008 fused multiply-add Yes Support is emulated in software No Correctly-rounded divide and sqrt operations Yes Double-precision Floating-point support (cl_khr_fp64) Denormals Yes Infinity and NANs Yes Round to nearest Yes Round to zero Yes Round to infinity Yes IEEE754-2008 fused multiply-add Yes Support is emulated in software No Address bits 64, Little-Endian Global memory size 473677824 (451.7MiB) Global free memory (AMD) 442992 (432.6MiB) Global memory channels (AMD) 8 Global memory banks per channel (AMD) 16 Global memory bank width (AMD) 256 bytes Error Correction support No Max memory allocation 222905958 (212.6MiB) Unified memory for Host and Device No Minimum alignment for any data type 128 bytes Alignment of base address 2048 bits (256 bytes) Global Memory cache type Read/Write Global Memory cache size 16384 (16KiB) Global Memory cache line size 64 bytes Image support Yes Max number of samplers per kernel 16 Max size for 1D images from buffer 134217728 pixels Max 1D or 2D image array size 2048 images Base address alignment for 2D image buffers 256 bytes Pitch alignment for 2D image buffers 256 pixels Max 2D image size 16384x16384 pixels Max 3D image size 2048x2048x2048 pixels Max number of read image args 128 Max number of write image args 8 Local memory type Local Local memory size 32768 (32KiB) Local memory syze per CU (AMD) 65536 (64KiB) Local memory banks (AMD) 32 Max number of constant args 8 Max constant buffer size 222905958 (212.6MiB) Preferred constant buffer size (AMD) 16384 (16KiB) Max size of kernel argument 1024 Queue properties
Out-of-order execution No Profiling Yes Prefer user sync for interop Yes Profiling timer resolution 1ns Profiling timer offset since Epoch (AMD) 1551205368424207059ns (Tue Feb 26 19:22:48 2019) Execution capabilities
Run OpenCL kernels Yes Run native kernels No Thread trace supported (AMD) Yes Number of async queues (AMD) 2 Max real-time compute queues (AMD) 0 Max real-time compute units (AMD) 0 SPIR versions 1.2 printf() buffer size 4194304 (4MiB) Built-in kernels
Device Extensions cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_image2d_from_buffer cl_khr_spir cl_khr_gl_event

Device Name Ellesmere Device Vendor Advanced Micro Devices, Inc. Device Vendor ID 0x1002 Device Version OpenCL 1.2 AMD-APP (2580.4) Driver Version 2580.4 Device OpenCL C Version OpenCL C 1.2 Device Type GPU Device Board Name (AMD) Radeon (TM) RX 470 Graphics Device Topology (AMD) PCI-E, 05:00.0 Device Profile FULL_PROFILE Device Available Yes Compiler Available Yes Linker Available Yes Max compute units 32 SIMD per compute unit (AMD) 4 SIMD width (AMD) 16 SIMD instruction width (AMD) 1 Max clock frequency 1250MHz Graphics IP (AMD) 8.0 Device Partition (core) Max number of sub-devices 32 Supported partition types none specified Max work item dimensions 3 Max work item sizes 1024x1024x1024 Max work group size 256 Preferred work group size (AMD) 256 Max work group size (AMD) 1024 Preferred work group size multiple 64 Wavefront width (AMD) 64 Preferred / native vector sizes
char 4 / 4
short 2 / 2
int 1 / 1
long 1 / 1
half 1 / 1 (cl_khr_fp16) float 1 / 1
double 1 / 1 (cl_khr_fp64) Half-precision Floating-point support (cl_khr_fp16) Denormals No Infinity and NANs No Round to nearest No Round to zero No Round to infinity No IEEE754-2008 fused multiply-add No Support is emulated in software No Single-precision Floating-point support (core) Denormals No Infinity and NANs Yes Round to nearest Yes Round to zero Yes Round to infinity Yes IEEE754-2008 fused multiply-add Yes Support is emulated in software No Correctly-rounded divide and sqrt operations Yes Double-precision Floating-point support (cl_khr_fp64) Denormals Yes Infinity and NANs Yes Round to nearest Yes Round to zero Yes Round to infinity Yes IEEE754-2008 fused multiply-add Yes Support is emulated in software No Address bits 64, Little-Endian Global memory size 473677824 (451.7MiB) Global free memory (AMD) 442992 (432.6MiB) Global memory channels (AMD) 8 Global memory banks per channel (AMD) 16 Global memory bank width (AMD) 256 bytes Error Correction support No Max memory allocation 222905958 (212.6MiB) Unified memory for Host and Device No Minimum alignment for any data type 128 bytes Alignment of base address 2048 bits (256 bytes) Global Memory cache type Read/Write Global Memory cache size 16384 (16KiB) Global Memory cache line size 64 bytes Image support Yes Max number of samplers per kernel 16 Max size for 1D images from buffer 134217728 pixels Max 1D or 2D image array size 2048 images Base address alignment for 2D image buffers 256 bytes Pitch alignment for 2D image buffers 256 pixels Max 2D image size 16384x16384 pixels Max 3D image size 2048x2048x2048 pixels Max number of read image args 128 Max number of write image args 8 Local memory type Local Local memory size 32768 (32KiB) Local memory syze per CU (AMD) 65536 (64KiB) Local memory banks (AMD) 32 Max number of constant args 8 Max constant buffer size 222905958 (212.6MiB) Preferred constant buffer size (AMD) 16384 (16KiB) Max size of kernel argument 1024 Queue properties
Out-of-order execution No Profiling Yes Prefer user sync for interop Yes Profiling timer resolution 1ns Profiling timer offset since Epoch (AMD) 1551205368424207059ns (Tue Feb 26 19:22:48 2019) Execution capabilities
Run OpenCL kernels Yes Run native kernels No Thread trace supported (AMD) Yes Number of async queues (AMD) 2 Max real-time compute queues (AMD) 0 Max real-time compute units (AMD) 0 SPIR versions 1.2 printf() buffer size 4194304 (4MiB) Built-in kernels
Device Extensions cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_image2d_from_buffer cl_khr_spir cl_khr_gl_event

Device Name Ellesmere Device Vendor Advanced Micro Devices, Inc. Device Vendor ID 0x1002 Device Version OpenCL 1.2 AMD-APP (2580.4) Driver Version 2580.4 Device OpenCL C Version OpenCL C 1.2 Device Type GPU Device Board Name (AMD) Radeon (TM) RX 470 Graphics Device Topology (AMD) PCI-E, 06:00.0 Device Profile FULL_PROFILE Device Available Yes Compiler Available Yes Linker Available Yes Max compute units 32 SIMD per compute unit (AMD) 4 SIMD width (AMD) 16 SIMD instruction width (AMD) 1 Max clock frequency 1250MHz Graphics IP (AMD) 8.0 Device Partition (core) Max number of sub-devices 32 Supported partition types none specified Max work item dimensions 3 Max work item sizes 1024x1024x1024 Max work group size 256 Preferred work group size (AMD) 256 Max work group size (AMD) 1024 Preferred work group size multiple 64 Wavefront width (AMD) 64 Preferred / native vector sizes
char 4 / 4
short 2 / 2
int 1 / 1
long 1 / 1
half 1 / 1 (cl_khr_fp16) float 1 / 1
double 1 / 1 (cl_khr_fp64) Half-precision Floating-point support (cl_khr_fp16) Denormals No Infinity and NANs No Round to nearest No Round to zero No Round to infinity No IEEE754-2008 fused multiply-add No Support is emulated in software No Single-precision Floating-point support (core) Denormals No Infinity and NANs Yes Round to nearest Yes Round to zero Yes Round to infinity Yes IEEE754-2008 fused multiply-add Yes Support is emulated in software No Correctly-rounded divide and sqrt operations Yes Double-precision Floating-point support (cl_khr_fp64) Denormals Yes Infinity and NANs Yes Round to nearest Yes Round to zero Yes Round to infinity Yes IEEE754-2008 fused multiply-add Yes Support is emulated in software No Address bits 64, Little-Endian Global memory size 473677824 (451.7MiB) Global free memory (AMD) 442992 (432.6MiB) Global memory channels (AMD) 8 Global memory banks per channel (AMD) 16 Global memory bank width (AMD) 256 bytes Error Correction support No Max memory allocation 222905958 (212.6MiB) Unified memory for Host and Device No Minimum alignment for any data type 128 bytes Alignment of base address 2048 bits (256 bytes) Global Memory cache type Read/Write Global Memory cache size 16384 (16KiB) Global Memory cache line size 64 bytes Image support Yes Max number of samplers per kernel 16 Max size for 1D images from buffer 134217728 pixels Max 1D or 2D image array size 2048 images Base address alignment for 2D image buffers 256 bytes Pitch alignment for 2D image buffers 256 pixels Max 2D image size 16384x16384 pixels Max 3D image size 2048x2048x2048 pixels Max number of read image args 128 Max number of write image args 8 Local memory type Local Local memory size 32768 (32KiB) Local memory syze per CU (AMD) 65536 (64KiB) Local memory banks (AMD) 32 Max number of constant args 8 Max constant buffer size 222905958 (212.6MiB) Preferred constant buffer size (AMD) 16384 (16KiB) Max size of kernel argument 1024 Queue properties
Out-of-order execution No Profiling Yes Prefer user sync for interop Yes Profiling timer resolution 1ns Profiling timer offset since Epoch (AMD) 1551205368424207059ns (Tue Feb 26 19:22:48 2019) Execution capabilities
Run OpenCL kernels Yes Run native kernels No Thread trace supported (AMD) Yes Number of async queues (AMD) 2 Max real-time compute queues (AMD) 0 Max real-time compute units (AMD) 0 SPIR versions 1.2 printf() buffer size 4194304 (4MiB) Built-in kernels
Device Extensions cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_image2d_from_buffer cl_khr_spir cl_khr_gl_event

Device Name Ellesmere Device Vendor Advanced Micro Devices, Inc. Device Vendor ID 0x1002 Device Version OpenCL 1.2 AMD-APP (2580.4) Driver Version 2580.4 Device OpenCL C Version OpenCL C 1.2 Device Type GPU Device Board Name (AMD) Radeon (TM) RX 470 Graphics Device Topology (AMD) PCI-E, 07:00.0 Device Profile FULL_PROFILE Device Available Yes Compiler Available Yes Linker Available Yes Max compute units 32 SIMD per compute unit (AMD) 4 SIMD width (AMD) 16 SIMD instruction width (AMD) 1 Max clock frequency 1256MHz Graphics IP (AMD) 8.0 Device Partition (core) Max number of sub-devices 32 Supported partition types none specified Max work item dimensions 3 Max work item sizes 1024x1024x1024 Max work group size 256 Preferred work group size (AMD) 256 Max work group size (AMD) 1024 Preferred work group size multiple 64 Wavefront width (AMD) 64 Preferred / native vector sizes
char 4 / 4
short 2 / 2
int 1 / 1
long 1 / 1
half 1 / 1 (cl_khr_fp16) float 1 / 1
double 1 / 1 (cl_khr_fp64) Half-precision Floating-point support (cl_khr_fp16) Denormals No Infinity and NANs No Round to nearest No Round to zero No Round to infinity No IEEE754-2008 fused multiply-add No Support is emulated in software No Single-precision Floating-point support (core) Denormals No Infinity and NANs Yes Round to nearest Yes Round to zero Yes Round to infinity Yes IEEE754-2008 fused multiply-add Yes Support is emulated in software No Correctly-rounded divide and sqrt operations Yes Double-precision Floating-point support (cl_khr_fp64) Denormals Yes Infinity and NANs Yes Round to nearest Yes Round to zero Yes Round to infinity Yes IEEE754-2008 fused multiply-add Yes Support is emulated in software No Address bits 64, Little-Endian Global memory size 473677824 (451.7MiB) Global free memory (AMD) 442992 (432.6MiB) Global memory channels (AMD) 8 Global memory banks per channel (AMD) 16 Global memory bank width (AMD) 256 bytes Error Correction support No Max memory allocation 222905958 (212.6MiB) Unified memory for Host and Device No Minimum alignment for any data type 128 bytes Alignment of base address 2048 bits (256 bytes) Global Memory cache type Read/Write Global Memory cache size 16384 (16KiB) Global Memory cache line size 64 bytes Image support Yes Max number of samplers per kernel 16 Max size for 1D images from buffer 134217728 pixels Max 1D or 2D image array size 2048 images Base address alignment for 2D image buffers 256 bytes Pitch alignment for 2D image buffers 256 pixels Max 2D image size 16384x16384 pixels Max 3D image size 2048x2048x2048 pixels Max number of read image args 128 Max number of write image args 8 Local memory type Local Local memory size 32768 (32KiB) Local memory syze per CU (AMD) 65536 (64KiB) Local memory banks (AMD) 32 Max number of constant args 8 Max constant buffer size 222905958 (212.6MiB) Preferred constant buffer size (AMD) 16384 (16KiB) Max size of kernel argument 1024 Queue properties
Out-of-order execution No Profiling Yes Prefer user sync for interop Yes Profiling timer resolution 1ns Profiling timer offset since Epoch (AMD) 1551205368424207059ns (Tue Feb 26 19:22:48 2019) Execution capabilities
Run OpenCL kernels Yes Run native kernels No Thread trace supported (AMD) Yes Number of async queues (AMD) 2 Max real-time compute queues (AMD) 0 Max real-time compute units (AMD) 0 SPIR versions 1.2 printf() buffer size 4194304 (4MiB) Built-in kernels
Device Extensions cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_image2d_from_buffer cl_khr_spir cl_khr_gl_event

NULL platform behavior clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) No platform clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) No platform clCreateContext(NULL, ...) [default] No platform clCreateContext(NULL, ...) [other] Success [AMD] clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT) Success (1) Platform Name AMD Accelerated Parallel Processing Device Name Ellesmere clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No devices found in platform clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) Success (6) Platform Name AMD Accelerated Parallel Processing Device Name Ellesmere Device Name Ellesmere Device Name Ellesmere Device Name Ellesmere Device Name Ellesmere Device Name Ellesmere clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) Success (6) Platform Name AMD Accelerated Parallel Processing Device Name Ellesmere Device Name Ellesmere Device Name Ellesmere Device Name Ellesmere Device Name Ellesmere Device Name Ellesmere**

Stability issue

yoburtu commented 5 years ago

With this in pools.txt, WORKS FINE: "currency" : "cryptonight_v8_zelerius",

With this in pools.txt, INVALID SHARES: "currency" : "zelerius",
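A minimal sketch of switching from the failing setting to the working one. It operates on a scratch file rather than a real pools.txt; the scratch file and its single line are assumptions made for illustration, and a real pools.txt contains other entries as well.

```shell
# Sketch only: use a scratch file so no real pools.txt is touched.
tmp="$(mktemp)"
printf '%s\n' '"currency" : "zelerius",' > "$tmp"

# Replace the coin name with the explicit algorithm name that works.
fixed="$(sed 's/"currency"[[:space:]]*:[[:space:]]*"zelerius"/"currency" : "cryptonight_v8_zelerius"/' "$tmp")"

echo "$fixed"   # prints: "currency" : "cryptonight_v8_zelerius",
rm -f "$tmp"
```

The same sed expression could be pointed at a copy of the real file once the result has been checked.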

Regards.

Spudz76 commented 5 years ago

/usr/include is usually the wrong (and worst) place for OpenCL headers: they will be Mesa/Clover junk, which is not always compatible.

The best headers are from here. Purge anything that owns /usr/include/CL (Mesa junk, opencl-dev, whatever), then put that CL folder in there; in that case, and only in that case, the /usr/include path will be fine.

Otherwise, pass the location of the Khronos CL folder (its parent directory, without the CL part itself) via -DOpenCL_INCLUDE_DIR=. Sources include the headers as <CL/whatever.h>, which is why you move the CL folder around and add only its parent, not the folder itself, to the include path.

The library seems fine, but you will probably get better results from ROCm on RX cards (amdgpu-pro is best for R9 or worse).
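The include-path advice can be sketched as follows. The location /opt/khronos/include/CL and the helper opencl_include_parent are hypothetical; OpenCL_INCLUDE_DIR and the OpenCL_LIBRARY path are the names that appear in the cmake -LA output and the build command above. The configure command is only printed, not run.

```shell
# Sketch only: /opt/khronos/include/CL is a hypothetical location
# for a copy of the Khronos CL header folder.
CL_DIR="/opt/khronos/include/CL"

# Hypothetical helper: print the directory to put on the include path,
# which is the parent of the CL folder, not the CL folder itself.
opencl_include_parent() {
    dirname "$1"
}

INCLUDE_DIR="$(opencl_include_parent "$CL_DIR")"

# Print the configure command instead of running it.
echo "cmake -DOpenCL_INCLUDE_DIR=$INCLUDE_DIR -DOpenCL_LIBRARY=/opt/amdgpu-pro/lib/x86_64-linux-gnu/libOpenCL.so -DCUDA_ENABLE=OFF .."
```

Because the sources include <CL/whatever.h>, pointing the variable at the parent lets the compiler resolve the CL/ prefix on its own.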

ghost commented 5 years ago

It works well for me using CPU only on macOS Mojave 10.14.3.

cmake .. -DCUDA_ENABLE=OFF -DOpenCL_ENABLE=OFF

pools.txt

./xmr-stak 
[2019-02-26 21:18:42] : MEMORY ALLOC FAILED: mmap failed, check attribute 'use_slow_memory' in 'config.txt'
[2019-02-26 21:18:42] : MEMORY ALLOC FAILED: mmap failed, check attribute 'use_slow_memory' in 'config.txt'
[2019-02-26 21:18:42] : MEMORY ALLOC FAILED: mmap failed, check attribute 'use_slow_memory' in 'config.txt'
[2019-02-26 21:18:42] : MEMORY ALLOC FAILED: mmap failed, check attribute 'use_slow_memory' in 'config.txt'
[2019-02-26 21:18:42] : MEMORY ALLOC FAILED: mmap failed, check attribute 'use_slow_memory' in 'config.txt'
[2019-02-26 21:18:42] : Cryptonight hash self-test NOT defined for POW cryptonight_v8_zelerius
-------------------------------------------------------------------
xmr-stak 2.8.3 e785ca1

Brought to you by fireice_uk and psychocrypt under GPLv3.
Based on CPU mining code by wolf9466 (heavily optimized by fireice_uk).

Configurable dev donation level is set to 2.0%

-------------------------------------------------------------------
You can use following keys to display reports:
'h' - hashrate
'r' - results
'c' - connection
-------------------------------------------------------------------
Upcoming xmr-stak-gui is sponsored by:
   #####   ______               ____
 ##     ## | ___ \             /  _ \
#    _    #| |_/ /_   _   ___  | / \/ _   _  _ _  _ _  ___  _ __    ___  _   _
#   |_|   #|    /| | | | / _ \ | |   | | | || '_|| '_|/ _ \| '_ \  / __|| | | |
#         #| |\ \| |_| || (_) || \_/\| |_| || |  | | |  __/| | | || (__ | |_| |
 ##     ## \_| \_|\__, | \___/ \____/ \__,_||_|  |_|  \___||_| |_| \___| \__, |
   #####           __/ |                                                  __/ |
                  |___/   https://ryo-currency.com                       |___/

This currency is a way for us to implement the ideas that we were unable to in
Monero. See https://github.com/fireice-uk/cryptonote-speedup-demo for details.
-------------------------------------------------------------------
[2019-02-26 21:18:42] : Mining coin: cryptonight_v8_zelerius
[2019-02-26 21:18:42] : WARNING on macOS thread affinity is only advisory.
[2019-02-26 21:18:42] : Starting 1x thread, affinity: 0.
[2019-02-26 21:18:42] : hwloc: set_thisthread_membind not supported
[2019-02-26 21:18:42] : WARNING on macOS thread affinity is only advisory.
[2019-02-26 21:18:42] : Starting 1x thread, affinity: 2.
[2019-02-26 21:18:43] : MEMORY ALLOC FAILED: mmap failed, check attribute 'use_slow_memory' in 'config.txt'
[2019-02-26 21:18:43] : Switch to assembler version for 'intel_avx' cpu's
[2019-02-26 21:18:43] : hwloc: set_thisthread_membind not supported
[2019-02-26 21:18:43] : MEMORY ALLOC FAILED: mmap failed, check attribute 'use_slow_memory' in 'config.txt'
[2019-02-26 21:18:43] : Switch to assembler version for 'intel_avx' cpu's
[2019-02-26 21:18:43] : Fast-connecting to xyztest.zelerius.org:9292 pool ...
[2019-02-26 21:18:43] : Pool xyztest.zelerius.org:9292 connected. Logging in...
[2019-02-26 21:18:43] : Difficulty changed. Now: 1500.
[2019-02-26 21:18:43] : Pool logged in.
[2019-02-26 21:19:11] : New block detected.
[2019-02-26 21:19:11] : Difficulty changed. Now: 804.
[2019-02-26 21:19:11] : New block detected.
[2019-02-26 21:19:25] : Result accepted by the pool.
[2019-02-26 21:19:25] : New block detected.
[2019-02-26 21:19:36] : Result accepted by the pool.
[2019-02-26 21:19:36] : New block detected.
[2019-02-26 21:19:41] : Difficulty changed. Now: 965.
[2019-02-26 21:19:41] : New block detected.

psychocrypt commented 5 years ago

The currency zelerius is configured to fork from cryptonight_v8 to cryptonight_v8_zelerius with the block version 7. It could be that this coin has no block-information within the job. In that case the miner will not be able to fork. Please use cryptonight_v8_zelerius. I will keep this issue open as todo.
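The fork behaviour described above can be sketched roughly as follows. This is only an illustration, not xmr-stak's actual code: the function `select_algorithm` and its `block_version` argument are hypothetical names, and treating "version 7 or higher" as post-fork is an assumption. Only the fork version 7 and the two algorithm names come from the comment.

```python
from typing import Optional

# Values taken from the comment above; everything else is a hypothetical sketch.
FORK_VERSION = 7
ALGO_BEFORE_FORK = "cryptonight_v8"
ALGO_AFTER_FORK = "cryptonight_v8_zelerius"


def select_algorithm(block_version: Optional[int]) -> str:
    """Pick the POW algorithm from the block version carried by a pool job.

    If the job carries no block version (None), the miner cannot tell that
    the fork happened and stays on the pre-fork algorithm.
    """
    if block_version is None:
        return ALGO_BEFORE_FORK
    # Assumption: every block version at or above the fork version is post-fork.
    if block_version >= FORK_VERSION:
        return ALGO_AFTER_FORK
    return ALGO_BEFORE_FORK


if __name__ == "__main__":
    for version in (6, 7, None):
        print(version, "->", select_algorithm(version))
```

Under these assumptions, a job without block information always maps to cryptonight_v8, which would explain why setting cryptonight_v8_zelerius explicitly is the suggested workaround.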

yoburtu commented 5 years ago

> The currency zelerius is configured to fork from cryptonight_v8 to cryptonight_v8_zelerius with the block version 7. It could be that this coin has no block-information within the job. In that case the miner will not be able to fork. Please use cryptonight_v8_zelerius. I will keep this issue open as todo.

I have tested with "currency" : "zelerius" using CPU only, and it works fine.

Best regards.

ghost commented 5 years ago

Yes, some users have reported problems with AMD cards. It seems the automatic switch doesn't work well there, so maybe there is a problem in the AMD code. I'm testing it here: https://xyztest.zelerius.org, but I don't have an AMD card.

@yoburtu It's strange, because the log shows that the zelerius option does detect the switch to the cryptonight_v8_zelerius algorithm:

[2019-02-26 20:53:39] : Cryptonight hash self-test NOT defined for POW cryptonight_v8_zelerius

Pilott73 commented 5 years ago

New miner: xmr-stak-win64-2.9.0, AMD GPU. This is the output with the "zelerius" option:

[2019-03-06 08:59:54] : Cryptonight hash self-test NOT defined for POW cryptonight_v8_zelerius
[2019-03-06 08:59:54] : Cryptonight hash self-test failed. This might be caused by bad compiler optimizations.
[2019-03-06 08:59:54] : Self test not passed!

With the option set to "cryptonight_v8_zelerius" in pools.txt, it WORKS FINE.

Regards.

Pilott73 commented 5 years ago

It is the same with an Nvidia GPU.

ghost commented 5 years ago

It seems the self-test for zelerius is not defined. It should be as follows:

cryptonight_v8_zelerius

("This is a test This is a test This is a test") ("\x64\x8b\xdf\xaa\xf0\x54\x4a\x7e\xdc\xa6\x08\xc0\x6d\xea\xed\x66\xd8\x98\x45\x12\xc7\x51\x5b\x3d\x7b\xbc\x4e\x82\x4b\xe8\xc3\x53", 32)
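As a rough illustration of how this test vector could be checked, here is a minimal sketch. It does not implement cryptonight_v8_zelerius; `self_test` and its `hash_fn` parameter are hypothetical names for a check around whatever hash implementation is under test. Only the input string and the 32 expected bytes come from the vector above.

```python
# Test vector quoted in the comment above.
TEST_INPUT = b"This is a test This is a test This is a test"
EXPECTED_HASH = bytes.fromhex(
    "648bdfaaf0544a7edca608c06deaed66"
    "d8984512c7515b3d7bbc4e824be8c353"
)


def self_test(hash_fn) -> bool:
    """Return True if hash_fn reproduces the expected 32-byte digest.

    hash_fn is a hypothetical callable (bytes -> bytes) standing in for a
    cryptonight_v8_zelerius implementation, which is not provided here.
    """
    return hash_fn(TEST_INPUT) == EXPECTED_HASH


if __name__ == "__main__":
    # Show the sizes of the input and the expected digest, then the digest as hex.
    print(len(TEST_INPUT), len(EXPECTED_HASH))
    print(EXPECTED_HASH.hex())
```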

Thank you.

yoburtu commented 5 years ago

> The currency zelerius is configured to fork from cryptonight_v8 to cryptonight_v8_zelerius with the block version 7. It could be that this coin has no block-information within the job. In that case the miner will not be able to fork. Please use cryptonight_v8_zelerius. I will keep this issue open as todo.

Hi,

Any news about the automatic switch to the Zelerius algorithm?

Best regards.

psychocrypt commented 5 years ago

Sry no I need all my time for the upcoming monero fork.

yoburtu commented 5 years ago

> Sry no I need all my time for the upcoming monero fork.

Hi. What about this problem? Any news about the automatic switch to the Zelerius algorithm?

The fork will be at block 534800, in 11 hours, more or less.

Best regards.