Closed by kkkrackpot 5 years ago
CUDA 9.0 is broken. Please don't use it. Please try 8.0 or 9.1. Also, try ccminer 8.17.
@KlausT I will try that, thanks!
Recompiled cuda9 branch from the latest git. Same thing:
$ ./ccminer --benchmark -a x17
ccminer 8.18-KlausT (64bit) for nVidia GPUs
Compiled with GCC 6.4 using Nvidia CUDA Toolkit 9.1
Based on pooler cpuminer 2.3.2 and the tpruvot@github fork
CUDA support by Christian Buchner, Christian H. and DJM34
Includes optimizations implemented by sp-hash, klaust, tpruvot and tsiv.
0
[2017-12-29 22:14:51] 1 miner thread started, using 'x17' algorithm.
Cuda error in func 'x17_haval256_cpu_hash_64' at line 343 : an illegal memory access was encountered.
ccminer --benchmark doesn't seem to work either.
PS. dmesg says:
[ 3530.937103] NVRM: GPU Board Serial Number:
[ 3530.937105] NVRM: Xid (PCI:0000:01:00): 31, Ch 00000020, engmask 00000101, intr 10000000
[ 3928.388993] NVRM: Xid (PCI:0000:01:00): 31, Ch 00000020, engmask 00000101, intr 10000000
UPD. I have memory operations on (NVreg_EnableStreamMemOPs=1, but doubt it matters).
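A small sketch for pulling the Xid code out of kernel log lines like the ones above. NVIDIA's Xid documentation lists 31 as a GPU memory page fault, which is consistent with the "illegal memory access" message from the miner. The helper name xid_codes is made up for illustration; it is not part of ccminer or the driver.

```shell
#!/bin/sh
# xid_codes (hypothetical helper): read kernel log text on stdin and
# print each distinct NVIDIA Xid code once, in numeric order.
xid_codes() {
    # Matching lines look like:
    #   NVRM: Xid (PCI:0000:01:00): 31, Ch 00000020, engmask ...
    # The sed expression keeps only the number after the PCI address.
    sed -n 's/.*NVRM: Xid ([^)]*): \([0-9][0-9]*\),.*/\1/p' | sort -un
}

# Typical use on a live system (reading the kernel log may need root):
#   dmesg | xid_codes
```

If this prints anything other than 31, the failure is probably a different class of problem than the one discussed here.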
It looks like only Linux users are having problems. I really don't know why.
I'm trying the cuda9 branch as well and get the same error.
I built the cuda9 branch on Arch, using this:
./autogen.sh
./configure CUDA_CFLAGS='--shared --compiler-options "-fPIC"' \
--prefix=/usr --sysconfdir=/etc --libdir=/usr/lib --with-cuda=/opt/cuda
make
$ ./ccminer -a x17 -o stratum+tcp://x17.mine.ahashpool.com:3737 -u xxx -p ID=Rig01,c=BTC,d=0.084
ccminer 8.18-KlausT (64bit) for nVidia GPUs
Compiled with GCC 7.2 using Nvidia CUDA Toolkit 9.1
Based on pooler cpuminer 2.3.2 and the tpruvot@github fork
CUDA support by Christian Buchner, Christian H. and DJM34
Includes optimizations implemented by sp-hash, klaust, tpruvot and tsiv.
0
[2018-01-05 12:34:27] 1 miner thread started, using 'x17' algorithm.
[2018-01-05 12:34:27] Starting Stratum on stratum+tcp://x17.mine.ahashpool.com:3737
[2018-01-05 12:34:27] Stratum difficulty set to 0.24
[2018-01-05 12:34:27] x17.mine.ahashpool.com:3737 x17 block 1775732
Cuda error in func 'x17_haval256_cpu_hash_64' at line 343 : an illegal memory access was encountered.
[2018-01-05 12:34:30] stopping 1 threads
[2018-01-05 12:34:35] resetting GPUs
[2018-01-05 12:34:37] stopping 1 threads
and dmesg:
[63949.817123] NVRM: GPU at PCI:0000:01:00: GPU-d11cb714-e4bd-d74f-bd9c-69f08648aa05
[63949.817126] NVRM: GPU Board Serial Number:
[63949.817127] NVRM: Xid (PCI:0000:01:00): 31, Ch 00000018, engmask 00000101, intr 10000000
@kkkrackpot did you find a solution?
@jeremi Nope, I didn't find any solution yet... I suspect something needs fixing in that cuda kernel, but I don't know cuda that much.
@KlausT I also tried ccminer 8.17 as you suggested, and I have the same issue.
After recent commits it still fails (and segfaults) on x17:
ccminer 8.18-KlausT (64bit) for nVidia GPUs
Compiled with GCC 6.4 using Nvidia CUDA Toolkit 9.1
Based on pooler cpuminer 2.3.2 and the tpruvot@github fork
CUDA support by Christian Buchner, Christian H. and DJM34
Includes optimizations implemented by sp-hash, klaust, tpruvot and tsiv.
0
[2018-01-07 00:50:43] 1 miner thread started, using 'x17' algorithm.
Cuda error in func 'x17_haval256_cpu_hash_64' at line 343 : an illegal memory access was encountered.
dmesg says:
[29195.923021] NVRM: Xid (PCI:0000:01:00): 31, Ch 00000011, engmask 00000101, intr 10000000
[29206.981778] traps: ccminer[26632] general protection ip:7f2790263a55 sp:7f27878316b0 error:0 in libc-2.25.so[7f279022e000+18b000]
PS. Similar thing with qubit (but without a segfault):
Cuda error in func 'x11_echo512_cpu_hash_64_final' at line 674 : an illegal memory access was encountered.
Could you please test the memorydebug branch with X17: https://github.com/KlausT/ccminer/tree/memorydebug
Then I can see if it's in the haval kernel, or if it's the memcpy. I can't see any problem in the code there.
@KlausT Tried that with --debug, it says:
[2018-01-07 11:29:05] Binding thread 0 to cpu 0 (mask 1)
0
[2018-01-07 11:29:05] 1 miner thread started, using 'x17' algorithm.
[2018-01-07 11:29:05] thread 0: new work
[2018-01-07 11:29:05] GPU #0: start=00000000 end=017d7840 range=017d7841
Cuda error in func 'x11_simd512_cpu_hash_64' at line 797 : an illegal memory access was encountered.
[2018-01-07 11:29:06] stopping 1 threads
....
[ 804.941804] NVRM: Xid (PCI:0000:01:00): 31, Ch 00000018, engmask 00000101, intr 10000000
PS. I have GTX 1060, so I build only for compute=61,sm=61
PPS. Also cuda-memcheck output is here https://pastebin.com/pupR44Hg (unfortunately, I don't know how to debug cuda properly)
UPD. I recompiled the miner with nvcc -G -g, now cuda-memcheck gives another error:
========= Invalid __global__ write of size 4
========= at 0x00035988 in /tmp/ccminer/./x11/simd_functions.cu:1310:x11_simd512_gpu_compress1_64(unsigned int, unsigned int, unsigned long*, uint4*, unsigned int*)
========= by thread (64,0,0) in block (0,0,0)
========= Address 0x00000000 is out of bounds
========= Saved host backtrace up to driver entry point at kernel launch time
========= Host Frame:/usr/lib64/libcuda.so.1 (cuLaunchKernel + 0x2cd) [0x22b12d]
========= Host Frame:/opt/cuda/lib64/libcudart.so.9.1 [0x15f70]
========= Host Frame:/opt/cuda/lib64/libcudart.so.9.1 (cudaLaunch + 0x14e) [0x347be]
========= Host Frame:./ccminer [0x16176d]
========= Host Frame:./ccminer [0x16103e]
========= Host Frame:./ccminer [0x161085]
========= Host Frame:./ccminer [0x160af4]
========= Host Frame:./ccminer [0x1683fb]
========= Host Frame:./ccminer [0xa550]
========= Host Frame:/lib64/libpthread.so.0 [0x73e4]
========= Host Frame:/lib64/libc.so.6 (clone + 0x3f) [0xe86ff]
=========
========= Program hit cudaErrorLaunchFailure (error 4) due to "unspecified launch failure" on CUDA API call to cudaDeviceSynchronize.
========= Saved host backtrace up to driver entry point at error
========= Host Frame:/usr/lib64/libcuda.so.1 [0x3110a3]
========= Host Frame:/opt/cuda/lib64/libcudart.so.9.1 (cudaDeviceSynchronize + 0x180) [0x34ba0]
========= Host Frame:./ccminer [0x160af9]
========= Host Frame:./ccminer [0x1683fb]
Cuda error in func 'x11_simd512_cpu_hash_64' at line 797 : unspecified launch failure.
========= Host Frame:./ccminer [0xa550]
========= Host Frame:/lib64/libpthread.so.0 [0x73e4]
========= Host Frame:/lib64/libc.so.6 (clone + 0x3f) [0xe86ff]
=========
[2018-01-07 12:02:42] stopping 1 threads
[2018-01-07 12:02:47] resetting GPUs
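Note that the host-side "Cuda error in func ..." message names the wrapper where the error was detected, while cuda-memcheck names the kernel that actually faulted; the two can differ, presumably because kernel launches are asynchronous and a fault only surfaces at a later synchronizing call. A rough sketch for pulling the faulting access and its location out of a long cuda-memcheck report follows. The helper name memcheck_faults is made up for illustration, and file:line information only appears when the binary was built with device debug info (as with nvcc -G -g above).

```shell
#!/bin/sh
# memcheck_faults (hypothetical helper): read cuda-memcheck output on stdin
# and print one line per invalid access: "<access> | <location>".
memcheck_faults() {
    awk '
        # Remember the access description, e.g. "Invalid __global__ write of size 4".
        /^========= Invalid / {
            access = $0
            sub(/^=========[ \t]*/, "", access)
            pending = 1
            next
        }
        # The following "at ..." line carries the address, source file and kernel.
        pending && /^=========[ \t]+at / {
            loc = $0
            sub(/^=========[ \t]+at[ \t]+/, "", loc)
            print access " | " loc
            pending = 0
        }
    '
}

# Typical use (command line taken from this thread):
#   cuda-memcheck ./ccminer -a x17 --benchmark --debug 2>&1 | memcheck_faults
```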
ok, so it wasn't the haval kernel at all. I will look at the simd kernel then.
Your Pascal card is using the Kepler kernels? Wtf
ok, x17 on Maxwell and Pascal cards should work now, I hope.
Thanks @KlausT. I no longer see the original error, but there still seems to be an issue (cuda9 branch):
I see messages like: GPU #0: result for 0ea1bb0c does not validate on CPU!
and never get confirmation from the pool.
ccminer 8.18-KlausT (64bit) for nVidia GPUs
Compiled with GCC 7.2 using Nvidia CUDA Toolkit 9.1
Based on pooler cpuminer 2.3.2 and the tpruvot@github fork
CUDA support by Christian Buchner, Christian H. and DJM34
Includes optimizations implemented by sp-hash, klaust, tpruvot and tsiv.
0
[2018-01-08 11:28:50] Starting Stratum on stratum+tcp://x17.mine.ahashpool.com:3737
[2018-01-08 11:28:50] 1 miner thread started, using 'x17' algorithm.
[2018-01-08 11:28:50] Stratum difficulty set to 0.24
[2018-01-08 11:28:53] GPU #0: waiting for data
[2018-01-08 11:28:55] x17.mine.ahashpool.com:3737 x17 block 1783350
[2018-01-08 11:29:20] GPU #0: (null), 15.50 MH/s
[2018-01-08 11:29:22] x17.mine.ahashpool.com:3737 x17 block 1783351
[2018-01-08 11:29:22] GPU #0: (null), 15.51 MH/s
[2018-01-08 11:29:38] x17.mine.ahashpool.com:3737 x17 block 1783352
[2018-01-08 11:29:38] GPU #0: (null), 15.27 MH/s
[2018-01-08 11:29:51] x17.mine.ahashpool.com:3737 x17 block 1783352
[2018-01-08 11:29:51] GPU #0: (null), 15.43 MH/s
[2018-01-08 11:30:13] x17.mine.ahashpool.com:3737 x17 block 1783352
[2018-01-08 11:30:13] GPU #0: (null), 15.25 MH/s
[2018-01-08 11:30:32] x17.mine.ahashpool.com:3737 x17 block 1783353
[2018-01-08 11:30:32] GPU #0: (null), 15.31 MH/s
[2018-01-08 11:30:36] x17.mine.ahashpool.com:3737 x17 block 1783354
[2018-01-08 11:30:36] GPU #0: (null), 15.40 MH/s
[2018-01-08 11:30:48] x17.mine.ahashpool.com:3737 x17 block 1783355
[2018-01-08 11:30:48] GPU #0: (null), 15.47 MH/s
[2018-01-08 11:30:59] x17.mine.ahashpool.com:3737 x17 block 1783356
[2018-01-08 11:30:59] GPU #0: (null), 15.49 MH/s
[2018-01-08 11:31:03] x17.mine.ahashpool.com:3737 x17 block 1783357
[2018-01-08 11:31:03] GPU #0: (null), 15.44 MH/s
[2018-01-08 11:31:18] x17.mine.ahashpool.com:3737 x17 block 1783358
[2018-01-08 11:31:18] GPU #0: (null), 15.48 MH/s
[2018-01-08 11:31:23] x17.mine.ahashpool.com:3737 x17 block 1783359
[2018-01-08 11:31:23] GPU #0: (null), 15.46 MH/s
[2018-01-08 11:31:25] x17.mine.ahashpool.com:3737 x17 block 1783360
[2018-01-08 11:31:25] GPU #0: (null), 15.52 MH/s
[2018-01-08 11:31:28] x17.mine.ahashpool.com:3737 x17 block 1783361
[2018-01-08 11:31:28] GPU #0: (null), 15.54 MH/s
[2018-01-08 11:31:33] x17.mine.ahashpool.com:3737 x17 block 1783362
[2018-01-08 11:31:33] GPU #0: (null), 15.49 MH/s
[2018-01-08 11:31:43] x17.mine.ahashpool.com:3737 x17 block 1783363
[2018-01-08 11:31:43] GPU #0: (null), 15.52 MH/s
[2018-01-08 11:31:54] x17.mine.ahashpool.com:3737 x17 block 1783364
[2018-01-08 11:31:54] GPU #0: (null), 15.44 MH/s
[2018-01-08 11:32:10] GPU #0: result for 0ea1bb0c does not validate on CPU!
[2018-01-08 11:32:15] x17.mine.ahashpool.com:3737 x17 block 1783365
[2018-01-08 11:32:15] GPU #0: (null), 15.51 MH/s
[2018-01-08 11:32:20] x17.mine.ahashpool.com:3737 x17 block 1783366
[2018-01-08 11:32:20] GPU #0: (null), 15.44 MH/s
[2018-01-08 11:32:29] x17.mine.ahashpool.com:3737 x17 block 1783366
[2018-01-08 11:32:29] GPU #0: (null), 15.37 MH/s
[2018-01-08 11:32:40] x17.mine.ahashpool.com:3737 x17 block 1783366
[2018-01-08 11:32:40] GPU #0: (null), 15.37 MH/s
[2018-01-08 11:32:41] x17.mine.ahashpool.com:3737 x17 block 1783367
[2018-01-08 11:32:41] GPU #0: (null), 15.52 MH/s
[2018-01-08 11:32:51] x17.mine.ahashpool.com:3737 x17 block 1783367
[2018-01-08 11:32:51] GPU #0: (null), 15.32 MH/s
[2018-01-08 11:33:02] x17.mine.ahashpool.com:3737 x17 block 1783367
[2018-01-08 11:33:02] GPU #0: (null), 15.06 MH/s
[2018-01-08 11:33:02] GPU #0: result for 00758499 does not validate on CPU!
[2018-01-08 11:33:24] x17.mine.ahashpool.com:3737 x17 block 1783367
[2018-01-08 11:33:24] GPU #0: (null), 15.32 MH/s
[2018-01-08 11:33:25] GPU #0: result for 00d985ca does not validate on CPU!
[2018-01-08 11:33:34] x17.mine.ahashpool.com:3737 x17 block 1783368
[2018-01-08 11:33:34] GPU #0: (null), 15.37 MH/s
[2018-01-08 11:33:59] GPU #0: (null), 15.40 MH/s
[2018-01-08 11:34:00] GPU #0: result for 00bdc4c2 does not validate on CPU!
^C[2018-01-08 11:34:00] SIGINT received, exiting
[2018-01-08 11:34:00] stopping 1 threads
[2018-01-08 11:34:00] resetting GPUs
What happens when you use the option --no-cpu-verify?
It seems to work:
[2018-01-08 14:58:33] Starting Stratum on stratum+tcp://x17.mine.ahashpool.com:3737
[2018-01-08 14:58:33] Stratum difficulty set to 0.24
[2018-01-08 14:58:36] GPU #0: waiting for data
[2018-01-08 14:58:39] GPU #0: waiting for data
[2018-01-08 14:58:42] GPU #0: waiting for data
[2018-01-08 14:58:43] x17.mine.ahashpool.com:3737 x17 block 1783738
[2018-01-08 14:58:48] x17.mine.ahashpool.com:3737 x17 block 1783739
[2018-01-08 14:58:48] GPU #0: (null), 15.71 MH/s
[2018-01-08 14:59:05] x17.mine.ahashpool.com:3737 x17 block 1783739
[2018-01-08 14:59:05] GPU #0: (null), 15.63 MH/s
[2018-01-08 14:59:16] x17.mine.ahashpool.com:3737 x17 block 1783739
[2018-01-08 14:59:16] GPU #0: (null), 15.59 MH/s
[2018-01-08 14:59:21] GPU #0: (null), 14.74 MH/s
[2018-01-08 14:59:21] accepted: 1/1 (100.00%), 15.42 MH/s yay!!!
[2018-01-08 14:59:25] x17.mine.ahashpool.com:3737 x17 block 1783740
[2018-01-08 14:59:25] GPU #0: (null), 14.99 MH/s
[2018-01-08 14:59:38] x17.mine.ahashpool.com:3737 x17 block 1783740
[2018-01-08 14:59:38] GPU #0: (null), 15.09 MH/s
[2018-01-08 14:59:45] x17.mine.ahashpool.com:3737 x17 block 1783741
[2018-01-08 14:59:45] GPU #0: (null), 15.10 MH/s
[2018-01-08 15:00:10] GPU #0: (null), 15.10 MH/s
[2018-01-08 15:00:11] x17.mine.ahashpool.com:3737 x17 block 1783741
[2018-01-08 15:00:11] GPU #0: (null), 15.19 MH/s
[2018-01-08 15:00:21] x17.mine.ahashpool.com:3737 x17 block 1783742
[2018-01-08 15:00:21] GPU #0: (null), 15.60 MH/s
[2018-01-08 15:00:34] x17.mine.ahashpool.com:3737 x17 block 1783743
[2018-01-08 15:00:34] GPU #0: (null), 15.60 MH/s
[2018-01-08 15:00:37] x17.mine.ahashpool.com:3737 x17 block 1783744
[2018-01-08 15:00:37] GPU #0: (null), 15.59 MH/s
[2018-01-08 15:00:46] x17.mine.ahashpool.com:3737 x17 block 1783744
[2018-01-08 15:00:46] GPU #0: (null), 15.58 MH/s
[2018-01-08 15:00:48] GPU #0: (null), 15.64 MH/s
[2018-01-08 15:00:48] accepted: 2/2 (100.00%), 15.37 MH/s yay!!!
Thanks!
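To compare a run with and without --no-cpu-verify, it can help to count the relevant lines in a saved log instead of scrolling through it. The sketch below assumes the log format shown in this thread ("does not validate on CPU!" and "accepted: N/M ..." lines); the helper name share_summary and the file name run.log are made up for illustration.

```shell
#!/bin/sh
# share_summary (hypothetical helper): read a ccminer log on stdin and
# report how many results failed CPU validation and how many
# "accepted:" lines the pool produced.
share_summary() {
    awk '
        /does not validate on CPU/ { bad++ }
        /accepted: /               { ok++ }
        END { printf "cpu_rejected=%d accepted_lines=%d\n", bad, ok }
    '
}

# Typical use:
#   ./ccminer -a x17 -o <pool url> -u <user> -p <pass> 2>&1 | tee run.log
#   share_summary < run.log
```

A run with many cpu_rejected results and no accepted lines matches the first log above; the second log would show the opposite.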
@KlausT I really don't know why my 1060 uses a Kepler kernel. Moreover, I shouldn't even have one, because before builds I manually edit Makefile.am to have 61 only...
With the recent memorydebug branch I get the same results as @jeremi. Also, it seems to find the right kernel:
[2018-01-08 19:52:55] 1 miner thread started, using 'x17' algorithm.
0
[2018-01-08 19:52:55] GPU #0: device_sm = 610
[2018-01-08 19:52:58] GPU #0 Found nonce 00f3d2a3
[2018-01-08 19:53:02] GPU #0 Found nonce 021ad1ab
[2018-01-08 19:53:02] GPU #0: (null), 5149.15 kH/s
[2018-01-08 19:53:02] Total: 5149.15 kH/s
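The device_sm = 610 value in the log is consistent with compute capability 6.1 (the GTX 1060) encoded as major*100 + minor*10. That encoding is an assumption read off the log, not taken from the ccminer source. Under that assumption, a sketch of how such a value maps to a GPU generation, which is what decides between Kepler, Maxwell and Pascal kernels; both function names are made up for illustration.

```shell
#!/bin/sh
# device_sm (hypothetical): encode a compute capability "major minor"
# the way the log value 610 suggests (assumed: major*100 + minor*10).
device_sm() {
    echo $(( $1 * 100 + $2 * 10 ))
}

# arch_family (hypothetical): map the encoded value to a GPU generation.
# Compute capability 3.x is Kepler, 5.x is Maxwell, 6.x is Pascal.
arch_family() {
    sm=$1
    if   [ "$sm" -ge 600 ] && [ "$sm" -lt 700 ]; then echo Pascal
    elif [ "$sm" -ge 500 ] && [ "$sm" -lt 600 ]; then echo Maxwell
    elif [ "$sm" -ge 300 ] && [ "$sm" -lt 400 ]; then echo Kepler
    else echo other
    fi
}

# Example: arch_family "$(device_sm 6 1)"
```

A Pascal card ending up on the Kepler code path, as reported earlier in the thread, would correspond to this kind of selection going wrong.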
PS. qubit benchmark now works too, same way as x17. lyra2v2 is still broken.
PPS. Will you merge these patches into the main branch, or does it need more testing?
BTW, is it possible to add something to make the miner print the default intensity when started without the -i option?
I have a similar problem.
ccminer$ ./ccminer -a x17 --benchmark --debug
ccminer 8.18-KlausT (64bit) for nVidia GPUs
Compiled with GCC 5.4 using Nvidia CUDA Toolkit 9.1
Based on pooler cpuminer 2.3.2 and the tpruvot@github fork
CUDA support by Christian Buchner, Christian H. and DJM34
Includes optimizations implemented by sp-hash, klaust, tpruvot and tsiv.
[2018-01-08 21:25:47] NVML GPU monitoring enabled.
[2018-01-08 21:25:47] 1 miner thread started, using 'x17' algorithm.
[2018-01-08 21:25:47] Binding thread 0 to cpu 0 (mask 1)
0
[2018-01-08 21:25:47] thread 0: new work
[2018-01-08 21:25:47] GPU #0: start=00000000 end=017d7840 range=017d7841
Cuda error in func 'x17_haval256_cpu_hash_64' at line 343 : an illegal memory access was encountered.
[2018-01-08 21:25:50] stopping 1 threads
[2018-01-08 21:25:55] resetting GPUs
ccminer$ cuda-memcheck ./ccminer -a x17 --benchmark --debug
========= CUDA-MEMCHECK
ccminer 8.18-KlausT (64bit) for nVidia GPUs
Compiled with GCC 5.4 using Nvidia CUDA Toolkit 9.1
Based on pooler cpuminer 2.3.2 and the tpruvot@github fork
CUDA support by Christian Buchner, Christian H. and DJM34
Includes optimizations implemented by sp-hash, klaust, tpruvot and tsiv.
[2018-01-08 21:26:03] NVML GPU monitoring enabled.
[2018-01-08 21:26:03] 1 miner thread started, using 'x17' algorithm.
[2018-01-08 21:26:03] Binding thread 0 to cpu 0 (mask 1)
0
[2018-01-08 21:26:03] thread 0: new work
[2018-01-08 21:26:03] GPU #0: start=00000000 end=017d7840 range=017d7841
Cuda error in func 'x11_luffaCubehash512_cpu_hash_64' at line 1277 : the launch timed out and was terminated.
[2018-01-08 21:26:15] stopping 1 threads
[2018-01-08 21:26:20] resetting GPUs
========= ERROR SUMMARY: 0 errors
I'm not familiar with CUDA, but I want to fix this. Could you help me take a first step toward resolving this problem?
P.S.: The last change from git log is 'Sun Jan 7 23:07:14 2018 +0100'
The latest commits should fix the x17 problem, but you have to use the --no-cpu-verify option for now. For some reason a part of the CPU code is broken under Linux. Maybe it's a GCC issue; version 7 is not really compatible.
Hi,
Self-compiled ccminer on Linux, both windows and cuda-9 branches:
UPD. Benchmark mode seems "broken" in general: ccminer --benchmark starts benching bitcoin, warning about bad CPU validation, and that's all it does...