Illegal memory access on GTX 1060 6g

rumatoest commented 7 years ago

Really strange issue. ccminer fails to start on an Asus DUAL GTX 1060 6G with the following error:

GPU 1: an illegal memory access was encountered
cryptonight/cuda_cryptonight_core.cu line 255

But it does not fail if the GPU is already mining Ethereum.

So I can start Ethereum mining on this GPU, then start ccminer, then stop the Ethereum miner. And WTF, it works.

paulomalvar commented 7 years ago

I get the same error using two TITAN X (Pascal) (28 SMX) cards mining Monero, but in my case it fails. This morning I installed the newest Nvidia driver (cuda_8.0.61_375.26_linux.run) and cuDNN (cudnn-8.0-linux-x64-v6.0.tgz). I have to use that version of cuDNN because it is the only one that TensorFlow 1.3.0 accepts. So I rebuilt ccminer-cryptonight, with no errors, and now I get that "illegal memory access" error.

KlausT commented 7 years ago

Do you still get this error when using --bfactor=0 ?

paulomalvar commented 7 years ago

Yes, I still do:

[2017-09-12 09:46:57] Starting Stratum on stratum+tcp://pool.minexmr.com:5555
[2017-09-12 09:46:57] 2 miner threads started
[2017-09-12 09:46:57] GPU #0: TITAN X (Pascal) (28 SMX), using 112 blocks of 32 threads
[2017-09-12 09:46:57] GPU #1: TITAN X (Pascal) (28 SMX), using 112 blocks of 32 threads
[2017-09-12 09:46:57] Pool set diff to 15000
[2017-09-12 09:46:57] Stratum detected new block

GPU 0: an illegal memory access was encountered
cryptonight/cuda_cryptonight_core.cu line 255

GPU 1: OS call failed or operation not supported on this OS
cryptonight/cryptonight.cu line 218

rumatoest commented 7 years ago

Me too

    *** ccminer-cryptonight 2.04 (64 bit) for nVidia GPUs by tsiv and KlausT 
    *** Built with GCC 5.4 using the Nvidia CUDA Toolkit 8.0

 tsiv's BTC donation address:   1JHDKp59t1RhHFXsTw2UQpR3F9BBz3R3cs
 KlausT's BTC donation address: 1QHH2dibyYL5iyMDk3UN4PVvFVtrWD8QKp
 for more donation addresses please read the README.txt
-----------------------------------------------------------------
[2017-09-12 19:58:34] Starting Stratum on stratum+tcp://xdn-xmr.pool.minergate.com:45790
[2017-09-12 19:58:34] 1 miner threads started
[2017-09-12 19:58:34] GPU #1: GeForce GTX 1060 6GB (10 SMX), using 40 blocks of 64 threads
[2017-09-12 19:58:34] Pool set diff to 1063
[2017-09-12 19:58:34] Stratum detected new block

GPU 1: an illegal memory access was encountered
cryptonight/cuda_cryptonight_core.cu line 255

rumatoest commented 7 years ago

I do not know how to debug it, but if you help me, I will.

KlausT commented 7 years ago

It looks like it's only causing problems under Linux. I'm using Windows and it works just fine with my 1070. Well, maybe I'll get lucky and find this bug anyway.

rumatoest commented 7 years ago

This is a card-specific issue. It works on my setups with a 1060 3G and a 1070 8G, but it fails on the 1060 6G. Maybe on the 6G GPU some wrong memory offset is being passed. The curious part is that if the card is already under heavy load, the miner starts without issues.

Could you suggest how I can debug it and gather information when the miner starts and fails?

KlausT commented 7 years ago

I have created the branch "memdebug": https://github.com/KlausT/ccminer-cryptonight/tree/memdebug
When you compile and run it you can see how much memory ccminer thinks the card has, and the size of the biggest memory block that's being allocated. I don't think we will see anything interesting there, but you never know ...

rumatoest commented 7 years ago

OK, this is a successful start while the Ethereum miner is running:

[2017-09-13 11:02:51] GPU #1: 5978718208 bytes free, 8506769408 bytes total
[2017-09-13 11:02:51] GPU #1: 5978718208 bytes free, 8506769408 bytes total
[2017-09-13 11:02:51] GPU #2: 5978718208 bytes free, 8506769408 bytes total
[2017-09-13 11:02:51] GPU #3: 5978718208 bytes free, 8506769408 bytes total
[2017-09-13 11:02:51] GPU #1: GeForce GTX 1060 6GB (10 SMX), using 40 blocks of 32 threads
[2017-09-13 11:02:51] 1 miner threads started
[2017-09-13 11:02:51] Starting Stratum on stratum+tcp://xdn-xmr.pool.minergate.com:45790
[2017-09-13 11:02:51] Pool set diff to 1063
[2017-09-13 11:02:51] Stratum detected new block
[2017-09-13 11:02:52] GPU #1: allocating 2684354560 bytes (d_long_state)
[2017-09-13 11:02:57] GPU #1: GeForce GTX 1060 6GB, 199.75 H/s (199.74 H/s avg)

This is a cold start with no GPU load:

[2017-09-13 11:04:01] GPU #1: 8393588736 bytes free, 8506769408 bytes total
[2017-09-13 11:04:01] GPU #1: 8393588736 bytes free, 8506769408 bytes total
[2017-09-13 11:04:01] GPU #2: 8393588736 bytes free, 8506769408 bytes total
[2017-09-13 11:04:01] GPU #3: 8393588736 bytes free, 8506769408 bytes total
[2017-09-13 11:04:01] 1 miner threads started
[2017-09-13 11:04:01] GPU #1: GeForce GTX 1060 6GB (10 SMX), using 40 blocks of 64 threads
[2017-09-13 11:04:01] Starting Stratum on stratum+tcp://xdn-xmr.pool.minergate.com:45790
[2017-09-13 11:04:01] GPU #1: allocating 1073741824 bytes (d_long_state)
[2017-09-13 11:04:01] Pool set diff to 1063
[2017-09-13 11:04:01] Stratum detected new block

GPU 1: an illegal memory access was encountered
cryptonight/cuda_cryptonight_core.cu line 246

Look: 8393588736 bytes is about 8G, am I correct? But my GPU is 6G. It looks like this miner cannot detect the correct GPU memory when different GPUs (with different amounts of memory) are installed in the same rig.
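
A side observation on the two logs above, offered as a hypothesis rather than a diagnosis. CryptoNight uses a 2 MiB scratchpad per hash, and the d_long_state size in the working run matches blocks × threads × 2 MiB: 40 × 32 threads gives 2684354560 bytes. The failing cold start used 40 × 64 threads, which would need 5368709120 bytes, yet the log shows 1073741824, which is exactly that value truncated to 32 bits. The plain C++ sketch below reproduces the arithmetic. The function names are invented for illustration, and the assumption that ccminer sizes d_long_state this way comes only from the numbers in the logs, not from its sources.

```cpp
#include <cstdint>

// CryptoNight scratchpad size per hash: 2 MiB.
constexpr std::uint64_t kScratchpadBytes = 2ULL * 1024ULL * 1024ULL;

// Bytes needed for one scratchpad per GPU thread, computed in 64 bits.
// Assumption: this mirrors how d_long_state is sized; it matches the logs.
constexpr std::uint64_t long_state_bytes(std::uint32_t blocks, std::uint32_t threads)
{
    return static_cast<std::uint64_t>(blocks) * threads * kScratchpadBytes;
}

// What the same size looks like after passing through a 32-bit variable:
// everything above bit 31 is discarded.
constexpr std::uint32_t truncated_to_32_bits(std::uint64_t bytes)
{
    return static_cast<std::uint32_t>(bytes);
}
```

If this reading is right, it would also fit the report that the miner starts while another miner is using memory: with less free memory reported, ccminer picked 32 threads per block instead of 64, and the resulting size still fits in 32 bits.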

rumatoest commented 7 years ago

BTW, I can see two GPUs with the same index #1 here. The miner detects my 1060 at index 1, but that is wrong, because according to nvidia-smi the 1060 has index #0. At index #0 the miner detects a 1070, which is #1 according to nvidia-smi. My guess is that some metadata gets mixed up between GPUs #0 and #1. I will try swapping my first and last GPU, so that the 1060 is at index #3, and check again.

rumatoest commented 7 years ago

I swapped the GPUs. Nothing changed except the GPU indexes. This is when it works, with the Ethereum miner running:

[2017-09-13 13:35:23] GPU #3: 5980094464 bytes free, 8508145664 bytes total
[2017-09-13 13:35:23] GPU #1: 5980094464 bytes free, 8508145664 bytes total
[2017-09-13 13:35:23] GPU #2: 5980094464 bytes free, 8508145664 bytes total
[2017-09-13 13:35:23] GPU #3: 5980094464 bytes free, 8508145664 bytes total

This is when it does not start:

[2017-09-13 13:36:18] GPU #3: 8394964992 bytes free, 8508145664 bytes total
[2017-09-13 13:36:18] GPU #1: 8394964992 bytes free, 8508145664 bytes total
[2017-09-13 13:36:18] GPU #2: 8394964992 bytes free, 8508145664 bytes total
[2017-09-13 13:36:18] GPU #3: 8394964992 bytes free, 8508145664 bytes total

As you can see, it now shows two #3 devices, and every device has the same amount of memory. But it should show devices 0, 1 and 2 as 1070s with 8G, and device 3 as the 1060 with 6G.
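
To put the totals from the memdebug logs into the unit nvidia-smi uses, the byte counts can be converted to MiB. A minimal sketch in plain C++ (the helper name `bytes_to_mib` is made up for illustration and is not part of ccminer):

```cpp
#include <cstdint>

// Hypothetical helper, not part of ccminer: convert a byte count to
// whole MiB (1 MiB = 1024 * 1024 bytes), rounding down.
constexpr std::uint64_t bytes_to_mib(std::uint64_t bytes)
{
    return bytes / (1024ULL * 1024ULL);
}

// "bytes total" values copied from the two sets of logs in this thread.
constexpr std::uint64_t kTotalBeforeSwap = 8506769408ULL;
constexpr std::uint64_t kTotalAfterSwap  = 8508145664ULL;
```

The two totals come out as 8112 MiB and 8114 MiB, the same figures nvidia-smi prints for GTX 1070 cards further down the thread. That supports the reading that every line reports the memory of a single 1070 rather than of the device named in the label, though the logs alone do not prove it.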

KlausT commented 7 years ago

Looks like we have at least two different bugs.

KlausT commented 7 years ago

ok, two bugs have been fixed.

rumatoest commented 7 years ago

Now it fails to compile from master :(

Makefile:1152: recipe for target 'cryptonight/cuda_cryptonight_extra.o' failed
make[2]: *** [cryptonight/cuda_cryptonight_extra.o] Error 137
make[2]: Leaving directory '/home/miner/miners/ccm-dev'
Makefile:726: recipe for target 'all-recursive' failed
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory '/home/miner/miners/ccm-dev'
Makefile:396: recipe for target 'all' failed
make: *** [all] Error 2

I cannot figure out what "Error 137" stands for.
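
For reference: the number make prints is the exit status of the failed recipe, and shells conventionally report 128 + N when a command is killed by signal N. Under that convention 137 = 128 + 9, and signal 9 is SIGKILL on Linux. A compiler killed this way is often, but not always, a victim of the kernel's out-of-memory killer; nothing in the output above confirms that. A small C++ sketch of the decoding (the function name is illustrative only):

```cpp
// Illustrative helper: if `status` follows the shell convention
// "128 + signal number", return the signal number; otherwise return -1.
constexpr int signal_from_exit_status(int status)
{
    return (status > 128 && status < 256) ? status - 128 : -1;
}
```

Applied to the log above, 137 decodes to signal 9, while the outer "Error 1" and "Error 2" are ordinary exit codes from the parent make processes.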

paulomalvar commented 7 years ago

It's compiling just fine for me. After pulling latest changes to master, I did:

make clean ; ./autogen.sh ; ./configure ; make

Ah, and it now runs fine under Linux (Ubuntu Mate 16.04).

rumatoest commented 7 years ago

Maybe I have an issue with my environment. It is working now. The miner starts without issues. GPU IDs are unique BUT still shifted:

[2017-09-13 21:45:42] GPU #3: GeForce GTX 1070 (15 SMX), using 60 blocks of 32 threads
[2017-09-13 21:45:42] GPU #2: GeForce GTX 1070 (15 SMX), using 60 blocks of 32 threads
[2017-09-13 21:45:42] GPU #1: GeForce GTX 1060 6GB (10 SMX), using 40 blocks of 32 threads
[2017-09-13 21:45:42] GPU #0: GeForce GTX 1070 (15 SMX), using 60 blocks of 32 threads

nvidia-smi output:

|   0  GeForce GTX 106...  On   | 00000000:01:00.0 Off |                  N/A |
| 73%   50C    P0    30W /  75W |      8MiB /  6072MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 1070    On   | 00000000:02:00.0 Off |                  N/A |
| 74%   52C    P0    42W / 105W |      9MiB /  8112MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 1070    On   | 00000000:03:00.0 Off |                  N/A |
| 74%   53C    P0    44W / 105W |      9MiB /  8114MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX 1070    On   | 00000000:04:00.0 Off |                  N/A |
| 66%   49C    P0    46W / 105W |      9MiB /  8114MiB |      0%      Default |

So it moves the 1060 to index #1 and shifts the last GPU to #0.
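
One possible explanation for the shifted indexes: by default the CUDA runtime orders devices fastest-first and does not promise PCI bus order, whereas nvidia-smi lists devices by bus ID. Setting the environment variable CUDA_DEVICE_ORDER=PCI_BUS_ID before starting a CUDA program makes the runtime use bus order as well. Whether that is what happens on this rig is an assumption. The plain C++ sketch below uses a made-up `Device` struct to show how a list in CUDA order could be mapped to nvidia-smi indexes by sorting on the bus number.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Made-up record for illustration; real code would fill this in from
// the CUDA runtime's device properties.
struct Device {
    int cuda_index;  // index as the miner sees it (0 .. n-1)
    int pci_bus;     // PCI bus number, e.g. 1 for 00000000:01:00.0
};

// Returns a vector `smi` where smi[cuda_index] is the position the device
// has in a list sorted by PCI bus number (the order nvidia-smi uses).
std::vector<int> cuda_to_smi_index(const std::vector<Device>& devices)
{
    std::vector<Device> sorted = devices;
    std::sort(sorted.begin(), sorted.end(),
              [](const Device& a, const Device& b) { return a.pci_bus < b.pci_bus; });

    std::vector<int> smi(devices.size(), -1);
    for (std::size_t pos = 0; pos < sorted.size(); ++pos) {
        smi[sorted[pos].cuda_index] = static_cast<int>(pos);
    }
    return smi;
}
```

If the rig is laid out as described above (the 1060 on bus 01 seen as CUDA device 1, and the 1070 on bus 04 seen as CUDA device 0), the mapping would come out as CUDA 0 → smi 3, 1 → 0, 2 → 1, 3 → 2. Which 1070 actually ends up as CUDA device 0 is not visible in the logs.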

Himan2001 commented 7 years ago

I can confirm this problem at any "Intensity" on a Zotac 1060 3GB with CUDA 9 RC3 on Linux (Ubuntu). @KlausT: Could you look over the code again and possibly check it under Linux or in a VM? I also ran a cross-test comparing Monero with SUMOKOIN; on Monero I don't get this crash. Maybe the CUDA code itself has a problem correctly determining the available RAM on the cards?

KlausT commented 7 years ago

About 9 days ago I tried to fix this. How old is your version?

Himan2001 commented 7 years ago

I compiled it again yesterday from the latest code with CUDA 9 on Ubuntu 14.04. I was wondering why it runs with XMR and fails with SUMO...

KlausT commented 7 years ago

What do you mean by "fails"? Did you try to solo mine? It's not supported.

Himan2001 commented 7 years ago

It failed like this on the 3GB cards:

GPU 1: an illegal memory access was encountered
cryptonight/cuda_cryptonight_core.cu line 246

Later I checked with ccminer 2.2.1; there it works without a problem on SUMO.

KlausT commented 7 years ago

version 2.05 ?

Himan2001 commented 7 years ago

yes latest version.

Himan2001 commented 7 years ago

The problem comes up only with SUMO on a yiimp pool. I only briefly checked XMR, on XMR-nanopool. Right after getting the share and difficulty from the pool, the miner crashes with the above error message, no matter which intensity I used for testing :(

KlausT commented 7 years ago

Would you please try the memdebug branch and tell me what values for the memory you see? https://github.com/KlausT/ccminer-cryptonight/tree/memdebug

Himan2001 commented 7 years ago

So... here we go:

ccminer-cryptonight 2.05 (64 bit) for nVidia GPUs by tsiv and KlausT
Built with GCC 4.8 using the Nvidia CUDA Toolkit 9.0

[2017-09-22 20:05:22] Keepalive actived
[2017-09-22 20:05:22] GPU #0: 3069378560 bytes free, 3160276992 bytes total
[2017-09-22 20:05:22] GPU #1: 3069378560 bytes free, 3160276992 bytes total
[2017-09-22 20:05:22] GPU #2: 3069378560 bytes free, 3160276992 bytes total
[2017-09-22 20:05:22] GPU #3: 3069378560 bytes free, 3160276992 bytes total
[2017-09-22 20:05:23] Starting Stratum on stratum+tcp://192.168.5.230:4445
[2017-09-22 20:05:23] 4 miner threads started
[2017-09-22 20:05:23] GPU #1: GeForce GTX 1060 3GB (9 SMX), using 36 blocks of 32 threads
[2017-09-22 20:05:23] GPU #0: GeForce GTX 1060 3GB (9 SMX), using 36 blocks of 32 threads
[2017-09-22 20:05:23] GPU #2: GeForce GTX 1060 3GB (9 SMX), using 36 blocks of 32 threads
[2017-09-22 20:05:23] GPU #3: GeForce GTX 1060 3GB (9 SMX), using 36 blocks of 32 threads
[2017-09-22 20:05:23] Pool set diff to 20000
[2017-09-22 20:05:23] Stratum detected new block
[2017-09-22 20:05:24] GPU #1: allocating 2415919104 bytes (d_long_state)
[2017-09-22 20:05:24] GPU #0: allocating 2415919104 bytes (d_long_state)
[2017-09-22 20:05:24] GPU #2: allocating 2415919104 bytes (d_long_state)
[2017-09-22 20:05:24] GPU #3: allocating 2415919104 bytes (d_long_state)

GPU 1: an illegal memory access was encountered
cryptonight/cuda_cryptonight_extra.cu line 215

GPU 3: an illegal memory access was encountered
cryptonight/cuda_cryptonight_core.cu line 253

and a dmesg:

[283191.912617] NVRM: Xid (PCI:0000:03:00): 13, Graphics SM Warp Exception on (GPC 0, TPC 0): Out Of Range Address

[283191.912627] NVRM: Xid (PCI:0000:03:00): 13, Graphics SM Global Exception on (GPC 0, TPC 0): Physical Multiple Warp Errors

[283191.912633] NVRM: Xid (PCI:0000:03:00): 13, Graphics Exception: ESR 0x504648=0x102000e 0x504650=0x4 0x504644=0xd3eff2 0x50464c=0x17f

[283191.912704] NVRM: Xid (PCI:0000:03:00): 13, Graphics SM Warp Exception on (GPC 0, TPC 1): Out Of Range Address

[283191.912711] NVRM: Xid (PCI:0000:03:00): 13, Graphics Exception: ESR 0x504e48=0x102000e 0x504e50=0x20 0x504e44=0xd3eff2 0x504e4c=0x17f

[283191.912781] NVRM: Xid (PCI:0000:03:00): 13, Graphics SM Warp Exception on (GPC 0, TPC 2): Out Of Range Address

[283191.912787] NVRM: Xid (PCI:0000:03:00): 13, Graphics SM Global Exception on (GPC 0, TPC 2): Physical Multiple Warp Errors

[283191.912793] NVRM: Xid (PCI:0000:03:00): 13, Graphics Exception: ESR 0x505648=0x102000e 0x505650=0x24 0x505644=0xd3eff2 0x50564c=0x17f

Every time I start the miner again, a different PCI ID shows up, so each time it is a different card with exactly the same error. And it's not an overclocking problem: these are 4 exactly identical GTX 1060 3GB cards.

mcrosson commented 7 years ago

I'm seeing the same error on a GT 1030.

I tried building the memdebug branch but on CUDA 9.0 Ubuntu 16.04 it's throwing an error. I've attached logs/output to this comment.

Any help would be greatly appreciated.

Please note this worked with CUDA 8.0 Ubuntu 16.04 about 14 days ago. I ran into trouble only after upgrading to CUDA 9.0 on Ubuntu 16.04.

memdebug_compile_error.txt master_error.txt

KlausT commented 7 years ago

CUDA 9 doesn't support sm_20 anymore. That's the reason for the compile error. I have updated the memdebug branch now. With CUDA 9 I get an illegal memory access, while CUDA 8 works without problem. I don't know why yet.

mcrosson commented 7 years ago

Thank you for such a quick turn around. Is there anything you need from me that might be helpful?

mcrosson commented 7 years ago

Quick update: I was able to build the latest memdebug (revision a8141db7db1f8769e322a66a418d91b9d7ffd162) and received the following output. What's interesting is that I have 2 GPUs in this rig: a GTX 1060 6GB and a GT 1030 2GB. I'm currently running ccminer with just the GT 1030 2GB (-d 1 CLI option), but the output seems to indicate it's seeing both cards as GPU 1.


    *** ccminer-cryptonight 2.05 (64 bit) for nVidia GPUs by tsiv and KlausT
    *** Built with GCC 6.3 using the Nvidia CUDA Toolkit 9.0

 tsiv's BTC donation address:   1JHDKp59t1RhHFXsTw2UQpR3F9BBz3R3cs
 KlausT's BTC donation address: 1QHH2dibyYL5iyMDk3UN4PVvFVtrWD8QKp
 for more donation addresses please read the README.txt
-----------------------------------------------------------------
[2017-09-28 13:57:41] GPU #1: 5686165504 bytes free, 6367739904 bytes total
[2017-09-28 13:57:41] GPU #1: 2054225920 bytes free, 2097217536 bytes total
[2017-09-28 13:57:41] 1 miner threads started
[2017-09-28 13:57:41] Starting Stratum on stratum+tcp://xdn-xmr.pool.minergate.com:45790
[2017-09-28 13:57:41] GPU #1: GeForce GT 1030 (3 SMX), using 12 blocks of 64 threads
[2017-09-28 13:57:41] Pool set diff to 1063
[2017-09-28 13:57:41] Stratum detected new block
[2017-09-28 13:57:41] GPU #1: allocating 1610612736 bytes (d_long_state)

GPU 1: an illegal memory access was encountered
cryptonight/cuda_cryptonight_extra.cu line 219

ktamas77 commented 7 years ago

I have the same issue. Geforce GTX 1070 8GB, Ubuntu 16.04, CUDA 9.0, Monero mining

ccminer -q -o stratum+tcp://xmr-usa.dwarfpool.com:8050 -u <wallet> -p x
    *** ccminer-cryptonight 2.05 (64 bit) for nVidia GPUs by tsiv and KlausT 
    *** Built with GCC 5.4 using the Nvidia CUDA Toolkit 9.0

 tsiv's BTC donation address:   1JHDKp59t1RhHFXsTw2UQpR3F9BBz3R3cs
 KlausT's BTC donation address: 1QHH2dibyYL5iyMDk3UN4PVvFVtrWD8QKp
 for more donation addresses please read the README.txt
-----------------------------------------------------------------
[2017-09-30 18:21:12] 1 miner threads started
[2017-09-30 18:21:12] Starting Stratum on stratum+tcp://xmr-usa.dwarfpool.com:8050
[2017-09-30 18:21:12] GPU #0: GeForce GTX 1070 (15 SMX), using 60 blocks of 32 threads
[2017-09-30 18:21:12] Pool set diff to 50000.2
[2017-09-30 18:21:12] Stratum detected new block

GPU 0: an illegal memory access was encountered
cryptonight/cuda_cryptonight_extra.cu line 217

mcrosson commented 7 years ago

I'm still receiving this error on the latest (a8141db7db1f8769e322a66a418d91b9d7ffd162) memdebug branch (after CUDA 9 merge). I managed to patch the sources for additional debug output and got the following. The relevant code that isolates the "exact" crash is below the log output. Does this help you narrow it down any? Is there more detail I can provide?

A quick google search suggested it's in the kernel but I have no idea what that means or where to start poking around next in the code base.

If you've got a 'max debug output' branch (or code) with more logging I can try running that to shed insight as well.

Log

    *** ccminer-cryptonight 2.05 (64 bit) for nVidia GPUs by tsiv and KlausT 
    *** Built with GCC 5.4 using the Nvidia CUDA Toolkit 9.0

 tsiv's BTC donation address:   1JHDKp59t1RhHFXsTw2UQpR3F9BBz3R3cs
 KlausT's BTC donation address: 1QHH2dibyYL5iyMDk3UN4PVvFVtrWD8QKp
 for more donation addresses please read the README.txt
-----------------------------------------------------------------
[2017-10-18 03:54:47] cryptonight.cu cuda_num_devices()
[2017-10-18 03:54:47] cryptonight.cu cuda_deviceinfo(int GPU_N)
[2017-10-18 03:54:47] cryptonight.cu cuda_set_device_config(int GPU_N)
[2017-10-18 03:54:47] GPU #1: 5686165504 bytes free, 6367739904 bytes total
[2017-10-18 03:54:47] GPU #2: 2054225920 bytes free, 2097217536 bytes total
[2017-10-18 03:54:48] GPU #3: 2052849664 bytes free, 2095841280 bytes total
[2017-10-18 03:54:48] GPU #3: 2054225920 bytes free, 2097217536 bytes total
[2017-10-18 03:54:48] GPU #4: 2054225920 bytes free, 2097217536 bytes total
[2017-10-18 03:54:48] 3 miner threads started
[2017-10-18 03:54:48] Starting Stratum on stratum+tcp://xdn-xmr.pool.minergate.com:45790
[2017-10-18 03:54:48] GPU #1: GeForce GT 1030 (3 SMX), using 12 blocks of 64 threads
[2017-10-18 03:54:48] GPU #1: GeForce GT 1030, continue with old work
[2017-10-18 03:54:48] GPU #1: GeForce GT 1030, startnonce $00000000, endnonce $00000200
[2017-10-18 03:54:48] cryptonight.cu scanhash_cryptonight(lots of args)
[2017-10-18 03:54:48] GPU #3: GeForce GT 1030 (3 SMX), using 12 blocks of 64 threads
[2017-10-18 03:54:48] GPU #3: GeForce GT 1030, continue with old work
[2017-10-18 03:54:48] GPU #3: GeForce GT 1030, startnonce $00000000, endnonce $00000200
[2017-10-18 03:54:48] cryptonight.cu scanhash_cryptonight(lots of args)
[2017-10-18 03:54:48] GPU #2: GeForce GT 1030 (3 SMX), using 12 blocks of 64 threads
[2017-10-18 03:54:48] GPU #2: GeForce GT 1030, continue with old work
[2017-10-18 03:54:48] GPU #2: GeForce GT 1030, startnonce $00000000, endnonce $00000200
[2017-10-18 03:54:48] cryptonight.cu scanhash_cryptonight(lots of args)
[2017-10-18 03:54:48] Auth id: db74cf4b-347e-4f37-be6f-e753fc5f359e
[2017-10-18 03:54:48] Pool set diff to 1063
[2017-10-18 03:54:48] Stratum detected new block
[2017-10-18 03:54:49] GPU #1: allocating 1610612736 bytes (d_long_state)
[2017-10-18 03:54:49] cuda_cryptonight_extra cryptonight_extra_cpu_init(args)
[2017-10-18 03:54:49] cuda_cryptonight_extra.cu cryptonight_extra_cpu_setData(args)
[2017-10-18 03:54:49] cuda_cryptonight_extra cryptonight_extra_cpu_prepare(args)
[2017-10-18 03:54:49] GPU #3: allocating 1610612736 bytes (d_long_state)
[2017-10-18 03:54:49] cuda_cryptonight_extra cryptonight_extra_cpu_init(args)
[2017-10-18 03:54:49] cuda_cryptonight_extra.cu cryptonight_extra_cpu_setData(args)
[2017-10-18 03:54:49] cuda_cryptonight_extra cryptonight_extra_cpu_prepare(args)
[2017-10-18 03:54:49] GPU #2: allocating 1610612736 bytes (d_long_state)
[2017-10-18 03:54:49] cuda_cryptonight_extra cryptonight_extra_cpu_init(args)
[2017-10-18 03:54:49] cuda_cryptonight_extra.cu cryptonight_extra_cpu_setData(args)
[2017-10-18 03:54:49] cuda_cryptonight_extra cryptonight_extra_cpu_prepare(args)
[2017-10-18 03:54:54] GPU #2: calling cryptonight_extra_cpu_final
[2017-10-18 03:54:54] cuda_cryptonight_extra cryptonight_extra_cpu_final(args)
[2017-10-18 03:54:54] cuda_cryptonight_extra cryptonight_extra_cpu_final(args) - 1
[2017-10-18 03:54:54] cuda_cryptonight_extra cryptonight_extra_cpu_final(args) - 2
[2017-10-18 03:54:54] cuda_cryptonight_extra cryptonight_extra_cpu_final(args) - 3
[2017-10-18 03:54:54] cuda_cryptonight_extra cryptonight_extra_cpu_final(args) - 4
[2017-10-18 03:54:54] cuda_cryptonight_extra cryptonight_extra_cpu_final(args) - 5
[2017-10-18 03:54:54] cuda_cryptonight_extra cryptonight_extra_cpu_final(args) ---- thr_id 1
[2017-10-18 03:54:54] cuda_cryptonight_extra cryptonight_extra_cpu_final(args) ---- threads 768
[2017-10-18 03:54:54] cuda_cryptonight_extra cryptonight_extra_cpu_final(args) - 6
[2017-10-18 03:54:54] cuda_cryptonight_extra cryptonight_extra_cpu_final(args) - 7

GPU 2: an illegal memory access was encountered
cryptonight/cuda_cryptonight_extra.cu line 245

Crashing Code/Method

__host__ void cryptonight_extra_cpu_final(int thr_id, int threads, uint32_t startNonce, uint32_t *resnonce, uint32_t *d_ctx_state)
{
    applog(LOG_DEBUG, "cuda_cryptonight_extra cryptonight_extra_cpu_final(args)");
    int threadsperblock = 128;

    applog(LOG_DEBUG, "cuda_cryptonight_extra cryptonight_extra_cpu_final(args) - 1");

    dim3 grid((threads + threadsperblock - 1) / threadsperblock);

    applog(LOG_DEBUG, "cuda_cryptonight_extra cryptonight_extra_cpu_final(args) - 2");

    dim3 block(threadsperblock);

    applog(LOG_DEBUG, "cuda_cryptonight_extra cryptonight_extra_cpu_final(args) - 3");

    exit_if_cudaerror(thr_id, __FILE__, __LINE__);

    applog(LOG_DEBUG, "cuda_cryptonight_extra cryptonight_extra_cpu_final(args) - 4");

    cudaDeviceSynchronize();

    applog(LOG_DEBUG, "cuda_cryptonight_extra cryptonight_extra_cpu_final(args) - 5");
    applog(LOG_DEBUG, "cuda_cryptonight_extra cryptonight_extra_cpu_final(args) ---- thr_id %d", thr_id);
    applog(LOG_DEBUG, "cuda_cryptonight_extra cryptonight_extra_cpu_final(args) ---- threads %d", threads);

    cryptonight_extra_gpu_final<<<grid, block>>>(threads, startNonce, d_target[thr_id], d_resultNonce[thr_id], d_ctx_state);

    applog(LOG_DEBUG, "cuda_cryptonight_extra cryptonight_extra_cpu_final(args) - 6");

    cudaDeviceSynchronize();

    applog(LOG_DEBUG, "cuda_cryptonight_extra cryptonight_extra_cpu_final(args) - 7");

    exit_if_cudaerror(thr_id, __FILE__, __LINE__);

    applog(LOG_DEBUG, "cuda_cryptonight_extra cryptonight_extra_cpu_final(args) - 8");

    cudaMemcpy(resnonce, d_resultNonce[thr_id], 2 * sizeof(uint32_t), cudaMemcpyDeviceToHost);

    applog(LOG_DEBUG, "cuda_cryptonight_extra cryptonight_extra_cpu_final(args) - 9");

    exit_if_cudaerror(thr_id, __FILE__, __LINE__);

    applog(LOG_DEBUG, "cuda_cryptonight_extra cryptonight_extra_cpu_final(args) - END");
}
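
For readers unfamiliar with the launch arithmetic in this function: the grid size is a ceiling division of the thread count by the block size. With the values from the log (threads 768, threadsperblock 128) that gives 6 blocks. A plain C++ sketch of the same calculation, with an invented helper name:

```cpp
// Ceiling division: the number of blocks needed to cover `threads`
// work items when each block holds `threads_per_block` of them.
constexpr int blocks_needed(int threads, int threads_per_block)
{
    return (threads + threads_per_block - 1) / threads_per_block;
}
```

When the thread count is not a multiple of the block size, the last block has surplus threads, so a kernel launched this way normally needs an index check against the thread count. Note also that the log stops after marker 7, so the error is reported by the exit_if_cudaerror call that follows the second cudaDeviceSynchronize. Kernel launches are asynchronous, so a fault inside a kernel is expected to surface at a later synchronization point rather than at the launch itself.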

KlausT commented 7 years ago

This doesn't look right:

[2017-10-18 03:54:47] GPU #1: 5686165504 bytes free, 6367739904 bytes total
[2017-10-18 03:54:47] GPU #2: 2054225920 bytes free, 2097217536 bytes total
[2017-10-18 03:54:48] GPU #3: 2052849664 bytes free, 2095841280 bytes total
[2017-10-18 03:54:48] GPU #3: 2054225920 bytes free, 2097217536 bytes total
[2017-10-18 03:54:48] GPU #4: 2054225920 bytes free, 2097217536 bytes total

How many cards do you use? Are you using the -d option?

mcrosson commented 7 years ago

I am using the -d argument (full CLI options below as well as nvidia-smi output). These graphics cards do work with your standard ccminer sources, if that helps.

CLI

./ccminer-cryptonight/ccminer -o stratum+tcp://xdn-xmr.pool.minergate.com:45790 \
    -u user@site.com \
    -p x --color -d 1,2,3 -D

nvidia-smi

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.81                 Driver Version: 384.81                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 1030     Off  | 00000000:01:00.0 Off |                  N/A |
| 63%   65C    P0   ERR! /  30W |     45MiB /  2000MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GT 1030     Off  | 00000000:03:00.0 Off |                  N/A |
| 54%   69C    P0   ERR! /  30W |     45MiB /  1998MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GT 1030     Off  | 00000000:04:00.0 Off |                  N/A |
| 61%   64C    P0   ERR! /  30W |     45MiB /  2000MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX 960     Off  | 00000000:07:00.0 Off |                  N/A |
| 27%   73C    P2   102W / 128W |    555MiB /  1996MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+
|   4  GeForce GTX 106...  Off  | 00000000:08:00.0 Off |                  N/A |
| 33%   71C    P2   145W / 150W |    652MiB /  6072MiB |     99%      Default |
+-------------------------------+----------------------+----------------------+
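
The debug log above is consistent with the miner thread index and the CUDA device ID being two different numbers when -d is used: it shows "thr_id 1" for the thread labelled GPU #2. A minimal sketch of such a mapping in plain C++ follows. The parser and its name are hypothetical and are not taken from ccminer's sources.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical parser for a "-d" style argument such as "1,2,3".
// Returns the device IDs in order; entry i is the device for miner thread i.
std::vector<int> parse_device_list(const std::string& arg)
{
    std::vector<int> devices;
    std::stringstream stream(arg);
    std::string item;
    while (std::getline(stream, item, ',')) {
        if (!item.empty()) {
            devices.push_back(std::stoi(item));  // throws on non-numeric input
        }
    }
    return devices;
}
```

With "1,2,3" this gives thread 0 → device 1, thread 1 → device 2, thread 2 → device 3. Any per-device array then has to be indexed consistently by one of the two numbers; mixing them up could be one way to end up with duplicate labels such as the two GPU #3 lines in the memdebug output.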

KlausT commented 7 years ago

I have found a bug. I hope that was the reason for the crash.

Edit: No, it's crashing with CUDA 9. CUDA 8 works fine. Weird.

mcrosson commented 7 years ago

I was just about to leave a comment to that effect. I've attached my latest run without the -d option:

    *** ccminer-cryptonight 2.05 (64 bit) for nVidia GPUs by tsiv and KlausT 
    *** Built with GCC 5.4 using the Nvidia CUDA Toolkit 9.0

 tsiv's BTC donation address:   1JHDKp59t1RhHFXsTw2UQpR3F9BBz3R3cs
 KlausT's BTC donation address: 1QHH2dibyYL5iyMDk3UN4PVvFVtrWD8QKp
 for more donation addresses please read the README.txt
-----------------------------------------------------------------
[2017-10-19 22:37:59] 5 CUDA devices detected
[2017-10-19 22:38:00] Starting Stratum on stratum+tcp://xdn-xmr.pool.minergate.com:45790
[2017-10-19 22:38:00] 5 miner threads started
[2017-10-19 22:38:00] GPU #3: GeForce GT 1030 (3 SMX), using 12 blocks of 64 threads
[2017-10-19 22:38:00] GPU #2: GeForce GT 1030 (3 SMX), using 12 blocks of 64 threads
[2017-10-19 22:38:00] GPU #4: GeForce GTX 960 (8 SMX), using 32 blocks of 16 threads
[2017-10-19 22:38:00] GPU #1: GeForce GT 1030 (3 SMX), using 12 blocks of 64 threads
[2017-10-19 22:38:00] GPU #0: GeForce GTX 1060 6GB (10 SMX), using 40 blocks of 32 threads
[2017-10-19 22:38:00] Pool set diff to 1063
[2017-10-19 22:38:00] Stratum detected new block
[2017-10-19 22:38:03] GPU #4: GeForce GTX 960, 168.30 H/s (168.22 H/s avg)

GPU 0: an illegal memory access was encountered
cryptonight/cuda_cryptonight_extra.cu line 217

KlausT commented 7 years ago

I think this change will fix it: https://github.com/KlausT/ccminer-cryptonight/commit/6863dbef86d5b88e0ac9bc333cbaccff6d347e6d But I have no idea why.

mcrosson commented 7 years ago

That did the trick. I just built the latest memdebug branch with the referenced commit and things are working 100% :grin:

Thank you for all your hard work.

Luxian commented 7 years ago

I can confirm that 6863dbef86d5b88e0ac9bc333cbaccff6d347e6d fixed the issue for me as well.

GTX 1050
CUDA 9
xubuntu 16.04

Thanks!

KlausT / ccminer-cryptonight

Illegal memory access on GTX 1060 6g #50