fireice-uk / xmr-stak-nvidia

Monero NVIDIA miner
GNU General Public License v3.0

GPU 0: unspecified launch failure (GTX 960, Windows 10) #120

Open KatsuoGuro opened 6 years ago

KatsuoGuro commented 6 years ago

Fairly new to mining here. My config has been edited to the proper threading and blocking, etc. and I have entered my pool and wallet addresses. I get the following error message every time I run the .exe:

C:/Users/MAIN/Documents/Visual Studio 2015/Projects/xmr-stak-nvidia/xmr-stak-nvidia/nvcc_code/cuda_extra.cu line 246

The listed folder does not exist on my computer from what I have searched. I have Visual Studio 2017 installed. Do I need VS 2015? If so where do I find it? Any help is appreciated.

psychocrypt commented 6 years ago

Please post your config. Did you use the config suggested by the miner? If not, please start with a new config and use the one the miner suggests.

rmarder commented 6 years ago

I confirm the same issue here. I also have a GTX 960 on Win10 x64

The recommended config the program is telling me to use is:

**** Copy&Paste ****

    "gpu_threads_conf" : [
        { "index" : 0,
          "threads" : 32, "blocks" : 24,
          "bfactor" : 6, "bsleep" : 25,
          "affine_to_cpu" : false,
        },
    ],

Attempting to use this config fails with the same error for me as reported by the OP of this bug.

This is using the latest published release of xmr-stak-nvidia-notls

I am able to use xmr-stak-cpu on this system without issue, and I am able to use a different gpu miner, the very old ccminer 2014 release, on this same system without any problem.

psychocrypt commented 6 years ago

Please reduce the threads or blocks.

rmarder commented 6 years ago

I reduced the threads from 32 to 28 and that prevented the program from crashing; however, 28 threads yields a very poor hashrate for me.

I've been experimenting with the threads, and it seems that 12 threads is the sweet spot for a GTX 960.

In any case, the default recommended setting of threads 32 is far too high as it causes the miner to crash.

I haven't had a chance to mess with blocks yet.

psychocrypt commented 6 years ago

The reason is that Windows takes too much memory, and a kernel uses an amount of memory that cannot be calculated before the start. I am still working on it, but currently I have no idea how to solve it.
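As a rough illustration of the memory pressure described above, the sketch below estimates how much scratchpad memory a given threads/blocks pair requests. It assumes one hash per GPU thread and a 2 MiB CryptoNight scratchpad per hash. The function name `scratchpad_mib` is hypothetical, and the estimate ignores the extra per-kernel memory mentioned above, so treat it as a lower bound rather than the miner's actual accounting.

```python
# Assumption: each GPU thread works on one hash, and each hash needs a
# 2 MiB scratchpad (the CryptoNight scratchpad size).
SCRATCHPAD_MIB = 2


def scratchpad_mib(threads: int, blocks: int, per_hash_mib: int = SCRATCHPAD_MIB) -> int:
    """Lower-bound estimate of GPU memory (MiB) used for scratchpads."""
    return threads * blocks * per_hash_mib


if __name__ == "__main__":
    # Configs mentioned in this thread for a 2 GiB GTX 960.
    for threads, blocks in [(32, 24), (28, 24)]:
        print(f"{threads}x{blocks}: {scratchpad_mib(threads, blocks)} MiB")
```

Under these assumptions, the suggested 32x24 config already asks for 1536 MiB on a 2048 MiB card before Windows and kernel overhead are counted, which is consistent with the advice to reduce threads or blocks.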

jwy4tt commented 6 years ago

I have the same error running 2 x GTX 1080ti on Win10 x64

Config:

    "gpu_threads_conf" : [
        { "index" : 0,
          "threads" : 64, "blocks" : 45,
          "bfactor" : 6, "bsleep" : 25,
          "affine_to_cpu" : false,
        },
        { "index" : 1,
          "threads" : 64, "blocks" : 45,
          "bfactor" : 6, "bsleep" : 25,
          "affine_to_cpu" : false,
        },
    ],

This works fine on the PC with 2 x GTX 1070s.

Any suggestions on Threads to Blocks configuration?

Thanks in advance

psychocrypt commented 6 years ago

Reduce threads or blocks.


DrMaqueo commented 6 years ago

I tried several configurations. Keeping threads and blocks unchanged but increasing bfactor to 8 both solved the problem and gave a better hashrate.
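For readers wondering why bfactor would matter here: as far as the miner's config comments describe it, bfactor makes the CryptoNight kernel run in smaller pieces. The sketch below assumes the kernel is split into 2^bfactor parts. That split factor is an assumption rather than something confirmed in this thread, and the function name `kernel_parts` is hypothetical.

```python
def kernel_parts(bfactor: int) -> int:
    """Number of pieces the main kernel is split into (assumed 2**bfactor)."""
    if bfactor < 0:
        raise ValueError("bfactor must be non-negative")
    return 1 << bfactor


if __name__ == "__main__":
    for bfactor in (0, 6, 8):
        print(f"bfactor={bfactor}: {kernel_parts(bfactor)} parts")
```

Under that assumption, going from bfactor 6 to 8 makes each individual kernel launch roughly four times shorter, which could explain fewer launch failures if they are caused by launches that run too long.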

xuxuedong commented 6 years ago

I have the same problem. Thanks a lot for your advice.

ilpancrazio commented 6 years ago

Hello, I'd like to find an optimal setting for my GTX 960 2 GB as well. I've gathered some data and found that the most H/s comes from 32 threads x 16 blocks. The block value of 16 is a multiple of 8, which is the number of multiprocessors (SMX) on this model. With blocks held constant at 16, thread values above 32 slightly decrease performance and eventually crash the miner (maybe past 38-40 threads, I guess).

It's possible to play around with those numbers. Say threads x blocks (TxB) is "kind of optimal" at 32x16 = 512, and treat 512 as a "limit": going over it, the miner would crash. The GTX 960 has 2 GPCs, each with 4 multiprocessors and 128 CUDA cores per multiprocessor, so 128x4x2 = 1024 cores. I've also read somewhere that the block value should be a multiple of the SMX count (8), so that (block value) mod 8 == 0. That means you may set blocks = 8, 16, 24, 32 and so on. Finally, I tweaked those values using sets with TxB = 512, that is, 512 divided by any multiple of 8; the results are pretty much the same in terms of hashrate.
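The search described above can be sketched as a small enumeration: keep threads x blocks at a fixed budget and only allow block counts that are multiples of the multiprocessor count. The function name `candidate_configs` and the budget of 512 are taken from this comment for illustration only; the "limit" is one poster's observation on one card, not a rule of the miner.

```python
def candidate_configs(budget: int, sm_count: int) -> list[tuple[int, int]]:
    """Return (threads, blocks) pairs with threads * blocks == budget
    and blocks a multiple of sm_count."""
    pairs = []
    for blocks in range(sm_count, budget + 1, sm_count):
        # Keep only block counts that divide the budget exactly.
        if budget % blocks == 0:
            pairs.append((budget // blocks, blocks))
    return pairs


if __name__ == "__main__":
    # GTX 960: 8 multiprocessors, observed budget of 512 (32 x 16).
    for threads, blocks in candidate_configs(512, 8):
        print(f'"threads" : {threads}, "blocks" : {blocks}')
```

Note that a block count such as 24 is a multiple of 8 but does not divide 512, so it is skipped here; with such values the product can only approximate the budget.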

Another weird thing: when I set bfactor to 6, it seems to improve performance a little bit. I'm running the GTX 960 free of any graphics tasks (standing alone, with the UI driven by the Intel HD Graphics iGPU instead), so bfactor and bsleep are both 0, which is why this bfactor = 6 result sounds pretty weird to me.

I hope this helps other GTX 960 owners, and the miner devs too, find some more useful information.

By the way, how should I handle sync_mode? It is 3 by default, but what about the other values (0, 1, 2)? Forgive me, I'm generally not used to this stuff, so bear with me. Thank you.

nelsonyepez88 commented 6 years ago

So what settings did you stick with, and what OC settings did you use? I'm using 32x16 with OC set at 1500 MHz core and 3004 MHz mem. I'm getting 267 H/s.

ilpancrazio commented 6 years ago

@nelsonyepez88 I have an MSI GeForce GTX 960 Gaming 2G. The OC is the factory setting, and according to the MSI Afterburner monitor I have 1392 MHz core and 3004 MHz memory clock, which looks pretty balanced IMO. Here is my config; I'm using the latest release of xmr-stak unified:

    "gpu_threads_conf" : [
        // gpu: GeForce GTX 960 architecture: 52
        // memory: 1842/2048 MiB
        // smx: 8
        // block 16 setting (cryptonight-lite) = Tds== 72
        { "index" : 0,
          "threads" : 56, "blocks" : 32,
          "bfactor" : 8, "bsleep" : 0,
          "affine_to_cpu" : 0, "sync_mode" : 3,
        },
    ],

A couple of posts above, it was advised to set "bfactor" to 8 to solve a memory usage issue on a GTX 1080 Ti, so I followed that line and managed to break the "limit" of 32 x 16. It now runs at a steady 591-592 H/s with cryptonight-lite. For regular cryptonight, things are more complicated. Even with "bfactor" = 8 I can't raise the hashrate by much, but there is a small improvement:

    "gpu_threads_conf" : [
        // gpu: GeForce GTX 960 architecture: 52
        // memory: 1842/2048 MiB
        // smx: 8
        // --try also "threads" : 56, "blocks" : 16,
        { "index" : 0,
          "threads" : 22, "blocks" : 32,
          "bfactor" : 8, "bsleep" : 0,
          "affine_to_cpu" : 0, "sync_mode" : 3,
        },
    ],

Finally, the GTX 960 is kept off graphics duty and stands alone; the display is switched to the integrated GPU, which does the trick. Hopefully this is of some help.