xmrig / xmrig-nvidia

Monero (XMR) NVIDIA miner
GNU General Public License v3.0

CUDA: unknown error cuda_get_deviceinfo on line 535 #245

Open berezinevgeniy opened 5 years ago

berezinevgeniy commented 5 years ago

I get an error message on start with GeForce 610-620 and 720 cards on xmrig-nvidia 2.13+ with drivers 376.71-391.35 (the latest available for me):

GPU 0: unknown error
cuda_get_deviceinfo line 535
[2019-03-09 00:09:29] Setup failed for GPU 0. Exitting.

The xmrig CUDA 9 build has the same problem. 2.8.3+ works fine.

xmrig commented 5 years ago

You should use the CUDA 8.0 version; the Fermi architecture (compute capability 2.1) is not supported by CUDA 9.0. Thank you.
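The rule in this reply can be written down as a tiny check. This is only a sketch: the function name `needs_cuda8_build` is hypothetical, and the integer encoding (21 for compute capability 2.1) is assumed to match the `arch:21` field that the miner prints in its GPU summary line.

```python
def needs_cuda8_build(arch):
    """Return True when the GPU is Fermi and therefore needs a CUDA 8.0 build.

    `arch` is the compute capability written as an integer, e.g. 21 for 2.1
    (assumed to match the miner's `arch:` field). Fermi covers 2.0 and 2.1;
    CUDA 9.0 dropped support for it.
    """
    return 20 <= arch < 30


if __name__ == "__main__":
    for arch in (21, 30, 61):
        build = "CUDA 8.0" if needs_cuda8_build(arch) else "CUDA 9.0 or newer is possible"
        print(f"arch {arch}: {build}")
```

Note that this only selects the build; as the rest of the thread shows, the CUDA 8.0 build can still fail on Fermi for other reasons.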

berezinevgeniy commented 5 years ago

I have the same error on CUDA 8 :-(

GPU 0: unknown error
cuda_get_deviceinfo line 535
[2019-03-09 01:22:30] Setup failed for GPU 0. Exitting.

With 2.8.3 on CUDA 8 I have no problems.

berezinevgeniy commented 5 years ago

The GF 740M works fine with 2.14. Only the GF 6(7,8)10-20M series fails with the new xmrig. How can I find the error code from CUDA? I cannot compile a new binary with this part of the code (cuda_get_deviceinfo) commented out. Can you help me?

Lonelysoul-HayashiNoMirai commented 5 years ago

I have the same problem too :((. Plz help.....

xmrig commented 5 years ago

@berezinevgeniy Can you provide the config too? Currently I have no idea what causes this issue: the call to cudaGetDeviceProperties simply fails with an unknown error, but a previous call was successful, because the GPU name and other information are detected correctly. Thank you.

berezinevgeniy commented 5 years ago

It seems to be unsupported CUDA. The latest driver available for me is 391.35, and all of my problem video cards are Fermi, which is not supported by NVIDIA any more. All the other video cards I have are on 417+ drivers and have no problems (after xmrig-nvidia 2.14.1).

P.S. Xmr-stak has a similar error on start with Fermi GeForces after the 2.10.0 update, and works with earlier versions, just like xmrig.

[2019-03-10 18:36:44] : NVIDIA: try to load library 'xmrstak_cuda_backend_cuda10_0'
WARNING: NVIDIA Insufficient driver!
WARNING: NVIDIA no device found
[2019-03-10 18:36:44] : NVIDIA: try to load library 'xmrstak_cuda_backend_cuda9_2'
WARNING: NVIDIA cannot load backend library: xmrstak_cuda_backend_cuda9_2.dll
WARNING: NVIDIA Insufficient driver!
WARNING: NVIDIA no device found
[2019-03-10 18:36:44] : NVIDIA: try to load library 'xmrstak_cuda_backend'
NVIDIA: found 1 potential device's
[2019-03-10 18:56:01] : Starting NVIDIA GPU thread 0, affinity: 36761328.
WARNING: Invalid device ID '36761376'!
[2019-03-10 18:56:01] : Setup failed for GPU 0. Exiting.

It seems something is wrong with the device ID.

I think we need to get debug info for this error. We have no full description of what happened in cudaGetDeviceProperties; usually an unknown error comes with a detailed description. If you like, I can give you TeamViewer access to a PC with a Fermi NVIDIA card for testing, or you can compile a test miner that does not run cudaGetDeviceProperties twice (reusing the result from the previous call, since that one works, as you can see) :-)

Config.json
{
    "algo": "cryptonight",
    "api": {
        "port": 0,
        "access-token": null,
        "id": null,
        "worker-id": null,
        "ipv6": false,
        "restricted": true
    },
    "background": false,
    "colors": true,
    "cuda-bfactor": 6,
    "cuda-bsleep": 25,
    "cuda-max-threads": 64,
    "donate-level": 5,
    "log-file": "c:\xmrig\log.txt",
    "pools": [
        {
            "url": "xxx",
            "user": "xxx",
            "pass": "x",
            "rig-id": null,
            "nicehash": false,
            "keepalive": false,
            "variant": -1,
            "tls": false,
            "tls-fingerprint": null
        }
    ],
    "print-time": 60,
    "retries": 5,
    "retry-pause": 5,
    "threads": [
        {
            "index": 0,
            "threads": 64,
            "blocks": 4,
            "bfactor": 8,
            "bsleep": 25,
            "sync_mode": 3,
            "affine_to_cpu": false
        }
    ],
    "user-agent": null,
    "syslog": false,
    "watch": false
}

berezinevgeniy commented 5 years ago

This build (the CUDA 8 version download) from xmr-stak works with Fermi. That is better than nothing, but I believe you can help with this. Xmrig is faster, simpler, and stable.

Spudz76 commented 5 years ago

I modified the failure exit and also meta-miner so it will relaunch until it works. If it doesn't work, use a larger hammer...

Sometimes it takes something like 10 launches per success. But it works OK for now. Clocking or not makes no difference to the init crash. I did not test underclocking, though. When it does run there are no invalids and no kernel crashes, so I'm pretty sure the clocking is fine.
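The relaunch-until-it-works idea can be sketched as a small wrapper. This is a minimal Python sketch under stated assumptions, not the actual change made to the miner or to meta-miner: the name `run_until_stable`, the attempt cap, and the `min_uptime` threshold used to tell an init crash from a real run are all hypothetical.

```python
import subprocess
import sys
import time


def run_until_stable(cmd, max_attempts=20, min_uptime=30.0, pause=1.0):
    """Relaunch `cmd` until it exits cleanly or survives `min_uptime` seconds.

    A run that dies quickly with a nonzero exit code is treated as an init
    crash and retried. Returns the number of launches used, or None if every
    attempt failed fast.
    """
    for attempt in range(1, max_attempts + 1):
        started = time.monotonic()
        result = subprocess.run(cmd)  # blocks until the process exits
        uptime = time.monotonic() - started
        if result.returncode == 0 or uptime >= min_uptime:
            return attempt
        print(
            f"attempt {attempt}: exit code {result.returncode} "
            f"after {uptime:.1f}s, relaunching",
            file=sys.stderr,
        )
        time.sleep(pause)
    return None
```

With the command line from the log in this comment, a call would look like `run_until_stable(["./xmrig-nvidia80", "--config=config-r.json"])`. The uptime threshold is there so that a miner which initialized fine and was stopped later is not relaunched as if it had crashed at init.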

I can also confirm that xmr-stak somehow never does this at all (even though its init code is basically identical?). The only real difference is that they load their backend as a DLL, while here it's statically linked into the main exe. Maybe chain-loading DLL to DLL just works better than the static launch for some reason? I never quite understood why xmr-stak refuses to link statically (it "should" work "identically"), but a weird unexplained side effect like this might be why. Also, it's more of a CPU miner with GPU plugins, so having a DLL plugin makes sense there (for fully disabling the GPU backends at runtime), but it seems like there are more reasons than just that.

>>> Starting miner: ./xmrig-nvidia80 --config=config-r.json
 * ABOUT        XMRig-NVIDIA/2.14.2-dev MSVC/2015
 * LIBS         libuv/1.23.0 CUDA/8.0 OpenSSL/1.1.1 microhttpd/0.9.59
 * CPU                 Intel(R) Core(TM) i7-3540M CPU @ 3.00GHz x64 AES
 * GPU #0       PCI:0000:01:00 NVS 5200M @ 1390/1976 MHz 10x40 6x25 arch:21 SMX:2 MEM:0/5108MiB
 * ALGO         cryptonight, donate=0%
 * POOL #1      127.0.0.1:3334 variant=r
 * API BIND     [::]:10081
 * COMMANDS     'h' hashrate, 'e' health, 'p' pause, 'r' resume
>>> Miner server on 127.0.0.1:3334 port connected from 127.0.0.1
>>> Pool (gulf.moneroocean.stream:ssl443) <-> miner link was established due to new miner connection

GPU 0: unknown error
cuda_get_deviceinfo line 536
[2019-03-18 12:01:30] Setup failed for GPU 0. Exiting.
!!! Miner socket error
!!! Pool (gulf.moneroocean.stream:ssl443) <-> miner link was broken due to miner socket error
!!! Miner './xmrig-nvidia80 --config=config-r.json' exited with nonzero code 1
>>> Restarting './xmrig-nvidia80 --config=config-r.json' miner that was closed unexpectedly
>>> Starting miner: ./xmrig-nvidia80 --config=config-r.json
 * ABOUT        XMRig-NVIDIA/2.14.2-dev MSVC/2015
 * LIBS         libuv/1.23.0 CUDA/8.0 OpenSSL/1.1.1 microhttpd/0.9.59
 * CPU                 Intel(R) Core(TM) i7-3540M CPU @ 3.00GHz x64 AES
 * GPU #0       PCI:0000:01:00 NVS 5200M @ 1390/1976 MHz 10x40 6x25 arch:21 SMX:2 MEM:0/5116MiB
 * ALGO         cryptonight, donate=0%
 * POOL #1      127.0.0.1:3334 variant=r
 * API BIND     [::]:10081
 * COMMANDS     'h' hashrate, 'e' health, 'p' pause, 'r' resume
>>> Miner server on 127.0.0.1:3334 port connected from 127.0.0.1
>>> Pool (gulf.moneroocean.stream:ssl443) <-> miner link was established due to new miner connection

GPU 0: unknown error
cuda_get_deviceinfo line 536
[2019-03-18 12:01:32] Setup failed for GPU 0. Exiting.
!!! Miner socket error
!!! Pool (gulf.moneroocean.stream:ssl443) <-> miner link was broken due to miner socket error
!!! Miner './xmrig-nvidia80 --config=config-r.json' exited with nonzero code 1
>>> Restarting './xmrig-nvidia80 --config=config-r.json' miner that was closed unexpectedly
>>> Starting miner: ./xmrig-nvidia80 --config=config-r.json
 * ABOUT        XMRig-NVIDIA/2.14.2-dev MSVC/2015
 * LIBS         libuv/1.23.0 CUDA/8.0 OpenSSL/1.1.1 microhttpd/0.9.59
 * CPU                 Intel(R) Core(TM) i7-3540M CPU @ 3.00GHz x64 AES
 * GPU #0       PCI:0000:01:00 NVS 5200M @ 1390/1976 MHz 10x40 6x25 arch:21 SMX:2 MEM:0/5119MiB
 * ALGO         cryptonight, donate=0%
 * POOL #1      127.0.0.1:3334 variant=r
 * API BIND     [::]:10081
 * COMMANDS     'h' hashrate, 'e' health, 'p' pause, 'r' resume
>>> Miner server on 127.0.0.1:3334 port connected from 127.0.0.1
>>> Pool (gulf.moneroocean.stream:ssl443) <-> miner link was established due to new miner connection
[2019-03-18 12:01:34] use pool 127.0.0.1:3334  127.0.0.1
[2019-03-18 12:01:34] new job from 127.0.0.1:3334 diff 845 algo cn/r height 1793551
[2019-03-18 12:02:01] speed 10s/60s/15m n/a n/a n/a H/s max n/a H/s
[2019-03-18 12:02:01]  * GPU #0: 81C FAN 0%
[2019-03-18 12:02:03] accepted (1/0) diff 845 (63 ms)
[2019-03-18 12:02:17] accepted (2/0) diff 845 (69 ms)
[2019-03-18 12:02:25] speed 10s/60s/15m n/a n/a n/a H/s max n/a H/s
[2019-03-18 12:02:25]  * GPU #0: 82C FAN 0%
[2019-03-18 12:02:49] speed 10s/60s/15m n/a n/a n/a H/s max n/a H/s
[2019-03-18 12:02:49]  * GPU #0: 83C FAN 0%
[2019-03-18 12:03:13] speed 10s/60s/15m n/a 29.3 n/a H/s max n/a H/s
[2019-03-18 12:03:13]  * GPU #0: 83C FAN 0%
[2019-03-18 12:03:25] accepted (3/0) diff 845 (61 ms)
[2019-03-18 12:03:37] speed 10s/60s/15m n/a 29.4 n/a H/s max n/a H/s
[2019-03-18 12:03:37]  * GPU #0: 83C FAN 0%
[2019-03-18 12:04:01] speed 10s/60s/15m n/a 29.6 n/a H/s max n/a H/s

Spudz76 commented 5 years ago

Also, it's a Dell laptop, hence no fan reporting (NVidiaInspector just greys that section out as not available), so that is "normal". It might be nice for the fan reading to disappear when unsupported, the way power usage does.

Spudz76 commented 5 years ago

Also, cn-heavy only works at something like 4x4, which is low occupancy and slow (probably around 25% of ideal); anything larger hits memory allocation failures at kernel init (once the init error has been brute-forced).

Most algos work nicely once hand-tuned. Fermi autotuning is not even close on most algos, though. They seem to like threads set to 5 times the SMX count (10 here), with blocks then adjusted until memory allocation no longer fails.
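That hand-tuning rule of thumb can be sketched as a loop. This is an illustration only: `tune_fermi` and the `try_launch` callback are hypothetical names, and stepping blocks downward from a fixed starting value is an assumption, since the direction of adjustment is not specified here.

```python
def tune_fermi(smx, try_launch, threads_per_smx=5, start_blocks=64):
    """Pick threads as 5 x SMX, then lower blocks until a launch succeeds.

    `try_launch(threads, blocks)` is a hypothetical callback that returns True
    when the miner initializes without a memory allocation failure.
    Returns (threads, blocks), or None if no block count worked.
    """
    threads = threads_per_smx * smx
    for blocks in range(start_blocks, 0, -1):
        if try_launch(threads, blocks):
            return threads, blocks
    return None


if __name__ == "__main__":
    # Stand-in for a real launch: pretend allocation fails above 400 work items.
    print(tune_fermi(2, lambda threads, blocks: threads * blocks <= 400))
```

In practice `try_launch` would start the miner with the candidate values and watch for the allocation failure at kernel init, which makes each step slow, so a coarser step or a bisection over blocks may be preferable.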

Spudz76 commented 5 years ago

(The MEM: part of the detection line is a hack I'm working on; it obviously doesn't work yet.)

Spudz76 commented 5 years ago

PR #255 fixes this cuda_get_deviceinfo init-crash

although nobody really knows why

It also comes with the memory reporting, and support for -DCUDA_ARCH=21 by itself (2% faster on mine vs. "20" code).

Spudz76 commented 5 years ago

@berezinevgeniy don't forget to come back sometime and check this thread