Open Jimmy-Z opened 6 years ago
1 platform(s) found:
=== 0x0283b480 ===
name : NVIDIA CUDA
vendor : NVIDIA Corporation
profile : FULL_PROFILE
version : OpenCL 1.2 CUDA 9.1.75
1 device(s) found:
=== 0x0283b890 ===
name : GeForce GT 730
vendor : NVIDIA Corporation
version : OpenCL 1.1 CUDA
C version : OpenCL C 1.1
max compute units : 2
max work group size : 1024
type : GPU
available : yes
compiler available : yes
endian : little
frequency : 1400
global memory : 2147483648
local memory : 49152
Since you wanted it here: I was able to brute-force the msky test with no problem, it was just really slow. I cancelled the mii brute-force after it reached offset 1, but the out of resources problem seems to be fixed for that at least :)
I also tried this version on my 1060 and it lowered the speed by about 10%. Overclocked, I used to get about 700 M/s; now I get 630 M/s with this test build.
Hi! I got the "out of resources" error running seedminer gpu, but I'm quite out of the loop in the 3DS hacking scene and I'm not really into these things. However, I'm glad to help by providing as much information as possible!
What I got running your test build exe:
selected device GeForce GT 545 on platform NVIDIA CUDA
mbed TLS 2.7.0, AES-NI supported
self-test/benchmark mode
AES Key: 0d0b8bd02564dd0351d7e415e6f23f36
randomize source buffer using AES OFB
0.119 seconds for preparing test data, 562.03 MB/s
0.006 seconds for OpenCL compiling
0.031 seconds for data upload, 2195.25 MB/s
# sha1_16_test on 64 MB
0.047 seconds for OpenCL, 1419.57 MB/s
0.033 seconds for data download, 2059.82 MB/s
0.630 seconds for reference C(single thread), 106.49 MB/s
sha1_16_test: succeed
# aes_enc_128_test on 64 MB
0.339 seconds for OpenCL, 198.01 MB/s
0.019 seconds for data download, 3495.44 MB/s
0.202 seconds for reference C(single thread), 332.98 MB/s
aes_enc_128_test: succeed
# aes_dec_128_test on 64 MB
0.385 seconds for OpenCL, 174.09 MB/s
0.018 seconds for data download, 3667.35 MB/s
aes_dec_128_test: succeed
Press any key to continue . . .
seedminer gpu command output:
GPU selected
New3DS msed
LFCS : 0x3d835e8
msed3 est : 0x80c4e550
Error est : -3516
ID0 hash 0: 199aa39d36207269e63a7d4402b97d32
Hash total: 1
movable_part2.sed generation success
bfcl msky e835d803020000000000000050e5c480 199aa39d36207269e63a7d4402b97d32 00000000
selected device GeForce GT 545 on platform NVIDIA CUDA
0.011 seconds for OpenCL compiling
local work size: 1024
ocl_assert: ocl_brute.c, function ocl_brute_msky, line 383
clEnqueueReadBuffer(command_queue, mem_out, CL_TRUE, 0, sizeof(cl_uint), &out, 0, NULL, NULL)
error: out of resources
My current setup:
Microsoft Windows 10 (10.0) Professional 64-bit (Build 16299)
Intel i7 2600 @3.40GHz
14GB DDR3 RAM dual channel
DirectX Version 12.0
NVIDIA GeForce GT 545, 3 GB DDR3
GPU Manufacturer: Micro-Star International Co., Ltd. (MSI)
Driver version 390.77
API Direct3D version 11.2
144 CUDA Cores
Win32_VideoController DriverVersion = 23.21.13.9077
Win32_VideoController DriverDate = 01/23/2018
If you want me to test something, or if you need further information, just let me know. :)
@A7F thanks, but you should run that test build with the two test commands I gave in the OP.
Operating System: Windows 10 Pro, 64-bit
GPU: GeForce GT 750M
GPU RAM: 2048 MB GDDR5
Driver version: 381.65
bfcl info:
name : NVIDIA CUDA
vendor : NVIDIA Corporation
profile : FULL_PROFILE
version : OpenCL 1.2 CUDA 8.0.0
1 device(s) found:
=== 0x00141430 ===
name : GeForce GT 750M
vendor : NVIDIA Corporation
version : OpenCL 1.2 CUDA
C version : OpenCL C 1.2
max compute units : 2
max work group size : 1024
type : GPU
available : yes
compiler available : yes
endian : little
frequency : 967
global memory : 2147483648
local memory : 49152
py -3 seedminer_launcher3.py gpu:
selected device GeForce GT 750M on platform NVIDIA CUDA
0.015 seconds for OpenCL compiling
local work size: 1024
ocl_assert: ocl_brute.c, function ocl_brute_msky, line 383
clEnqueueReadBuffer(command_queue, mem_out, CL_TRUE, 0, sizeof(cl_uint), &out, 0, NULL, NULL)
error: out of resources
bfcl msky c27164f2e0994db8000000007dd5c901 afcb0cc132bd2aeb8e0a6b6a841c51c0:
selected device GeForce GT 750M on platform NVIDIA CUDA
0.290 seconds for OpenCL compiling
local work size: 1024
got a hit: c27164f2e0994db82e3d14737dd5c901
24.48 seconds, 78.88 M/s
bfcl lfcs 00000007 0000 17f5c00d8b581e5e: How long should I expect this one to take? It hasn't thrown the "out of resources" error but it's taking a while.
bfcl info
1 platform(s) found:
=== 0x0011f270 ===
name : NVIDIA CUDA
vendor : NVIDIA Corporation
profile : FULL_PROFILE
version : OpenCL 1.2 CUDA 9.1.84
1 device(s) found:
=== 0x0011e8c0 ===
name : GeForce GT 545
vendor : NVIDIA Corporation
version : OpenCL 1.1 CUDA
C version : OpenCL C 1.1
max compute units : 3
max work group size : 1024
type : GPU
available : yes
compiler available : yes
endian : little
frequency : 1440
global memory : 3221225472
local memory : 49152
bfcl msky c27164f2e0994db8000000007dd5c901 afcb0cc132bd2aeb8e0a6b6a841c51c0
selected device GeForce GT 545 on platform NVIDIA CUDA
0.230 seconds for OpenCL compiling
local work size: 1024
got a hit: c27164f2e0994db82e3d14737dd5c901
38.61 seconds, 50.00 M/s
The first command doesn't report "out of resources"; it only shows this:
selected device GeForce GT 545 on platform NVIDIA CUDA
0.003 seconds for OpenCL compiling
local work size: 1024
0
Am I supposed to wait? It sat at that output for something like 20 minutes.
@A7F Sorry, I should have added that if the test command runs for a few seconds without an "out of resources" error, it's safe to cancel it with Ctrl-C.
This happens with overclocked GPU cards. Mine is a GTX 960 with 4 GB GDDR5, overclockable.
1 platform(s) found:
=== 0x007aa420 ===
name : NVIDIA CUDA
vendor : NVIDIA Corporation
profile : FULL_PROFILE
version : OpenCL 1.2 CUDA 9.1.84
1 device(s) found:
=== 0x007a97a0 ===
name : GeForce GTX 960
vendor : NVIDIA Corporation
version : OpenCL 1.2 CUDA
C version : OpenCL C 1.2
max compute units : 8
max work group size : 1024
type : GPU
available : yes
compiler available : yes
endian : little
frequency : 1253
global memory : 0
local memory : 49152
I could NOT accomplish my task with an NVIDIA 920M (yes, this is a laptop GPU).
Windows 10 Home 64-bit
16 GB of DDR3 RAM
2048MB GDDR3
nVidia 384.94
384 CUDA Cores
The error is:
selected device GeForce 920M on platform NVIDIA CUDA
0.018 seconds for OpenCL compiling
local work size: 1024
ocl_assert: ocl_brute.c, function ocl_brute_msky, line 383
clEnqueueReadBuffer(command_queue, mem_out, CL_TRUE, 0, sizeof(cl_uint), &out, 0, NULL, NULL)
error: out of resources
So, I have downloaded the test build and issued a few commands:
> bfcl
selected device GeForce 920M on platform NVIDIA CUDA
mbed TLS 2.7.0, AES-NI supported
self-test/benchmark mode
AES Key: 0d0b8bd02564dd0351d7e415e6f23f36
randomize source buffer using RDRAND
1.000 seconds for preparing test data, 67.09 MB/s
0.451 seconds for OpenCL compiling
0.061 seconds for data upload, 1104.31 MB/s
# sha1_16_test on 64 MB
0.031 seconds for OpenCL, 2161.66 MB/s
0.057 seconds for data download, 1180.31 MB/s
0.631 seconds for reference C(single thread), 106.35 MB/s
sha1_16_test: succeed
# aes_enc_128_test on 64 MB
0.532 seconds for OpenCL, 126.17 MB/s
0.048 seconds for data download, 1402.16 MB/s
0.251 seconds for reference C(single thread), 266.87 MB/s
aes_enc_128_test: succeed
# aes_dec_128_test on 64 MB
0.533 seconds for OpenCL, 125.80 MB/s
0.048 seconds for data download, 1400.84 MB/s
aes_dec_128_test: succeed
> bfcl info
2 platform(s) found:
=== 0x026d43c0 ===
name : Intel(R) OpenCL
vendor : Intel(R) Corporation
profile : FULL_PROFILE
version : OpenCL 1.2
2 device(s) found:
=== 0x02701e00 ===
name : Intel(R) HD Graphics 4400
vendor : Intel(R) Corporation
version : OpenCL 1.2
C version : OpenCL C 1.2
max compute units : 20
max work group size : 512
type : GPU
available : yes
compiler available : yes
endian : little
frequency : 1000
global memory : 1708759450
local memory : 65536
=== 0x026ed8c0 ===
name : Intel(R) Core(TM) i5-4210U CPU @ 1.70GHz
vendor : Intel(R) Corporation
version : OpenCL 1.2 (Build 10094)
C version : OpenCL C 1.2
max compute units : 4
max work group size : 8192
type : CPU
available : yes
compiler available : yes
endian : little
frequency : 1700
global memory : 4211548160
local memory : 32768
=== 0x0272f260 ===
name : NVIDIA CUDA
vendor : NVIDIA Corporation
profile : FULL_PROFILE
version : OpenCL 1.2 CUDA 9.0.125
1 device(s) found:
=== 0x0272f300 ===
name : GeForce 920M
vendor : NVIDIA Corporation
version : OpenCL 1.2 CUDA
C version : OpenCL C 1.2
max compute units : 2
max work group size : 1024
type : GPU
available : yes
compiler available : yes
endian : little
frequency : 954
global memory : 2147483648
local memory : 49152
Of course, this is a laptop and the Intel integrated GPU is also available, but bfcl ignores it, as it should.
bfcl msky ...............................
selected device GeForce 920M on platform NVIDIA CUDA
0.289 seconds for OpenCL compiling
local work size: 1024
got a hit: c27164f2e0994db82e3d14737dd5c901
36.93 seconds, 52.27 M/s
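For anyone comparing numbers: the three msky test results posted in this thread look like the same amount of work done at different speeds. Below is a rough check, assuming "M/s" means millions of candidates per second (the thread never defines it). The last line reads the four bytes that differ between the command argument and the hit as a little-endian counter; that interpretation is my guess, not something confirmed here.

```python
# (seconds, M/s) pairs copied from the msky test reports in this thread.
reports = {
    "GeForce GT 750M": (24.48, 78.88),
    "GeForce GT 545": (38.61, 50.00),
    "GeForce 920M": (36.93, 52.27),
}


def candidates_tested(seconds, mega_per_second):
    """Approximate number of candidates tried before the hit was found."""
    return seconds * mega_per_second * 1_000_000


for gpu, (seconds, rate) in reports.items():
    billions = candidates_tested(seconds, rate) / 1e9
    print(f"{gpu}: ~{billions:.2f} billion candidates")

# Assumption: the bytes 2e3d1473 (zero in the command, filled in the hit)
# are a little-endian 32-bit counter.
guessed_counter = int.from_bytes(bytes.fromhex("2e3d1473"), "little")
print(f"guessed counter value: {guessed_counter:,}")
```

If that reading is right, a slower card simply needs proportionally longer to reach the same hit, so the run time of this test is mostly a speed measurement.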
@Jimmy-Z - Would it be possible for you to push the commits for the test build? They are greatly needed! Thanks!
Sorry for the delay, just committed the changes, @zoogie
Related:
https://github.com/zoogie/seedminer/issues/16
https://gbatemp.net/posts/7851408/
https://gbatemp.net/posts/7879961/
https://gbatemp.net/posts/7826386/
https://gbatemp.net/posts/7851868/
https://gbatemp.net/posts/7884725/
https://gbatemp.net/posts/7881547/
I need testers and reports that include the following info:
your GPU model, GPU RAM size, OS version, driver version.
bfcl info output
does seedminer's GPU mode throw an "out of resources" error for you? (yes, I also need successful reports)
if it does, try the following build with the two test commands below: does it also say "out of resources"?
Test build: bfCL-test-reduced-work-size-msky-lfcs-20.zip
Two test commands:
bfcl lfcs 00000007 0000 17f5c00d8b581e5e
bfcl msky c27164f2e0994db8000000007dd5c901 afcb0cc132bd2aeb8e0a6b6a841c51c0
Techy stuff
Despite what it looks like, this doesn't mean your GPU isn't powerful or big enough: this program works on Intel iGPUs and uses only several KB of GPU RAM. To me it looks more like an OpenCL runtime bug from NVIDIA.
Reducing the work size (from a little above 100,000,000 to 1,000,000) helped this guy with a GTX 980, so I guess the work size is the problem.
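To show what a reduced work size means in practice, here is a small sketch. It is not the actual bfCL code: `split_work` is a hypothetical helper, and the round total of 100,000,000 stands in for the "little above 100,000,000" figure. Each (offset, size) pair would correspond to one smaller kernel launch instead of a single huge one.

```python
def split_work(total_items, max_batch):
    """Yield (offset, size) pairs covering total_items in batches of at most max_batch."""
    offset = 0
    while offset < total_items:
        size = min(max_batch, total_items - offset)
        yield offset, size
        offset += size


# Illustrative numbers only: 100,000,000 items in batches of 1,000,000.
batches = list(split_work(100_000_000, 1_000_000))
print(len(batches), "batches; first:", batches[0], "last:", batches[-1])
```

The total amount of work stays the same; only the size of each individual launch shrinks, which may explain the speed drop some people report with the test build.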
from OpenCL SDK document:
The NVIDIA runtime reports address bits = 64 for the GTX 980, and 100,000,000 is nowhere near that limit.
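To put numbers on "nowhere near": assuming the limit in question is 2 raised to the reported address bits (my reading of the point above), the old work size is tiny even against a 32-bit limit.

```python
work_size = 100_000_000  # roughly the original work size mentioned above

for address_bits in (32, 64):
    limit = 2 ** address_bits
    fraction = work_size / limit
    print(f"{address_bits}-bit limit: {limit:,} (work size is {fraction:.2e} of it)")
```

So the documented size limit alone should not be what triggers the error.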