Closed: devurandom closed this issue 9 years ago
Is this an OpenCL 1.1 driver (as opposed to 1.2)?
@sayantan this is just a dry-run guess of mine
diff --git a/src/opencl_rawmd4_fmt_plug.c b/src/opencl_rawmd4_fmt_plug.c
index 4d8e6e6..576ce37 100644
--- a/src/opencl_rawmd4_fmt_plug.c
+++ b/src/opencl_rawmd4_fmt_plug.c
@@ -736,11 +736,13 @@ static int crypt_all(int *pcount, struct db_salt *salt)
if (!is_static_gpu_mask)
HANDLE_CLERROR(clEnqueueWriteBuffer(queue[gpu_id], buffer_int_key_loc, CL_TRUE, 0, 4 * global_work_size, saved_int_key_loc, 0, NULL, multi_profilingEvent[2]), "failed in clEnqueueWriteBuffer buffer_int_key_loc.");
+#if CL_VERSION_1_2
if (ocl_ver >= 120) {
zero = 0;
HANDLE_CLERROR(clEnqueueFillBuffer(queue[gpu_id], buffer_hash_ids, &zero, sizeof(cl_uint), 0, sizeof(cl_uint), 0, NULL, multi_profilingEvent[3]), "failed in clEnqueueFillBuffer buffer_hash_ids.");
HANDLE_CLERROR(clEnqueueFillBuffer(queue[gpu_id], buffer_bitmap_dupe, &zero, sizeof(cl_uint), 0, sizeof(cl_uint) * (hash_table_size/32 + 1), 0, NULL, multi_profilingEvent[4]), "failed in clEnqueueFillBuffer buffer_bitmap_dupe.");
}
+#endif
if (salt != NULL && salt->count > 100 &&
(num_loaded_hashes - num_loaded_hashes / 10) > salt->count) {
Similar fix for all three.
@magnumripper Yes, as far as I know, Mesa supports OpenCL 1.1 only. See e.g. the Gallium Compute feature matrix or various posts on Phoronix.
Hopefully fixed in 59c5b42, please try it out and report back.
On a side note, the __OPENCL_VERSION__ macro cannot be used (I presume it's only available in device code), but CL_VERSION_1_2 works fine.
Sorry, still does not work, as Mesa advertises OpenCL 1.2 in its headers, declares the function, but does not define it:
# grep -B1 CL_VERSION /usr/include/CL/cl.h
/* OpenCL Version */
#define CL_VERSION_1_0 1
#define CL_VERSION_1_1 1
#define CL_VERSION_1_2 1
# grep -B1 -A8 clEnqueueFillBuffer /usr/include/CL/cl.h
extern CL_API_ENTRY cl_int CL_API_CALL
clEnqueueFillBuffer(cl_command_queue /* command_queue */,
cl_mem /* buffer */,
const void * /* pattern */,
size_t /* pattern_size */,
size_t /* offset */,
size_t /* size */,
cl_uint /* num_events_in_wait_list */,
const cl_event * /* event_wait_list */,
cl_event * /* event */) CL_API_SUFFIX__VERSION_1_2;
# scanelf -s clEnqueueFillBuffer /usr/lib/libOpenCL.so
TYPE SYM FILE
ET_DYN - /usr/lib/libOpenCL.so
Does OpenCL define a way to check the implementation vendor or version at compile time? Maybe that could be used to hack-around this issue?
Does OpenCL define a way to check the implementation vendor or version at compile time?
Not that I know of. You should try to find some tell-tale macro or something.
I reported a bug against Mesa: https://bugs.freedesktop.org/show_bug.cgi?id=91130
Try gcc -dM -E -x c /usr/include/CL/cl.h | grep -i mesa (and try replacing 'mesa' with some other guesses) and see if there is some macro we can use (one that is not also present in the output of gcc -dM -E -x c /dev/null).
As a last resort we could test this function's existence in the configure script.
$ gcc -dM -E -x c /usr/include/CL/cl.h | grep -i vendor
#define CL_PLATFORM_VENDOR 0x0903
#define CL_DEVICE_VENDOR 0x102C
#define CL_DEVICE_VENDOR_ID 0x1001
$ gcc -dM -E -x c /usr/include/CL/cl.h | grep -i version
#define CL_EXT_SUFFIX__VERSION_1_0_DEPRECATED __attribute__((deprecated))
#define CL_PLATFORM_VERSION 0x0901
#define __GXX_ABI_VERSION 1002
#define CL_DRIVER_VERSION 0x102D
#define __VERSION__ "4.9.2"
#define CL_EXT_PREFIX__VERSION_1_0_DEPRECATED
#define CL_EXT_SUFFIX__VERSION_1_1_DEPRECATED __attribute__((deprecated))
#define CL_VERSION_1_0 1
#define CL_VERSION_1_1 1
#define CL_VERSION_1_2 1
#define CL_DEVICE_OPENCL_C_VERSION 0x103D
#define CL_DEVICE_VERSION 0x102F
#define CL_EXT_SUFFIX__VERSION_1_0
#define CL_EXT_SUFFIX__VERSION_1_1
#define CL_EXT_SUFFIX__VERSION_1_2
#define CL_API_SUFFIX__VERSION_1_0
#define CL_API_SUFFIX__VERSION_1_1
#define CL_API_SUFFIX__VERSION_1_2
#define CL_EXT_PREFIX__VERSION_1_1_DEPRECATED
$ gcc -dM -E -x c /usr/include/CL/cl.h | grep -i driver
#define CL_DRIVER_VERSION 0x102D
$ gcc -dM -E -x c /usr/include/CL/cl.h | grep -i platform
#define CL_PLATFORM_VENDOR 0x0903
#define CL_CONTEXT_PLATFORM 0x1084
#define CL_INVALID_PLATFORM -32
#define CL_PLATFORM_VERSION 0x0901
#define CL_DEVICE_PLATFORM 0x1031
#define CL_PLATFORM_NAME 0x0902
#define CL_PLATFORM_PROFILE 0x0900
#define __CL_PLATFORM_H
#define CL_PLATFORM_EXTENSIONS 0x0904
Unfortunately, I get these on OS X Yosemite as well. I think Mesa just copied Khronos' headers (which is sensible, but they should have used the 1.1 headers instead of the 1.2 ones).
$ gcc -dM -E -x c /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.10.sdk/System/Library/Frameworks/OpenCL.framework/Versions/A/Headers/cl.h | grep -i vendor
#define CL_PLATFORM_VENDOR 0x0903
#define CL_DEVICE_VENDOR 0x102C
#define CL_DEVICE_VENDOR_ID 0x1001
$ gcc -dM -E -x c /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.10.sdk/System/Library/Frameworks/OpenCL.framework/Versions/A/Headers/cl.h | grep -i driver
#define CL_DRIVER_VERSION 0x102D
$ gcc -dM -E -x c /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.10.sdk/System/Library/Frameworks/OpenCL.framework/Versions/A/Headers/cl.h | grep -i device_version
#define CL_DEVICE_VERSION 0x102F
$ gcc -dM -E -x c /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.10.sdk/System/Library/Frameworks/OpenCL.framework/Versions/A/Headers/cl.h | grep -i platform
#define CL_PLATFORM_VENDOR 0x0903
#define CL_CONTEXT_PLATFORM 0x1084
#define CL_PLATFORM_PROFILE 0x0900
#define CL_PLATFORM_VERSION 0x0901
#define CL_DEVICE_PLATFORM 0x1031
#define CL_PLATFORM_NAME 0x0902
#define CL_INVALID_PLATFORM -32
#define __CL_PLATFORM_H
#define CL_PLATFORM_EXTENSIONS 0x0904
Did you change something related to this recently?
In any case: d8cb9ce98acd26bd917dff4cbb54bdc14b7133f9 compiles with Mesa 10.6.1.
I tested using john --test=0 --verbosity=5, but that managed to crash X (and JtR): https://bugs.freedesktop.org/show_bug.cgi?id=91305 (which is not related to this bug, just FYI).
Yes, I believe Sayantan stopped using those OpenCL 1.2 functions for now, so I guess we're good? Please report other problems as separate issues. Did you see what format was running when all crashed?
Did you see what format was running when all crashed?
Sure, it's all in the logfiles attached to the bug:
Testing: sha512crypt-opencl, crypt(3) $6$ (rounds=5000) [SHA512 OpenCL]... Local worksize (LWS) 7, global worksize (GWS) 49
FAILED (cmp_all(1))
Options used: -I /usr/share/john/kernels -cl-mad-enable -D__GPU__ -DDEVICE_INFO=10 -DDEV_VER_MAJOR=10 -DDEV_VER_MINOR=6 -D_OPENCL_COMPILER
Build log: input.cl:95:43: warning: unknown attribute 'max_constant_size' ignored
input.cl:99:43: warning: unknown attribute 'max_constant_size' ignored
Options used: -I /usr/share/john/kernels -cl-mad-enable -D__GPU__ -DDEVICE_INFO=10 -DDEV_VER_MAJOR=10 -DDEV_VER_MINOR=6 -D_OPENCL_COMPILER
Local worksize (LWS) 8, Global worksize (GWS) 56
Testing: descrypt-opencl, traditional crypt(3) [DES OpenCL]... radeon: Failed to allocate virtual address for buffer:
radeon: size : 4 bytes
radeon: alignment : 4096 bytes
radeon: domains : 2
radeon: va : 0x00000000023d6000
radeon: Failed to allocate virtual address for buffer:
radeon: size : 4 bytes
radeon: alignment : 4096 bytes
radeon: domains : 2
radeon: va : 0x0000000001fcb000
FAILED (cmp_one(0))
radeon: Failed to allocate virtual address for buffer:
radeon: size : 1 bytes
radeon: alignment : 1 bytes
radeon: domains : 2
radeon: va : 0x0000000001fcb000
radeon: Failed to allocate virtual address for buffer:
radeon: size : 1 bytes
radeon: alignment : 1 bytes
radeon: domains : 2
radeon: va : 0x0000000000858000
Segmentation fault
Ahaa, @Sayantan2048 this may be another problem with goto in descrypt-opencl (just a guess though).
@devurandom you could try this change and see if it makes it safer
diff --git a/src/opencl_DES_kernel_params.h b/src/opencl_DES_kernel_params.h
index 1acd8b9..d1586e8 100644
--- a/src/opencl_DES_kernel_params.h
+++ b/src/opencl_DES_kernel_params.h
@@ -14,14 +14,7 @@ typedef unsigned WORD vtype;
* OSX' Intel HD4000 driver [1.2(Sep25 2014 22:26:04)] fails building the
* "fast goto" version.
*/
-#if nvidia_sm_5x(DEVICE_INFO) || gpu_intel(DEVICE_INFO) || \
- (gpu_amd(DEVICE_INFO) && DEV_VER_MAJOR >= 1573 && !defined(__Tahiti__)) || \
- (gpu_amd(DEVICE_INFO) && DEV_VER_MAJOR >= 1702)
-//#warning Using 'safe goto' kernel
#define SAFE_GOTO
-#else
-//#warning Using 'fast goto' kernel
-#endif
#if no_byte_addressable(DEVICE_INFO)
#define RV7xx
There is a formatting issue on the first line removed: "||
Now it applies and I'll report back with the results shortly.
Your patch gets me a bit further (running /usr/sbin/john --test=0 --verbosity=5 as a regular non-root user):
Testing: mscash-opencl, M$ Cache Hash [MD4 OpenCL]... Options used: -I /usr/share/john/kernels -cl-mad-enable -D__GPU__ -DDEVICE_INFO=10 -DDEV_VER_MAJOR=10 -DDEV_VER_MINOR=6 -D_OPENCL_COMPILER -D NUM_INT_KEYS=1 -D IS_STATIC_GPU_MASK=0 -D CONST_CACHE_SIZE=268435456 -D LOC_0=-1 -D LOC_1=-1 -D LOC_2=-1 -D LOC_3=-1
Build log: input.cl:365:18: warning: unknown attribute 'max_constant_size' ignored
input.cl:375:18: warning: unknown attribute 'max_constant_size' ignored
Self test GWS: 64, LWS: 8
FAILED (cmp_all(0))
Options used: -I /usr/share/john/kernels -cl-mad-enable -D__GPU__ -DDEVICE_INFO=10 -DDEV_VER_MAJOR=10 -DDEV_VER_MINOR=6 -D_OPENCL_COMPILER -D SALT_BUFFER_SIZE=260
Build log: input.cl:586:17: warning: unknown attribute 'max_constant_size' ignored
binary size 107038
Error creating binary file $JOHN/kernels/pbkdf2_kernel_-D_SALT_BUFFER_SIZE=260_-DDEV_VER_MAJOR=10_-DDEV_VER_MINOR=6AMD_KAVERI_0.bin
Device 0 GWS: 8192, LWS: 8
Testing: mscash2-opencl, MS Cache Hash 2 (DCC2) [PBKDF2-SHA1 OpenCL]... radeon: The kernel rejected CS, see dmesg for more information.
radeon: The kernel rejected CS, see dmesg for more information.
radeon: The kernel rejected CS, see dmesg for more information.
radeon: The kernel rejected CS, see dmesg for more information.
radeon: The kernel rejected CS, see dmesg for more information.
radeon: The kernel rejected CS, see dmesg for more information.
radeon: The kernel rejected CS, see dmesg for more information.
radeon: Failed to allocate virtual address for buffer:
radeon: size : 131072 bytes
radeon: alignment : 4096 bytes
radeon: domains : 2
radeon: va : 0x00000000020b2000
radeon: Failed to allocate virtual address for buffer:
radeon: size : 131072 bytes
radeon: alignment : 4096 bytes
radeon: domains : 2
radeon: va : 0x00000000020b2000
Bus error
At this point X crashed again. After this, without rebooting, john will immediately segfault:
$ /usr/sbin/john --test=0 --verbosity=5
radeon: Failed to allocate virtual address for buffer:
radeon: size : 4352 bytes
radeon: alignment : 4096 bytes
radeon: domains : 4
radeon: va : 0x0000000000800000
radeon: Failed to allocate virtual address for buffer:
radeon: size : 4352 bytes
radeon: alignment : 4096 bytes
radeon: domains : 4
radeon: va : 0x0000000000800000
Segmentation fault
For the full details, please see https://bugs.freedesktop.org/show_bug.cgi?id=91315.
Thanks. We should apply a (proper) patch for descrypt to start with. Would you please post the output from --list=opencl-devices on this system?
I'll report back next week, when I have access to the machine again. One piece of information already, though: I recently updated to Mesa 10.6.1.
It's mscash2-opencl crashing now. It is also by @Sayantan2048 (which doesn't imply his code is bad; it's probably just that he's better at squeezing out performance, which is more prone to reveal driver bugs).
# john --list=opencl-devices
Invalid MIT-MAGIC-COOKIE-1 keyInvalid MIT-MAGIC-COOKIE-1 keyInvalid MIT-MAGIC-COOKIE-1 keyInvalid MIT-MAGIC-COOKIE-1 key
Platform #0 name: Clover
Platform version: OpenCL 1.1 MESA 10.6.1
Device #0 (0) name: AMD KAVERI
Device vendor: AMD
Device type: GPU (LE)
Device version: OpenCL 1.1 MESA 10.6.1
Driver version: 10.6.1 - Catalyst
Native vector widths: char 16, short 8, int 4, long 2
Preferred vector width: char 16, short 8, int 4, long 2
Global Memory: 1024.0 MB
Local Memory: 32.0 KB (Local)
Max memory alloc. size: 256.2 MB
Max clock (MHz): 720
Max Work Group Size: 256
Parallel compute cores: 8
Stream processors: 640 (8 x 80)
Device #1 (1) name: AMD REDWOOD
Device vendor: AMD
Device type: GPU (LE)
Device version: OpenCL 1.1 MESA 10.6.1
Driver version: 10.6.1 - Catalyst
Native vector widths: char 16, short 8, int 4, long 2
Preferred vector width: char 16, short 8, int 4, long 2
Global Memory: 1024.0 MB
Local Memory: 32.0 KB (Local)
Max memory alloc. size: 256.2 MB
Max clock (MHz): 700
Max Work Group Size: 256
Parallel compute cores: 5
Stream processors: 400 (5 x 80)
PCI device topology:
Great, we can look for "MESA" in the device version string. I will look at fixing descrypt.
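A minimal sketch of that idea in C, assuming the device version string (as returned for CL_DEVICE_VERSION by clGetDeviceInfo) has the form shown in the listing above, "OpenCL 1.1 MESA 10.6.1". The helper names parse_ocl_version and is_mesa_version_string are hypothetical and not taken from the JtR source, and the major * 100 + minor * 10 scale is an assumption chosen to line up with the ocl_ver >= 120 comparison in the earlier patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: turn a device version string such as
 * "OpenCL 1.1 MESA 10.6.1" into major * 100 + minor * 10 (110 here).
 * Returns 0 if the string does not start with "OpenCL <major>.<minor>". */
static int parse_ocl_version(const char *device_version)
{
    int major = 0, minor = 0;

    assert(device_version != NULL);
    if (sscanf(device_version, "OpenCL %d.%d", &major, &minor) != 2)
        return 0;
    return major * 100 + minor * 10;
}

/* Hypothetical helper: non-zero if the vendor-specific part of the
 * version string mentions Mesa. */
static int is_mesa_version_string(const char *device_version)
{
    assert(device_version != NULL);
    return strstr(device_version, "MESA") != NULL;
}
```

With the string reported by the Kaveri device above, the first helper would yield 110 and the second would report Mesa, so both a version gate and a Mesa-specific workaround could be keyed off the same string.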
descrypt should be fixed in 8b53844, and we now have our own __MESA__ macro to use for Mesa-specific workarounds. Please confirm that the latest source runs descrypt fine as-is.
BTW, other changes happened to many formats too, so you might want to re-test everything.
New issue:
opencl_lm_b_plug.o: In function `set_kernel_args_gws':
opencl_lm_b_plug.c:(.text+0x15ea): undefined reference to `clGetKernelArgInfo'
opencl_lm_b_plug.o: In function `set_kernel_args.isra.1':
opencl_lm_b_plug.c:(.text+0x1dc5): undefined reference to `clGetKernelArgInfo'
opencl_lm_b_plug.c:(.text+0x1e4a): undefined reference to `clGetKernelArgInfo'
collect2: error: ld returned 1 exit status
This is again caused by Mesa incorrectly defining CL_VERSION_1_2.
Please retry with the latest commit. We now check the actual library for OpenCL 1.2 and don't just trust the headers. Mesa needs to correct this, though; otherwise there will be many problems, just not with JtR Jumbo.
Sorry, I cannot do anything about it. Mesa has not reacted to my bug report yet.
In his comment on upstream bug #91130 Serge Martin mentions that a patch series exists to fix the issue upstream.
If I understand him right, we'd be OK with that even if they are stubs. As long as we can link a working JtR binary, we can do the right things at run-time.
But like I said, we're already fine now (or should be) with the extra check in autoconf. Mesa should fix it to avoid problems with other programs.
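To illustrate why stubs would be enough, here is a rough sketch of run-time gating in C. It is not JtR's actual code: fill_fn, host_fill, stub_fill and zero_buffer are hypothetical names, and the "fill" entry point works on host memory instead of a cl_mem buffer so that the example runs without any OpenCL library. The point is only that a symbol which links but does not work is never relied upon when the run-time version is below 1.2, and that a failing call still falls back to the older path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Which path ended up zeroing the buffer. */
enum zero_path { ZERO_VIA_FILL, ZERO_VIA_WRITE };

/* Hypothetical stand-in for a 1.2-only entry point such as
 * clEnqueueFillBuffer. Returns 0 on success. */
typedef int (*fill_fn)(void *buf, const void *pattern,
                       size_t pattern_size, size_t size);

/* Working implementation: repeat the pattern across the buffer. */
static int host_fill(void *buf, const void *pattern,
                     size_t pattern_size, size_t size)
{
    size_t off;

    if (pattern_size == 0 || size % pattern_size != 0)
        return -1;
    for (off = 0; off < size; off += pattern_size)
        memcpy((char *)buf + off, pattern, pattern_size);
    return 0;
}

/* Stub: links fine but always fails, like an unimplemented entry point. */
static int stub_fill(void *buf, const void *pattern,
                     size_t pattern_size, size_t size)
{
    (void)buf; (void)pattern; (void)pattern_size; (void)size;
    return -1;
}

/* Use the fill path only if the run-time version allows it and the
 * call succeeds; otherwise fall back to a plain write of zeros. */
static enum zero_path zero_buffer(int ocl_ver, fill_fn fill,
                                  void *buf, size_t size)
{
    static const unsigned int zero = 0;

    assert(buf != NULL);
    if (ocl_ver >= 120 && fill != NULL &&
        fill(buf, &zero, sizeof(zero), size) == 0)
        return ZERO_VIA_FILL;

    memset(buf, 0, size);
    return ZERO_VIA_WRITE;
}
```

Calling zero_buffer(110, stub_fill, buf, size) never touches the stub, which mirrors the situation on a Mesa device reporting OpenCL 1.1.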
After 37c88d9 there should be an LM-opencl format testable with Mesa.
(xz-compressed build.log: The .png extension is just a disguise to fool GitHub - rename the file to build.log.xz after downloading it.)