clMathLibraries / clBLAS

a software library containing BLAS functions written in OpenCL
Apache License 2.0

OpenCL error -11 during test-short (clBLAS/src/library/blas/xtrsm.cc) #212

Closed nathan-sixnines closed 8 years ago

nathan-sixnines commented 8 years ago

I am having trouble running test-short.

This is the output from my build:

nathan@amdrig6:~/clBLAS/src/build/library$ ./test-short
Initialize OpenCL and clblas...
---- Advanced Micro Devices, Inc.
SetUp: about to create command queues

Test environment:

Device name: Tahiti
Device vendor: Advanced Micro Devices, Inc.
Platform (bit): Linux
clblas version: 2.10.0
Driver version: 1800.8 (VM)
Device version: OpenCL 1.2 AMD-APP (1800.8)
Global mem size: 3036 MB

[==========] Running 10096 tests from 125 test cases.
[----------] Global test environment set-up.
[----------] 4 tests from TRSM_extratest
[ RUN      ] TRSM_extratest.strsm
Calling reference xTRSM routine...
Calling clblas xTRSM routine... Done
[       OK ] TRSM_extratest.strsm (468 ms)
[ RUN      ] TRSM_extratest.dtrsm
Calling reference xTRSM routine...
Calling clblas xTRSM routine...
OpenCL error -11 on line 228
test-short: /home/nathan/clBLAS/src/library/blas/xtrsm.cc:228: void makeKernel( cl_kernel, cl_command_queue, const char, const char_, const unsigned char, size_t, const char_): Assertion `false' failed.
Aborted (core dumped)
nathan@amdrig6:~/clBLAS/src/build/library$

After this I decided to give the pre-compiled binaries a try. I used the 2.8.0 binaries for Linux. This is the result from that:

nathan@amdrig6://opt/clBLAS-2.8.0-Linux-x64/bin$ ./test-short
Initialize OpenCL and clblas...
---- Advanced Micro Devices, Inc.
SetUp: about to create command queues

Test environment:

Device name: Tahiti
Device vendor: Advanced Micro Devices, Inc.
Platform (bit): Linux
clblas version: 2.8.0
Driver version: 1800.8 (VM)
Device version: OpenCL 1.2 AMD-APP (1800.8)
Global mem size: 3036 MB

[==========] Running 10096 tests from 125 test cases.
[----------] Global test environment set-up.
[----------] 4 tests from TRSM_extratest
[ RUN      ] TRSM_extratest.strsm
Calling reference xTRSM routine...
Calling clblas xTRSM routine... Done
[       OK ] TRSM_extratest.strsm (296 ms)
[ RUN      ] TRSM_extratest.dtrsm
Calling reference xTRSM routine...
Calling clblas xTRSM routine...
=== Build log ===
Error: aclBinary init failure

OpenCL error -11 on line 187
Segmentation fault (core dumped)
nathan@amdrig6://opt/clBLAS-2.8.0-Linux-x64/bin$
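
A note on the number itself: in the Khronos OpenCL headers, -11 is CL_BUILD_PROGRAM_FAILURE, which would fit the "Build log" output above, where the kernel fails to build before any math runs. The lookup below is only an illustrative Python sketch covering the three status codes that show up in this thread (-11, -36, -52); the helper name describe_cl_error is made up here and is not part of clBLAS.

```python
# Names and values as defined in the Khronos OpenCL header CL/cl.h.
# Only the codes seen in this thread are listed.
OPENCL_ERRORS = {
    -11: "CL_BUILD_PROGRAM_FAILURE",
    -36: "CL_INVALID_COMMAND_QUEUE",
    -52: "CL_INVALID_KERNEL_ARGS",
}


def describe_cl_error(code):
    """Return the symbolic name for a numeric OpenCL status, if listed above."""
    return OPENCL_ERRORS.get(code, f"unknown OpenCL status {code}")


for code in (-11, -36, -52):
    print(code, describe_cl_error(code))
```

Note that -36 is reported by a test named ERROR.InvalidCommandQueue, so that particular status may be what the test provokes on purpose; the abort on the assertion is the unexpected part.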

I'm sort of crossing my fingers here hoping it's not some issue with the fglrx drivers because it took me a long time just to find a driver version that could reliably perform clinfo.

dpkg -l fglrx fglrx-core fglrx-dev fglrx-amdcccle:

nathan@amdrig6://opt/clBLAS-2.8.0-Linux-x64/bin$ dpkg -l fglrx fglrx-core fglrx-dev fglrx-amdcccle
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name            Version      Architecture Description
+++-===============-============-============-====================================
rc  fglrx           2:15.200-0ub amd64        Video driver for the AMD graphics ac
un  fglrx-amdcccle                            (no description available)
rc  fglrx-core      2:14.501-0ub amd64        Minimal video driver for the AMD gra
un  fglrx-dev                                 (no description available)
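
Reading the two-letter prefixes in that listing against its own legend: "rc" is Remove plus Conf-files (the package was removed and only its configuration files remain) and "un" is Unknown plus Not installed. So this listing may not describe the driver that is actually loaded. A small Python sketch of that decoding, with the letter tables taken from the legend and the function name invented for illustration:

```python
# Letter meanings follow the legend printed at the top of `dpkg -l` output.
# "Not" and "Inst" from the legend are spelled out here for readability.
DESIRED = {"u": "Unknown", "i": "Install", "r": "Remove", "p": "Purge", "h": "Hold"}
STATUS = {
    "n": "Not installed",
    "i": "Installed",
    "c": "Conf-files",
    "u": "Unpacked",
    "f": "halF-conf",
    "h": "Half-inst",
    "w": "trig-aWait",
    "t": "Trig-pend",
}


def decode_dpkg_flags(flags):
    """Decode the first two characters of a `dpkg -l` row, e.g. 'rc' or 'ii'."""
    desired, status = flags[0].lower(), flags[1].lower()
    return DESIRED[desired], STATUS[status]


for flags in ("rc", "un", "ii"):
    print(flags, decode_dpkg_flags(flags))
```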

nathan-sixnines commented 8 years ago

Maybe this is actually similar to https://github.com/clMathLibraries/clBLAS/issues/207

This is the output from ./test-functional and sudo ./test-functional:

nathan@amdrig6://home/nathan/clBLAS/src/tests/staging$ ./test-functional
Initialize OpenCL and clblas...
---- Advanced Micro Devices, Inc.
SetUp: about to create command queues
[==========] Running 715 tests from 5 test cases.
[----------] Global test environment set-up.
[----------] 203 tests from ERROR
[ RUN      ] ERROR.InvalidCommandQueue
OpenCL error -36 on line 392 of /home/nathan/clBLAS/src/library/blas/xgemm.cc
test-functional: /home/nathan/clBLAS/src/library/blas/xgemm.cc:392: clblasStatus clblasGemm(clblasOrder, clblasTranspose, clblasTranspose, size_t, size_t, size_t, Precision, cl_mem, size_t, size_t, cl_mem, size_t, size_t, Precision, cl_mem, size_t, size_t, cl_uint, _cl_command_queue**, cl_uint, _cl_event* const*, _cl_event**) [with Precision = float; clblasStatus = clblasStatus_; clblasOrder = clblasOrder_; clblasTranspose = clblasTranspose_; size_t = long unsigned int; cl_mem = _cl_mem*; cl_uint = unsigned int; cl_command_queue = _cl_command_queue*; cl_event = _cl_event*]: Assertion `false' failed.
Aborted (core dumped)
nathan@amdrig6://home/nathan/clBLAS/src/tests/staging$ sudo ./test-functional
Initialize OpenCL and clblas...
---- Advanced Micro Devices, Inc.
SetUp: about to create command queues
[==========] Running 715 tests from 5 test cases.
[----------] Global test environment set-up.
[----------] 203 tests from ERROR
[ RUN      ] ERROR.InvalidCommandQueue
OpenCL error -36 on line 392 of /home/nathan/clBLAS/src/library/blas/xgemm.cc
test-functional: /home/nathan/clBLAS/src/library/blas/xgemm.cc:392: clblasStatus clblasGemm(clblasOrder, clblasTranspose, clblasTranspose, size_t, size_t, size_t, Precision, cl_mem, size_t, size_t, cl_mem, size_t, size_t, Precision, cl_mem, size_t, size_t, cl_uint, _cl_command_queue**, cl_uint, _cl_event* const*, _cl_event**) [with Precision = float; clblasStatus = clblasStatus_; clblasOrder = clblasOrder_; clblasTranspose = clblasTranspose_; size_t = long unsigned int; cl_mem = _cl_mem*; cl_uint = unsigned int; cl_command_queue = _cl_command_queue*; cl_event = _cl_event*]: Assertion `false' failed.
nathan@amdrig6://home/nathan/clBLAS/src/tests/staging$

mpekalski commented 8 years ago

Regarding clinfo, what do you mean by reliably?

I have a slightly different GPU, but when I installed the latest driver (15.12) and the AMD APP SDK, clinfo did not work because it was picking up libamdocl12cl64.so from the SDK instead of the one provided with the driver. Renaming the one in the SDK folder solved the issue.
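
One way to see whether more than one copy of that runtime library is visible is to list the matching files in the directories the loader searches. The Python sketch below is only an illustration: it assumes the SDK's library directory is reachable through LD_LIBRARY_PATH, which may not be how it is set up on a given machine, and both function names are made up.

```python
import os
from pathlib import Path


def find_runtime_copies(search_dirs, pattern="libamdocl*.so"):
    """Return files matching `pattern` directly inside each existing directory."""
    found = []
    for directory in search_dirs:
        base = Path(directory)
        if base.is_dir():
            found.extend(sorted(base.glob(pattern)))
    return found


def library_path_dirs():
    """Directories listed in LD_LIBRARY_PATH, in search order (may be empty)."""
    return [d for d in os.environ.get("LD_LIBRARY_PATH", "").split(":") if d]


# Assumption: the shadowing copy is reachable through LD_LIBRARY_PATH.
for path in find_runtime_copies(library_path_dirs()):
    print(path)
```

Because the pattern requires the name to end in .so, a copy renamed to something like libamdocl12cl64.so_old no longer shows up in the list.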

nathan-sixnines commented 8 years ago

With other versions of the fglrx driver, I was getting segmentation faults when I tried to run OpenCL programs, and clinfo would work the first time I ran it after a reboot, but every subsequent run would segmentation fault until the next reboot.

The understanding that I came to was that the drivers from the AMD page originally worked on Ubuntu 14.04, but at some point a kernel update broke the drivers for many users.

The version of the driver in the apt-get repository for Ubuntu is 15.200, which if I understand correctly is a higher version number than 15.12, so I think that the Ubuntu community has developed a new version of the driver that works with their new kernel.

So, once I started using that driver from the Ubuntu repository, I could run OpenCL programs and use clinfo without a segmentation fault.

I'm not really sure whether my understanding of what happened is correct; I'm still trying to figure out the situation with the AMD drivers and Linux myself.

For example, I have Ubuntu 14.04 and the AMD page says that they have tested my card, but I don't think they did that testing with the 3.19 kernel, which is what I have.

mpekalski commented 8 years ago

From what I read (I do not remember where), after upgrading the kernel you need to reinstall fglrx every single time.

Regarding the drivers from apt-get, those would be the open-source ones; I am talking about the proprietary drivers from AMD's website. For 15.12 from the website, if you run dpkg -l fglrx fglrx-core fglrx-dev fglrx-amdcccle the output actually shows

Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                      Version           Architecture      Description
+++-=========================-=================-=================-========================================================
ii  fglrx                     2:15.302-0ubuntu1 amd64             Video driver for the AMD graphics accelerators
ii  fglrx-amdcccle            2:15.302-0ubuntu1 amd64             Catalyst Control Center for the AMD graphics accelerator
ii  fglrx-core                2:15.302-0ubuntu1 amd64             Minimal video driver for the AMD graphics accelerators
ii  fglrx-dev                 2:15.302-0ubuntu1 amd64             Video driver for the AMD graphics accelerators (devel fi

and

clinfo shows 1912.5

Number of platforms:                 1
  Platform Profile:              FULL_PROFILE
  Platform Version:              OpenCL 2.0 AMD-APP (1912.5)
  Platform Name:                 AMD Accelerated Parallel Processing
  Platform Vendor:               Advanced Micro Devices, Inc.
  Platform Extensions:               cl_khr_icd cl_amd_event_callback cl_amd_offline_devices 
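
Judging by the dpkg listing above, the release name (15.12) and the package version (15.302) follow different numbering schemes, so 15.200 from the earlier listing is better compared against 15.302 than against 15.12. A rough Python sketch of that comparison follows; it is a simplification that only handles purely numeric upstream versions and is not the full Debian version-ordering algorithm, and the function name is invented here.

```python
def parse_dpkg_version(version):
    """Split 'epoch:upstream-revision' into (epoch, tuple of upstream numbers).

    Simplified: assumes the upstream part is dot-separated integers, which
    holds for the fglrx versions quoted in this thread.
    """
    epoch, _, rest = version.rpartition(":")
    upstream = rest.split("-", 1)[0]
    return int(epoch or 0), tuple(int(part) for part in upstream.split("."))


# Version strings copied from the two dpkg listings in this thread
# (the first one is truncated in the original output).
repo_package = parse_dpkg_version("2:15.200-0ub")
website_package = parse_dpkg_version("2:15.302-0ubuntu1")

print(repo_package, website_package)
print("repository package is older:", repo_package < website_package)
```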

I would suggest trying those from the website and renaming libamdocl12cl64.so so it does not get picked up by clinfo or anything else.

sudo mv /opt/AMDAPPSDK-3.0/lib/x86_64/libamdocl12cl64.so /opt/AMDAPPSDK-3.0/lib/x86_64/libamdocl12cl64.so_old

TimmyLiu commented 8 years ago

Hi, PR #214 should fix the test-functional failures. Can you try it out?

nathan-sixnines commented 8 years ago

Ok. Now I'm trying it with this driver: http://support.amd.com/en-us/download/desktop?os=Linux+x86_64, which is also the 15.302 driver.

At first this had the same problem I was having before with clinfo segmentation faulting, but renaming libamdocl12cl64.so fixed that.

A couple of basic cl examples in the AMDAPPSDK work, like helloworld and basicdebug, but the advanced-convolution sample still segmentation faults. Additionally, in the cpp_cl section the FFT sample segmentation faults.

I don't know if I tested the advanced-convolution with 15.200 but I remember testing the FFT and that worked before.

Given that the cpp_cl FFT is segmentation faulting, it seems very unlikely that it would pass any of the clBLAS tests, but I tried building the new develop branch anyway. Curiously, after building the branch, the staging folder only has clBLAS-tune, make-ktest, t_blkmul, t_dblock_kgen, t_gens_cache and t_tilemul, and none of the other tests, even though BUILD_TEST is set to ON in the CMakeLists.

Anyway, I don't think I'm likely to get anywhere with clBLAS as long as I'm having trouble just getting cl to work. Not sure if I should go back to 15.200 or try something else with the AMD drivers.
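
On the missing test binaries: a default set in CMakeLists.txt does not override a value that is already stored in the build directory's CMakeCache.txt, so it can be worth checking what was actually cached. The Python sketch below parses the cache's NAME:TYPE=VALUE lines; the function name is made up, and whether a stale cache is the cause here is only a guess.

```python
def read_cmake_cache(text):
    """Parse CMakeCache.txt content into {name: value}, skipping comments."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        # Comment lines in the cache start with '#' or '//'.
        if not line or line.startswith(("#", "//")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        name = key.split(":", 1)[0]  # drop the ':TYPE' suffix
        entries[name] = value
    return entries


# Hypothetical cache excerpt, for illustration only.
sample = """\
// Build the test programs
BUILD_TEST:BOOL=OFF
CMAKE_BUILD_TYPE:STRING=Release
"""
print(read_cmake_cache(sample).get("BUILD_TEST"))
```

For a real build directory the same function can be fed the contents of its CMakeCache.txt file.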

mpekalski commented 8 years ago

I would suggest you first try building and testing clRNG (https://github.com/clMathLibraries/clRNG) and clFFT (https://github.com/clMathLibraries/clFFT), as they do not have as many dependencies as clBLAS.

nathan-sixnines commented 8 years ago

I could have sworn that I saw it clear the tests completely with this version but I seem to be running into a failure on THREAD.dgemm pretty consistently now, so I'm not so sure.

[ RUN      ] THREAD.dgemm
m : 0    n: 0
/home/nathan/Downloads/clBLAS/src/tests/include/matrix.h:327: Failure
The difference between a and b is 100999564502815, which exceeds delta, where
a evaluates to 100999564185861,
b evaluates to -316954, and
delta evaluates to 0.
m : 0    n: 0
/home/nathan/Downloads/clBLAS/src/tests/include/matrix.h:327: Failure
The difference between a and b is 7271968644202680, which exceeds delta, where
a evaluates to 100999564185861,
b evaluates to -7170969080016819, and
delta evaluates to 0.
[  FAILED  ] THREAD.dgemm (1031 ms)

Right now that seems to be the only test-functional test that is failing.
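
The failure text has the shape of Google Test's near-comparison message: it prints |a - b| and fails when that exceeds delta. With delta at 0 any mismatch fails, although values of this magnitude look more like stale or uninitialized data than rounding error (that part is a guess). The arithmetic for the first reported failure, as a short Python sketch with a made-up helper name:

```python
def exceeds_delta(a, b, delta):
    """True when the absolute difference |a - b| is larger than delta."""
    return abs(a - b) > delta


# Values copied from the THREAD.dgemm failure quoted above.
a = 100999564185861
b = -316954
delta = 0

print(abs(a - b))                   # the difference the log reports
print(exceeds_delta(a, b, delta))   # whether the check fails
```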

TimmyLiu commented 8 years ago

@aeium and @mpekalski Are you still experiencing test-functional fails? I ran test-functional and they all seem to pass.

mpekalski commented 8 years ago

I have just cloned the develop repo and built clBLAS; test-functional still fails.

Twice I get a message about an invalid size of X:

[ RUN      ] ERROR.InvalidMemObjectnrm2
Invalid Size for X
[       OK ] ERROR.InvalidMemObjectnrm2 (3 ms)
[ RUN      ] ERROR.InvalidValuenrm2
Invalid Size for X
[       OK ] ERROR.InvalidValuenrm2 (5 ms)

Further, as I reported in the last comment in https://github.com/clMathLibraries/clBLAS/issues/207:

[----------] 120 tests from THREAD
[ RUN      ] THREAD.sgemm
[       OK ] THREAD.sgemm (1007 ms)
[ RUN      ] THREAD.cgemm
m : 0    n: 0
/home/marcin/Downloads/clBLAS/src/tests/include/matrix.h:397: Failure
The difference between ((a).s[0]) and ((b).s[0]) is 14936045, which exceeds delta, where
((a).s[0]) evaluates to -163510,
((b).s[0]) evaluates to 14772535, and
delta evaluates to 0.
m : 0    n: 0
/home/marcin/Downloads/clBLAS/src/tests/include/matrix.h:397: Failure
The difference between ((a).s[0]) and ((b).s[0]) is 163485, which exceeds delta, where
((a).s[0]) evaluates to -163510,
((b).s[0]) evaluates to -25, and
delta evaluates to 0.
[  FAILED  ] THREAD.cgemm (1010 ms)
[ RUN      ] THREAD.dgemm
m : 0    n: 0
/home/marcin/Downloads/clBLAS/src/tests/include/matrix.h:327: Failure
The difference between a and b is 7271968644202680, which exceeds delta, where
a evaluates to 100999564185861,
b evaluates to -7170969080016819, and
delta evaluates to 0.
m : 0    n: 0
/home/marcin/Downloads/clBLAS/src/tests/include/matrix.h:327: Failure
The difference between a and b is 100999564502815, which exceeds delta, where
a evaluates to 100999564185861,
b evaluates to -316954, and
delta evaluates to 0.
[  FAILED  ] THREAD.dgemm (1006 ms)
[ RUN      ] THREAD.zgemm
m : 0    n: 0
/home/marcin/Downloads/clBLAS/src/tests/include/matrix.h:472: Failure
The difference between ((a).s[0]) and ((b).s[0]) is 1.2340888881405624e+18, which exceeds delta, where
((a).s[0]) evaluates to -18416369580656,
((b).s[0]) evaluates to -1.234107304510143e+18, and
delta evaluates to 0.
m : 0    n: 0
/home/marcin/Downloads/clBLAS/src/tests/include/matrix.h:472: Failure
The difference between ((a).s[0]) and ((b).s[0]) is 18416368935676, which exceeds delta, where
((a).s[0]) evaluates to -18416369580656,
((b).s[0]) evaluates to -644980, and
delta evaluates to 0.
m : 0    n: 0
/home/marcin/Downloads/clBLAS/src/tests/include/matrix.h:472: Failure
The difference between ((a).s[0]) and ((b).s[0]) is 18416368935676, which exceeds delta, where
((a).s[0]) evaluates to -18416369580656,
((b).s[0]) evaluates to -644980, and
delta evaluates to 0.
[  FAILED  ] THREAD.zgemm (1047 ms)
[ RUN      ] THREAD.strmm
[       OK ] THREAD.strmm (1006 ms)
[ RUN      ] THREAD.ctrmm
[       OK ] THREAD.ctrmm (1010 ms)
[ RUN      ] THREAD.dtrmm
[       OK ] THREAD.dtrmm (1008 ms)
[ RUN      ] THREAD.ztrmm
[       OK ] THREAD.ztrmm (1054 ms)
[ RUN      ] THREAD.strsm
[       OK ] THREAD.strsm (1007 ms)
[ RUN      ] THREAD.ctrsm
[       OK ] THREAD.ctrsm (1010 ms)
[ RUN      ] THREAD.dtrsm
OpenCL error -52 on line 1038
test-functional: /home/marcin/Downloads/clBLAS/src/library/blas/xtrsm.cc:1038: cl_int diag_dtrtri128(cl_command_queue, int, clblasUplo, clblasDiag, cl_mem, size_t, cl_mem, size_t, int, int, _cl_event**): Assertion `false' failed.
Aborted (core dumped)

After I made a change in src/library/blas/xtrsm.cc (https://github.com/mpekalski/clBLAS/pull/1/files) similar to what you have done in https://github.com/clMathLibraries/clBLAS/pull/214, it fails, but at least it does not crash. And I still get the message about an invalid size of X.

[----------] 120 tests from THREAD
[ RUN      ] THREAD.sgemm
m : 0    n: 0
/home/marcin/Downloads/clBLAS/src/tests/include/matrix.h:327: Failure
The difference between a and b is 90788, which exceeds delta, where
a evaluates to -90792,
b evaluates to -4, and
delta evaluates to 0.
m : 0    n: 0
/home/marcin/Downloads/clBLAS/src/tests/include/matrix.h:327: Failure
The difference between a and b is 6536736, which exceeds delta, where
a evaluates to -90792,
b evaluates to 6445944, and
delta evaluates to 0.
[  FAILED  ] THREAD.sgemm (1012 ms)
[ RUN      ] THREAD.cgemm
m : 0    n: 0
/home/marcin/Downloads/clBLAS/src/tests/include/matrix.h:397: Failure
The difference between ((a).s[0]) and ((b).s[0]) is 163485, which exceeds delta, where
((a).s[0]) evaluates to -163510,
((b).s[0]) evaluates to -25, and
delta evaluates to 0.
m : 0    n: 0
/home/marcin/Downloads/clBLAS/src/tests/include/matrix.h:397: Failure
The difference between ((a).s[0]) and ((b).s[0]) is 14936045, which exceeds delta, where
((a).s[0]) evaluates to -163510,
((b).s[0]) evaluates to 14772535, and
delta evaluates to 0.
m : 0    n: 0
/home/marcin/Downloads/clBLAS/src/tests/include/matrix.h:397: Failure
The difference between ((a).s[0]) and ((b).s[0]) is 163485, which exceeds delta, where
((a).s[0]) evaluates to -163510,
((b).s[0]) evaluates to -25, and
delta evaluates to 0.
m : 0    n: 0
/home/marcin/Downloads/clBLAS/src/tests/include/matrix.h:397: Failure
The difference between ((a).s[0]) and ((b).s[0]) is 14936045, which exceeds delta, where
((a).s[0]) evaluates to -163510,
((b).s[0]) evaluates to 14772535, and
delta evaluates to 0.
[  FAILED  ] THREAD.cgemm (1022 ms)
[ RUN      ] THREAD.dgemm
m : 0    n: 0
/home/marcin/Downloads/clBLAS/src/tests/include/matrix.h:327: Failure
The difference between a and b is 7271968644202680, which exceeds delta, where
a evaluates to 100999564185861,
b evaluates to -7170969080016819, and
delta evaluates to 0.
m : 0    n: 0
/home/marcin/Downloads/clBLAS/src/tests/include/matrix.h:327: Failure
The difference between a and b is 100999564502815, which exceeds delta, where
a evaluates to 100999564185861,
b evaluates to -316954, and
delta evaluates to 0.
m : 0    n: 0
/home/marcin/Downloads/clBLAS/src/tests/include/matrix.h:327: Failure
The difference between a and b is 100999564502815, which exceeds delta, where
a evaluates to 100999564185861,
b evaluates to -316954, and
delta evaluates to 0.
m : 0    n: 0
/home/marcin/Downloads/clBLAS/src/tests/include/matrix.h:327: Failure
The difference between a and b is 7271968644202680, which exceeds delta, where
a evaluates to 100999564185861,
b evaluates to -7170969080016819, and
delta evaluates to 0.
[  FAILED  ] THREAD.dgemm (1014 ms)
[ RUN      ] THREAD.zgemm
m : 0    n: 0
/home/marcin/Downloads/clBLAS/src/tests/include/matrix.h:472: Failure
The difference between ((a).s[0]) and ((b).s[0]) is 18416368935676, which exceeds delta, where
((a).s[0]) evaluates to -18416369580656,
((b).s[0]) evaluates to -644980, and
delta evaluates to 0.
m : 0    n: 0
/home/marcin/Downloads/clBLAS/src/tests/include/matrix.h:472: Failure
The difference between ((a).s[0]) and ((b).s[0]) is 18416368935676, which exceeds delta, where
((a).s[0]) evaluates to -18416369580656,
((b).s[0]) evaluates to -644980, and
delta evaluates to 0.
m : 0    n: 0
/home/marcin/Downloads/clBLAS/src/tests/include/matrix.h:472: Failure
The difference between ((a).s[0]) and ((b).s[0]) is 10702371064342262, which exceeds delta, where
((a).s[0]) evaluates to -18416369580656,
((b).s[0]) evaluates to 10683954694761606, and
delta evaluates to 0.
m : 0    n: 0
/home/marcin/Downloads/clBLAS/src/tests/include/matrix.h:472: Failure
The difference between ((a).s[0]) and ((b).s[0]) is 18416368935676, which exceeds delta, where
((a).s[0]) evaluates to -18416369580656,
((b).s[0]) evaluates to -644980, and
delta evaluates to 0.
m : 0    n: 0
/home/marcin/Downloads/clBLAS/src/tests/include/matrix.h:472: Failure
The difference between ((a).s[0]) and ((b).s[0]) is 1.2340888881405624e+18, which exceeds delta, where
((a).s[0]) evaluates to -18416369580656,
((b).s[0]) evaluates to -1.234107304510143e+18, and
delta evaluates to 0.
[  FAILED  ] THREAD.zgemm (1086 ms)

It passes all the other tests:

[----------] Global test environment tear-down
[==========] 715 tests from 5 test cases ran. (235479 ms total)
[  PASSED  ] 711 tests.
[  FAILED  ] 4 tests, listed below:
[  FAILED  ] THREAD.sgemm
[  FAILED  ] THREAD.cgemm
[  FAILED  ] THREAD.dgemm
[  FAILED  ] THREAD.zgemm
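
For long runs like this, the failing test names can be pulled out of a saved log instead of scrolling through it. The Python sketch below matches the per-test "[  FAILED  ] name (N ms)" lines in the output format shown above; the function name is invented for illustration.

```python
import re

# Matches per-test result lines such as "[  FAILED  ] THREAD.sgemm (1012 ms)".
# Summary lines without a timing, and "N tests, listed below:", do not match.
FAILED_LINE = re.compile(r"^\[\s+FAILED\s+\]\s+(\S+)\s+\(\d+ ms\)\s*$")


def failed_tests(log_text):
    """Return failed test names in order of first appearance, without repeats."""
    names = []
    for line in log_text.splitlines():
        match = FAILED_LINE.match(line)
        if match and match.group(1) not in names:
            names.append(match.group(1))
    return names


# Lines copied from the output above.
sample_log = """\
[ RUN      ] THREAD.sgemm
[  FAILED  ] THREAD.sgemm (1012 ms)
[ RUN      ] THREAD.strmm
[       OK ] THREAD.strmm (1006 ms)
[  FAILED  ] 4 tests, listed below:
[  FAILED  ] THREAD.sgemm
"""
print(failed_tests(sample_log))
```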

Gijom commented 8 years ago

Hello, just to mention that I have very similar problems with the current dev branch:

./test-functional --gtest_filter=THREAD.zgemm
Initialize OpenCL and clblas...
---- Advanced Micro Devices, Inc.
SetUp: about to create command queues
Note: Google Test filter = THREAD.zgemm
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from THREAD
[ RUN      ] THREAD.zgemm
m : 0    n: 0
/home/chanel/src/extern/clBLAS/src/tests/include/matrix.h:472: Failure
The difference between ((a).s[0]) and ((b).s[0]) is 10702371064342262, which exceeds delta, where
((a).s[0]) evaluates to -18416369580656,
((b).s[0]) evaluates to 10683954694761606, and
delta evaluates to 0.
m : 0    n: 0
/home/chanel/src/extern/clBLAS/src/tests/include/matrix.h:472: Failure
The difference between ((a).s[0]) and ((b).s[0]) is 18416368935676, which exceeds delta, where
((a).s[0]) evaluates to -18416369580656,
((b).s[0]) evaluates to -644980, and
delta evaluates to 0.
[  FAILED  ] THREAD.zgemm (1066 ms)
[----------] 1 test from THREAD (1066 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (1067 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] THREAD.zgemm

 1 FAILED TEST

And also for THREAD.cgemm

./test-functional --gtest_filter=THREAD.cgemm
Initialize OpenCL and clblas...
---- Advanced Micro Devices, Inc.
SetUp: about to create command queues
Note: Google Test filter = THREAD.cgemm
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from THREAD
[ RUN      ] THREAD.cgemm
m : 0    n: 0
/home/chanel/src/extern/clBLAS/src/tests/include/matrix.h:397: Failure
The difference between ((a).s[0]) and ((b).s[0]) is 163485, which exceeds delta, where
((a).s[0]) evaluates to -163510,
((b).s[0]) evaluates to -25, and
delta evaluates to 0.
m : 0    n: 0
/home/chanel/src/extern/clBLAS/src/tests/include/matrix.h:397: Failure
The difference between ((a).s[0]) and ((b).s[0]) is 71204732598, which exceeds delta, where
((a).s[0]) evaluates to -163510,
((b).s[0]) evaluates to 71204569088, and
delta evaluates to 0.
m : 0    n: 0
/home/chanel/src/extern/clBLAS/src/tests/include/matrix.h:397: Failure
The difference between ((a).s[0]) and ((b).s[0]) is 163485, which exceeds delta, where
((a).s[0]) evaluates to -163510,
((b).s[0]) evaluates to -25, and
delta evaluates to 0.
m : 0    n: 0
/home/chanel/src/extern/clBLAS/src/tests/include/matrix.h:397: Failure
The difference between ((a).s[0]) and ((b).s[0]) is 163485, which exceeds delta, where
((a).s[0]) evaluates to -163510,
((b).s[0]) evaluates to -25, and
delta evaluates to 0.
[  FAILED  ] THREAD.cgemm (1058 ms)
[----------] 1 test from THREAD (1058 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (1058 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] THREAD.cgemm

 1 FAILED TEST

However, this is not always the case; these errors occur in roughly one run out of two.
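
To put a number on an intermittent failure like this, the same filtered test can be run repeatedly and the non-zero exit codes counted. This is a minimal Python sketch with a made-up helper name; the command is the one quoted above and the run count is arbitrary.

```python
import subprocess


def count_failures(cmd, runs):
    """Run `cmd` `runs` times and return how many runs exited non-zero.

    A run killed by a signal (for example a segmentation fault) has a
    negative return code and is counted as a failure too.
    """
    failures = 0
    for _ in range(runs):
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        if result.returncode != 0:
            failures += 1
    return failures


# Example call, to be run from the directory holding the test binary:
# print(count_failures(["./test-functional", "--gtest_filter=THREAD.zgemm"], 20))
```

Google Test also has a --gtest_repeat flag, but since a crash ends the whole process, separate runs may be easier to count.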

Finally, test-short and test-correctness end with a SIGSEGV in similar situations:

.
.
Here come some matrix.h warnings similar to those above and some failed tests (with many more successful tests)
.
.
>> Test is skipped because it has no importance for this level of coverage
[       OK ] ColumnMajor_SmallRange/GEMM.sgemm/70 (0 ms)
[ RUN      ] ColumnMajor_SmallRange/GEMM.sgemm/71
             seed = 12345, queues = 1, clblasColumnMajor, clblasConjTrans, clblasConjTrans, M = 128, N = 128, K = 128, offA = 0, offB = 0, offC = 0, lda = 128, ldb = 128, ldc = 128
>> Test is skipped because it has no importance for this level of coverage
[       OK ] ColumnMajor_SmallRange/GEMM.sgemm/71 (0 ms)
[ RUN      ] ColumnMajor_SmallRange/GEMM.dgemm/0
             seed = 12345, queues = 1, clblasColumnMajor, clblasNoTrans, clblasNoTrans, M = 63, N = 63, K = 63, offA = 0, offB = 0, offC = 0, lda = 63, ldb = 63, ldc = 63
[1]    13677 segmentation fault (core dumped)  LD_LIBRARY_PATH=/opt/acml5.3.1/ifort64/lib:/opt/clBLAS/lib64 ./test-short

guacamoleo commented 8 years ago

We have just merged in a fix to the develop branch which should fix all GEMM thread safety issues; please test and re-issue bug if not resolved.

Gijom commented 8 years ago

Thanks, the THREAD-related issues are indeed solved, but the "matrix.h" issue and the SEGV remain. I will open a new bug report.