clMathLibraries / clBLAS

a software library containing BLAS functions written in OpenCL
Apache License 2.0

Fix teardown #163

Closed hughperkins closed 8 years ago

hughperkins commented 8 years ago

Fixes issue https://github.com/clMathLibraries/clBLAS/issues/159

What is the issue?

What is the effect of the issue?

How does the fix work?

(Note that this also incorporates the pull request for 'array initialize must be an initializer list'; otherwise gemm doesn't get very far on my machine :-P )

hughperkins commented 8 years ago

Note: example output for the same source code as in https://github.com/clMathLibraries/clBLAS/issues/159#issuecomment-150896488 :

$ LD_LIBRARY_PATH=../build/library/ ./test
i=0
got platformids
got deviceids
created context
created commandqueue
setup blas ok
calling sgemm....

clblasSgemmEx result:
11 12 13 
21 35720 36680 
31 50720 52080 
41 65720 67480 
finished ok :-)
i=1
got platformids
got deviceids
created context
created commandqueue
setup blas ok
calling sgemm....

clblasSgemmEx result:
11 12 13 
21 35720 36680 
31 50720 52080 
41 65720 67480 
finished ok :-)
i=2
got platformids
got deviceids
created context
created commandqueue
setup blas ok
calling sgemm....

clblasSgemmEx result:
11 12 13 
21 35720 36680 
31 50720 52080 
41 65720 67480 
finished ok :-)
hughperkins commented 8 years ago

(Updated commit to call clReleaseKernel on the kernel first)
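
To illustrate the release order mentioned above, here is a minimal, self-contained sketch in C. It does not use the real OpenCL API and is not the code from this commit: `MockProgram`, `MockKernel`, and all `mock_*` functions are hypothetical stand-ins, and the assumption that a kernel holds one reference on its parent program is made purely for illustration.

```c
#include <assert.h>

/* Hypothetical stand-ins for OpenCL objects; NOT the real OpenCL API. */
typedef struct {
    int refcount;
    int alive;
} MockProgram;

typedef struct {
    MockProgram *program; /* parent program this kernel was created from */
    int refcount;
    int alive;
} MockKernel;

/* A freshly created program starts with one reference (the caller's). */
static void mock_program_init(MockProgram *p) {
    p->refcount = 1;
    p->alive = 1;
}

/* Drop one reference; the program is destroyed when none remain. */
static void mock_release_program(MockProgram *p) {
    assert(p->refcount > 0);
    if (--p->refcount == 0) {
        p->alive = 0;
    }
}

/* Assumption for this sketch: a kernel takes one reference on its program. */
static void mock_kernel_init(MockKernel *k, MockProgram *p) {
    k->program = p;
    k->refcount = 1;
    k->alive = 1;
    p->refcount++;
}

/* Drop one reference; a destroyed kernel gives back its program reference. */
static void mock_release_kernel(MockKernel *k) {
    assert(k->refcount > 0);
    if (--k->refcount == 0) {
        k->alive = 0;
        mock_release_program(k->program);
    }
}

/* Teardown in the order described above: kernel first, then program.
 * Returns 1 if both objects ended up destroyed, 0 otherwise. */
static int mock_teardown(MockKernel *k, MockProgram *p) {
    mock_release_kernel(k);
    mock_release_program(p);
    return !k->alive && !p->alive;
}
```

In this model, releasing only the program while a kernel is still alive leaves the program's count above zero, so nothing is freed until the kernel is released as well. Whether a real OpenCL implementation behaves this way at teardown is not something the sketch demonstrates.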

hughperkins commented 8 years ago

(I think this new commit has also addressed issue https://github.com/clMathLibraries/clBLAS/issues/166 , at least in combination with the fix identified in https://github.com/clMathLibraries/clBLAS/issues/167 )

guacamoleo commented 8 years ago

Hugh's concern and fix are both legit. I checked out his commit and validated that it passes the gemm tests.

hughperkins commented 8 years ago

Thanks!

hughperkins commented 8 years ago

Hi. I just realized that this fails to build the first time, since include/AutoGemmIncludes/AutoGemmClKernels.cpp isn't present yet, and init.c is compiled before the generation step.