The GEMM kernel is not generated at run time.
Its source code lives in src/library/blas/AutoGemm and is generated ahead of time by a Python script.
Most other kernels (e.g. GEMV) are generated on the fly.
On Thu, Feb 25, 2016 at 11:55 AM, Mateja Putic notifications@github.com wrote:
I'd like to configure clBLAS to dump OpenCL kernels as they are generated. I set BLAS_DUMP_CLBLAS_KERNELS and BLAS_KEEP_KERNEL_SOURCES in CMakeCache.txt, in addition to BUILD_SAMPLE, and I did a build.
Then I ran
LD_LIBRARY_PATH=/path/to/libclBLAS.so.2 ./samples/example_sgemm
to make sure it's referencing the library that was just built. However, I don't see any output files.
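For reference, these cache variables can also be set at configure time instead of editing CMakeCache.txt by hand. A minimal sketch, assuming an out-of-tree build; the flag names come from this thread, but the paths are placeholders:

```shell
# Configure clBLAS with kernel dumping enabled, then build.
# /path/to/clBLAS/src is a placeholder for the actual source checkout.
cmake -DBLAS_DUMP_CLBLAS_KERNELS=ON \
      -DBLAS_KEEP_KERNEL_SOURCES=ON \
      -DBUILD_SAMPLE=ON \
      /path/to/clBLAS/src
make
```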
Searching through the codebase, it seems that the function that is supposed to dump kernels is the kernelDump https://github.com/clMathLibraries/clBLAS/blob/9731ea2a270509211a47bf6cf9df4de2069ccc52/src/library/blas/generic/kdump.c#L120 function, which gets called from the enqueueKernel https://github.com/clMathLibraries/clBLAS/blob/9731ea2a270509211a47bf6cf9df4de2069ccc52/src/library/blas/generic/solution_seq.c#L114 function. However, I don't see where the enqueueKernel function gets called.
As an additional step, I ran samples/example_sgemm under gdb and set breakpoints on enqueueKernel and dumpKernel, but they were never hit.
Is there anything I'm missing to enable dumping of the OpenCL kernels?
— Reply to this email directly or view it on GitHub https://github.com/clMathLibraries/clBLAS/issues/232.
Tingxing dong
Thank you for your reply. How can I use the Python script to generate the gemm kernel?
No, you do not have to generate it yourself.
Read through that folder and you will see what is going on.
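To make the idea concrete, here is a toy sketch of what a template-based kernel generator does: it instantiates a kernel-source template once per parameter combination. This is illustrative only; AutoGemm's real scripts are far more elaborate, and the template and names below are invented for this example.

```python
# Toy template-based GEMM kernel generator (illustrative, not clBLAS code).
KERNEL_TEMPLATE = """__kernel void sgemm_{tile_m}x{tile_n}(
    __global float *C, __global const float *A, __global const float *B,
    uint M, uint N, uint K)
{{
    /* each work-group would compute a {tile_m}x{tile_n} tile of C */
}}
"""

def generate_kernel(tile_m, tile_n):
    """Instantiate the kernel template for one tile configuration."""
    return KERNEL_TEMPLATE.format(tile_m=tile_m, tile_n=tile_n)

if __name__ == "__main__":
    # Emit one variant; a real generator loops over many configurations.
    print(generate_kernel(16, 16))
```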
I see that there are many kernels already generated in UserGemmKernelSources. How are these kernels selected by the runtime based on the dimensions of the workload?
For example, if I were to call clblasSgemm with N = M = K = 1024, which one of these kernels would be compiled? How does clBLAS make these selections at runtime?
Look at AutoGemmParameter.py.
If you want to go deep, you need to read the AutoGemm folder thoroughly.
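As a rough mental model of dimension-based selection, here is an illustrative sketch (not clBLAS's actual logic): prefer a pre-generated kernel whose tile size evenly divides the problem dimensions, and fall back to a generic kernel otherwise. The tile sizes and kernel names below are hypothetical.

```python
# Illustrative tile-based kernel selection; the catalogue is invented.
AVAILABLE_TILES = [(16, 16), (8, 8), (4, 4)]  # largest tile first

def select_kernel(m, n, k):
    """Return the name of the kernel variant to compile for an m*n*k GEMM."""
    for tile_m, tile_n in AVAILABLE_TILES:
        # A tile that divides the output evenly needs no edge handling.
        if m % tile_m == 0 and n % tile_n == 0:
            return f"sgemm_tile_{tile_m}x{tile_n}"
    # Otherwise fall back to a generic kernel with boundary checks.
    return "sgemm_generic"

if __name__ == "__main__":
    # 1024 is divisible by 16, so the largest tile would be chosen.
    print(select_kernel(1024, 1024, 1024))
```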
OK, thank you.