Closed stevef1uk closed 6 months ago
Is it a CPU or GPU implementation? Generally, CPU-based OCL implementations aren't really tested, working, or helpful.
GPU I thought?
clinfo output:
(env) pi@rpi5a:/mnt/llm/pytorch_dlprim $ clinfo
Number of platforms                    1
  Platform Name                        Portable Computing Language
  Platform Vendor                      The pocl project
  Platform Version                     OpenCL 3.0 PoCL 5.1-pre main-0-g6c3c974b Linux, Debug+Asserts, RELOC, SPIR, LLVM 14.0.6, SLEEF, POCL_DEBUG
  Platform Profile                     FULL_PROFILE
  Platform Extensions                  cl_khr_icd cl_pocl_content_size
  Platform Extensions with Version     cl_khr_icd 0x400000 (1.0.0)
                                       cl_pocl_content_size 0x400000 (1.0.0)
  Platform Numeric Version             0xc00000 (3.0.0)
  Platform Extensions function suffix  POCL
  Platform Host timer resolution       0ns

  Platform Name                        Portable Computing Language
Number of devices                      1
  Device Name                          cpu-cortex-a76-0xd0b
  Device Vendor                        ARM
  Device Vendor ID                     0x13b5
  Device Version                       OpenCL 3.0 PoCL HSTR: cpu-aarch64-unknown-linux-gnu-cortex-a76
  Device Numeric Version               0xc00000 (3.0.0)
  Driver Version                       5.1-pre main-0-g6c3c974b
  Device OpenCL C Version              OpenCL C 1.2 PoCL
  Device OpenCL C all versions         OpenCL C
PoCL is not supported: its implementation is limited and the kernels aren't CPU-optimized.
Trying to use PyTorch with the OpenCL backend on an RPi 5 to use the GPU (probably far too slow, but anyway).
Built the backend as instructed. Running
python mnist.py --device=ocl:0
fails with:
  File "/mnt/llm/env/lib/python3.11/site-packages/torch/nn/modules/conv.py", line 456, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Failed to build program source sgemm with parameters -cl-std=CL2.0 -DTILE_SIZE_M=64 -DTILE_SIZE_N=64 -DBLOCK_SIZE_M=8 -DBLOCK_SIZE_N=8 -DTILE_SIZE_K=4 -DTILE_OFFSET=0 -DBIAS=0 -DATRANS=0 -DBTRANS=1 -DIM2COL_OCHAN=676 -DCONVGEMM=1 -DKERN_H=3 -DKERN_W=3 -DDILATE_H=1 -DDILATE_W=1 -DPAD_H=0 -DPAD_W=0 -DSTRIDE_H=1 -DSTRIDE_W=1 -DGROUPS=1 -DCHANNELS_IN=1 -DSRC_COLS=28 -DSRC_ROWS=28 -DIMG_COLS=26 -DIMG_ROWS=26 -DREDUCE_K=1 -DACTIVATION=0
log:
For device: cpu-cortex-a76-0xd0b
Build option -cl-std specified OpenCL C version 2.0,but device cpu doesn't support that OpenCL C version.
clBuildErrorCode: -11
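
For anyone hitting the same error: the log says the kernel was built with -cl-std=CL2.0, while clinfo reports "Device OpenCL C Version OpenCL C 1.2 PoCL" for this device. A rough sketch of that comparison in plain Python is below. The function names are made up for illustration and are not part of pytorch_dlprim or PoCL, and the simple "requested is higher than reported" rule is an assumption (OpenCL 3.0 devices can list several supported OpenCL C versions, which this does not handle).

```python
import re


def parse_cl_version(text):
    """Return (major, minor) from a string such as 'OpenCL C 1.2 PoCL'."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return int(match.group(1)), int(match.group(2))


def requested_cl_std(build_options):
    """Return the (major, minor) passed via -cl-std=CLx.y, or None if absent."""
    match = re.search(r"-cl-std=CL(\d+)\.(\d+)", build_options)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def build_would_be_rejected(build_options, device_opencl_c_version):
    """Hypothetical check: is the requested OpenCL C version newer than the
    single version string the device reports? This is a simplification."""
    requested = requested_cl_std(build_options)
    if requested is None:
        return False
    return requested > parse_cl_version(device_opencl_c_version)


if __name__ == "__main__":
    # Values copied from the error message and the clinfo output in this thread.
    options = "-cl-std=CL2.0 -DTILE_SIZE_M=64 -DTILE_SIZE_N=64"
    device_version = "OpenCL C 1.2 PoCL"
    print(build_would_be_rejected(options, device_version))
```

With the values from this thread the check reports a mismatch, which matches the build log: the device is the PoCL CPU device, not the GPU, and it does not accept OpenCL C 2.0.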