artyom-beilis / pytorch_dlprim

DLPrimitives/OpenCL out of tree backend for pytorch
http://blog.dlprimitives.org/
MIT License

Issue with OpenCL 3.0 #51

Closed: stevef1uk closed this issue 6 months ago

stevef1uk commented 6 months ago

I'm trying to use PyTorch with the OpenCL backend on a Raspberry Pi 5 to run on the GPU (probably far too slow, but anyway).

I built the backend as instructed. Running python mnist.py --device=ocl:0 fails with:

  File "/mnt/llm/env/lib/python3.11/site-packages/torch/nn/modules/conv.py", line 456, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Failed to build program source sgemm with parameters -cl-std=CL2.0 -DTILE_SIZE_M=64 -DTILE_SIZE_N=64 -DBLOCK_SIZE_M=8 -DBLOCK_SIZE_N=8 -DTILE_SIZE_K=4 -DTILE_OFFSET=0 -DBIAS=0 -DATRANS=0 -DBTRANS=1 -DIM2COL_OCHAN=676 -DCONVGEMM=1 -DKERN_H=3 -DKERN_W=3 -DDILATE_H=1 -DDILATE_W=1 -DPAD_H=0 -DPAD_W=0 -DSTRIDE_H=1 -DSTRIDE_W=1 -DGROUPS=1 -DCHANNELS_IN=1 -DSRC_COLS=28 -DSRC_ROWS=28 -DIMG_COLS=26 -DIMG_ROWS=26 -DREDUCE_K=1 -DACTIVATION=0
log:
For device: cpu-cortex-a76-0xd0b
Build option -cl-std specified OpenCL C version 2.0,but device cpu doesn't support that OpenCL C version.

clBuildErrorCode: -11
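For anyone reading the error above: code -11 is CL_BUILD_PROGRAM_FAILURE in the OpenCL headers, and the interesting part of the message is the -cl-std=CL2.0 option. A minimal sketch of pulling those two facts out of such a message follows. The helper name summarize_build_error is hypothetical and not part of pytorch_dlprim, and the error table is only a small subset.

```python
import re

# Small subset of OpenCL error codes as defined in the OpenCL headers (cl.h).
CL_ERRORS = {
    -3: "CL_COMPILER_NOT_AVAILABLE",
    -11: "CL_BUILD_PROGRAM_FAILURE",
    -43: "CL_INVALID_BUILD_OPTIONS",
}


def summarize_build_error(message):
    """Hypothetical helper: extract the requested -cl-std value and error code."""
    std = re.search(r"-cl-std=(CL\d+\.\d+)", message)
    code = re.search(r"clBuildErrorCode:\s*(-?\d+)", message)
    code_num = int(code.group(1)) if code else None
    return {
        "cl_std": std.group(1) if std else None,
        "code": code_num,
        "name": CL_ERRORS.get(code_num, "unknown"),
    }


# Shortened copy of the message from this issue.
msg = (
    "Failed to build program source sgemm with parameters "
    "-cl-std=CL2.0 -DTILE_SIZE_M=64\n"
    "clBuildErrorCode: -11"
)
info = summarize_build_error(msg)
print(info)
```

So the build itself failed, and the log line says why: the kernels were compiled as OpenCL C 2.0 on a device that does not accept it.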

artyom-beilis commented 6 months ago

Is it a CPU or GPU implementation? Generally, CPU-based OpenCL implementations aren't really tested, working, or helpful.

stevef1uk commented 6 months ago

> Is it a CPU or GPU implementation? Generally, CPU-based OpenCL implementations aren't really tested, working, or helpful.

GPU I thought?

Here is the clinfo output:

(env) pi@rpi5a:/mnt/llm/pytorch_dlprim $ clinfo
Number of platforms                     1
  Platform Name                         Portable Computing Language
  Platform Vendor                       The pocl project
  Platform Version                      OpenCL 3.0 PoCL 5.1-pre main-0-g6c3c974b Linux, Debug+Asserts, RELOC, SPIR, LLVM 14.0.6, SLEEF, POCL_DEBUG
  Platform Profile                      FULL_PROFILE
  Platform Extensions                   cl_khr_icd cl_pocl_content_size
  Platform Extensions with Version      cl_khr_icd 0x400000 (1.0.0)
                                        cl_pocl_content_size 0x400000 (1.0.0)
  Platform Numeric Version              0xc00000 (3.0.0)
  Platform Extensions function suffix   POCL
  Platform Host timer resolution        0ns

  Platform Name                         Portable Computing Language
Number of devices                       1
  Device Name                           cpu-cortex-a76-0xd0b
  Device Vendor                         ARM
  Device Vendor ID                      0x13b5
  Device Version                        OpenCL 3.0 PoCL HSTR: cpu-aarch64-unknown-linux-gnu-cortex-a76
  Device Numeric Version                0xc00000 (3.0.0)
  Driver Version                        5.1-pre main-0-g6c3c974b
  Device OpenCL C Version               OpenCL C 1.2 PoCL
  Device OpenCL C all versions          OpenCL C

ICD loader properties
  ICD loader Name                       OpenCL ICD Loader
  ICD loader Vendor                     OCL Icd free software
  ICD loader Version                    2.3.1
  ICD loader Profile                    OpenCL 3.0
artyom-beilis commented 6 months ago

PoCL is not supported: its implementation is limited, and the kernels aren't CPU-optimized.
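To see the mismatch from the clinfo output before running anything, one can compare the "Device OpenCL C Version" string with the -cl-std value from the build options. The sketch below is only a rough check with hypothetical helper names, not anything the backend does. It assumes the standard "OpenCL C <major>.<minor> <vendor info>" string format and ignores the per-version list that OpenCL 3.0 devices also report ("Device OpenCL C all versions"), so treat a positive result as a hint rather than a guarantee.

```python
import re


def parse_opencl_c_version(version_string):
    """Parse a string like 'OpenCL C 1.2 PoCL' into (major, minor)."""
    m = re.match(r"OpenCL C (\d+)\.(\d+)", version_string)
    if m is None:
        raise ValueError(f"unexpected OpenCL C version string: {version_string!r}")
    return int(m.group(1)), int(m.group(2))


def supports_cl_std(device_c_version, cl_std):
    """Rough check: is the requested -cl-std (e.g. 'CL2.0') no newer than
    the OpenCL C version the device reports?"""
    m = re.fullmatch(r"CL(\d+)\.(\d+)", cl_std)
    if m is None:
        raise ValueError(f"unexpected -cl-std value: {cl_std!r}")
    requested = (int(m.group(1)), int(m.group(2)))
    return parse_opencl_c_version(device_c_version) >= requested


# Values taken from the clinfo output and the build options in this issue.
device_c_version = "OpenCL C 1.2 PoCL"
print(supports_cl_std(device_c_version, "CL2.0"))
print(supports_cl_std(device_c_version, "CL1.2"))
```

Here the device reports OpenCL C 1.2 while the kernels are built with -cl-std=CL2.0, which matches the build log. The device name (cpu-cortex-a76-0xd0b) also shows that clinfo is listing the PoCL CPU device, not the GPU.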