microsoft / OpenCLOn12

The OpenCL-on-D3D12 mapping layer
MIT License

compiler freezing and clinfo bug #60

Open tangjinchuan opened 2 months ago

tangjinchuan commented 2 months ago

Dear MS team, I have a kernel file that compiles on an AMD 7800XT GPU with the AMD driver using less than 100 MB of memory. However, with the OpenCLOn12 runtime (the latest version), compilation hangs and memory usage grows very large. Please find the kernel and the exe that runs it attached. The freeze is mainly caused by BNField12.cl.

clTest.zip

In addition, I also found out that the clinfo program reported the following errors:

Platform Name: OpenCLOn12
Number of devices: 2
Device Type: CL_DEVICE_TYPE_GPU
Vendor ID: 1002h
Max compute units: 1
Max work items dimensions: 3
Max work items[0]: 1024
Max work items[1]: 1024
Max work items[2]: 64
Max work group size: 1024
Preferred vector width char: 16
Preferred vector width short: 8
Preferred vector width int: 4
Preferred vector width long: 2
Preferred vector width float: 4
Preferred vector width double: 2
Native vector width char: 16
Native vector width short: 8
Native vector width int: 4
Native vector width long: 2
Native vector width float: 4
Native vector width double: 2
Max clock frequency: 12Mhz
Address bits: 64
Max memory allocation: 1073741824
Image support: Yes
Max number of images read arguments: 128
Max number of images write arguments: 64
Max image 2D width: 16384
Max image 2D height: 16384
Max image 3D width: 2048
Max image 3D height: 2048
Max image 3D depth: 2048
Max samplers within kernel: 16
Max size of kernel argument: 1024
Alignment (bits) of base address: 2048
Minimum alignment (bytes) for any datatype: 1024
Single precision floating point capability
  Denorms: No
  Quiet NaNs: Yes
  Round to nearest even: Yes
  Round to zero: No
  Round to +ve and infinity: No
  IEEE754-2008 fused multiply-add: Yes
Cache type: None
Cache line size: 0
Cache size: 0
Global memory size: 16957956096
Constant buffer size: 65536
Max number of constant args: 15
Local memory type: Scratchpad
Local memory size: 32768
Max pipe arguments: 0
Max pipe active reservations: 0
Max pipe packet size: 0
Max global variable size: 0
Max global variable preferred total size: 0
Max read/write image args: 64
Max on device events: 0
Queue on device max size: 0
Max on device queues: 0
Queue on device preferred size: 0
SVM capabilities:
  Coarse grain buffer: No
  Fine grain buffer: No
  Fine grain system: No
  Atomics: No
Preferred platform atomic alignment: 0
Preferred global atomic alignment: 0
Preferred local atomic alignment: 0
ERROR: clCreateKernel(-5)

C:\Users\Owner>
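
For reference when reading the last line of the clinfo output: in the standard Khronos `CL/cl.h` header, the value -5 is `CL_OUT_OF_RESOURCES`. The snippet below is only a minimal sketch for decoding such codes. The helper name `cl_error_name` and the table `CL_ERROR_NAMES` are hypothetical (they are not part of clinfo or OpenCLOn12), and the table covers only a subset of the codes. It does not say why `clCreateKernel` failed here.

```python
# Subset of the error codes defined in the Khronos CL/cl.h header.
# The table and helper below are illustrative names, not an existing API.
CL_ERROR_NAMES = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -3: "CL_COMPILER_NOT_AVAILABLE",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
    -11: "CL_BUILD_PROGRAM_FAILURE",
    -44: "CL_INVALID_PROGRAM",
    -45: "CL_INVALID_PROGRAM_EXECUTABLE",
    -46: "CL_INVALID_KERNEL_NAME",
    -47: "CL_INVALID_KERNEL_DEFINITION",
}


def cl_error_name(code: int) -> str:
    """Return the symbolic name for an OpenCL status code, if it is in the table."""
    return CL_ERROR_NAMES.get(code, f"UNKNOWN_CL_ERROR({code})")


if __name__ == "__main__":
    # Decode the code reported by clinfo in this issue.
    print(cl_error_name(-5))
```

Codes that are missing from the table come back as `UNKNOWN_CL_ERROR(<code>)`, so the helper never raises on an unexpected value.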