ARM-software / armnn

Arm NN ML Software. The code here is a read-only mirror of https://review.mlplatform.org/admin/repos/ml/armnn
https://developer.arm.com/products/processors/machine-learning/arm-nn
MIT License

ArmNN v22.02 UnitTests failure with Mali-G31 and OpenCL #627

Closed GW-Renesas closed 2 years ago

GW-Renesas commented 2 years ago

Hi All,

I'm running UnitTests on a platform with dual Cortex-A55 CPUs and a Mali-G31. Neon and OpenCL are enabled. The build is done in a Yocto environment.

I get the following output:

./UnitTests -- --dynamic-backend-build-dir "/usr/bin/armnn/examples/UnitTests/"
[doctest] doctest version is "2.4.6"
[doctest] run with "--help" for options
===============================================================================
/usr/src/debug/armnn/22.02-r0/git/src/backends/cl/test/ClEndToEndTests.cpp:512:
TEST SUITE: ClEndToEnd
TEST CASE:  ClQLstmEndToEndTest
/usr/src/debug/armnn/22.02-r0/git/src/backends/cl/test/ClEndToEndTests.cpp:512: ERROR: test case THREW exception: Failed to assign a backend to each layer 
===============================================================================
[doctest] test cases:   4816 |   4815 passed | 1 failed | 0 skipped
[doctest] assertions: 807610 | 807610 passed | 0 failed |
[doctest] Status: FAILURE!

Is there anything obvious that I have missed?

Thanks, Gareth

MatthewARM commented 2 years ago

At a guess I'd say that OpenCL isn't working on the device. Does clinfo show that the Mali-G31 is providing OpenCL?

GW-Renesas commented 2 years ago

I get the following output, which indicates that the Mali-G31 is providing OpenCL:

  Platform Name                                   ARM Platform
Number of devices                                 1
  Device Name                                     Mali-G31 r0p0
  Device Vendor                                   ARM
  Device Vendor ID                                0x70930000
  Device Version                                  OpenCL 3.0 v1.r32p0-01eac0.d268f473e71e10711a350839d8d94e14

However, I also get the following at the end of the clinfo output:

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  ARM Platform
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   Success [ARM]
  clCreateContext(NULL, ...) [default]            Success [ARM]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  Success (1)
    Platform Name                                 ARM Platform
    Device Name                                   Mali-G31 r0p0
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  Success (1)
    Platform Name                                 ARM Platform
    Device Name                                   Mali-G31 r0p0
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (1)
    Platform Name                                 ARM Platform
    Device Name                                   Mali-G31 r0p0
        NOTE:   your OpenCL library only supports OpenCL 2.1,
                but some installed platforms support OpenCL 3.0.
                Programs using 3.0 features may crash
                or behave unexpectedly

The versioning information appears relevant:

your OpenCL library only supports OpenCL 2.1,

Does this line indicate that the driver and the OpenCL library both need to be v3.0?
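To make the mismatch in the clinfo output above easier to see, here is a minimal Python sketch that pulls the device name, the device's reported OpenCL version, and the version from the NOTE out of a clinfo text dump. The helper names (`parse_opencl_version`, `summarize_clinfo`) are hypothetical, and the sample text is just an excerpt of the output pasted above. The reading that "your OpenCL library" means the libOpenCL that clinfo itself loads is an assumption, not something confirmed in this thread.

```python
import re

# Excerpt of the clinfo output pasted above.
CLINFO_SAMPLE = """\
  Platform Name                                   ARM Platform
Number of devices                                 1
  Device Name                                     Mali-G31 r0p0
  Device Version                                  OpenCL 3.0 v1.r32p0-01eac0.d268f473e71e10711a350839d8d94e14
        NOTE:   your OpenCL library only supports OpenCL 2.1,
                but some installed platforms support OpenCL 3.0.
"""


def parse_opencl_version(text):
    """Return (major, minor) from the first 'OpenCL X.Y' in text, or None."""
    match = re.search(r"OpenCL (\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def summarize_clinfo(output):
    """Collect the first device name/version and the library version from the NOTE."""
    summary = {"device_name": None, "device_version": None, "library_version": None}
    for line in output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Device Name") and summary["device_name"] is None:
            summary["device_name"] = stripped[len("Device Name"):].strip()
        elif stripped.startswith("Device Version") and summary["device_version"] is None:
            summary["device_version"] = parse_opencl_version(stripped)
        elif "your OpenCL library only supports" in stripped:
            summary["library_version"] = parse_opencl_version(stripped)
    return summary


summary = summarize_clinfo(CLINFO_SAMPLE)
print(summary)
```

On this excerpt the device reports a newer OpenCL version than the library named in the NOTE, which is all the warning itself states. It does not by itself say that anything needs to be upgraded.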

MatthewARM commented 2 years ago

That's very interesting, I've not seen that before. Maybe ClQLstmEndToEndTest needs OpenCL 3.0.

Ah, now I look at the test output, I think it's just this one test which is failing. @morgolock is it possible that Quantized LSTM requires OpenCL 3.0?

morgolock commented 2 years ago

Hi @MatthewARM

No, ACL requires OpenCL 1.2 and the non-uniform work-group size extension.

I'm trying to reproduce the error on a G31.
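As a rough illustration of the requirement stated above, the following Python sketch checks a device version string and an extension list against "OpenCL 1.2 plus the non-uniform work-group size extension". The function name `meets_acl_requirements` is hypothetical, and the extension string `cl_arm_non_uniform_work_group_size` is an assumption about the name ACL looks for, so check it against your ACL sources. The extension list in the usage lines is invented for illustration, since the thread does not show the G31's extensions.

```python
import re

# Assumption: extension name ACL checks for; verify against your ACL version.
NON_UNIFORM_WG_EXTENSION = "cl_arm_non_uniform_work_group_size"
MIN_VERSION = (1, 2)  # minimum OpenCL version stated in the comment above


def meets_acl_requirements(device_version, extensions):
    """Return True if the version is >= 1.2 and the extension is listed.

    device_version: the 'Device Version' string as printed by clinfo.
    extensions: space-separated extension names (CL_DEVICE_EXTENSIONS style).
    """
    match = re.search(r"OpenCL (\d+)\.(\d+)", device_version)
    if match is None:
        return False
    version = (int(match.group(1)), int(match.group(2)))
    return version >= MIN_VERSION and NON_UNIFORM_WG_EXTENSION in extensions.split()


# Version string taken from the clinfo output in this thread;
# the extension list is hypothetical.
g31_version = "OpenCL 3.0 v1.r32p0-01eac0.d268f473e71e10711a350839d8d94e14"
hypothetical_extensions = "cl_khr_fp16 cl_arm_non_uniform_work_group_size"
print(meets_acl_requirements(g31_version, hypothetical_extensions))
```

The version half of the check passes for the OpenCL 3.0 string reported by the Mali-G31. The extension half can only be confirmed from the device's real extension list.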

GW-Renesas commented 2 years ago

Hi @morgolock,

Were you able to reproduce this issue on a G31 at all?

morgolock commented 2 years ago

Hi @GW-Renesas

Yes, I reproduced it on a G31. I'm looking into it.

morgolock commented 2 years ago

Hi @GW-Renesas

The following patch fixes the problem: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8093

Hope this helps.

GW-Renesas commented 2 years ago

Hi @morgolock

That is great news, I will test on my platform shortly.

Thanks

GW-Renesas commented 2 years ago

Hi @morgolock

Many thanks, this has solved the backend assignment error on my platform. I am now also seeing similar layer-related check errors. Has this been seen before?

neon_unit_errors.txt

morgolock commented 2 years ago

Hi @GW-Renesas

The failures mentioned in the attachment seem unrelated to QLSTM and the OpenCL backend. I'd suggest closing this issue and creating a new one.

Hope this helps.

GW-Renesas commented 2 years ago

Hi @morgolock,

No problem, thanks for the assistance.