radxa / kernel

BSP kernel source

rknpu driver 0.9.3 bug, upgrade to 0.9.6 to fix #312

Open swdee opened 2 months ago

swdee commented 2 months ago

When trying out the Radxa Linux 6.1 (bookworm) builds for the Rock 5B and CM5, I hit a problem with the RKNPU driver: the vendor MobileNet demo fails with the following error.

model input num: 1, output num: 1
input tensors:
  index=0, name=input, n_dims=4, dims=[1, 224, 224, 3], n_elems=150528, size=150528, fmt=NHWC, type=INT8, qnt_type=AFFINE, zp=0, scale=0.007812
output tensors:
  index=0, name=MobilenetV1/Predictions/Reshape_1, n_dims=2, dims=[1, 1001, 0, 0], n_elems=1001, size=1001, fmt=UNDEFINED, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003906
rknn_run
E RKNN: [05:03:32.244] failed to submit!, op id: 1, op name: Conv:MobilenetV1/MobilenetV1/Conv2d_0/Relu6_prequant, flags: 0x5, task start: 0, task number: 38, run task counter: 0, int status: 0, please try updating to the latest version of the toolkit2 and runtime from: https://console.zbox.filez.com/l/I00fc3 (PWD: rknn)
rknn_run fail! ret=-1

Under the 5.10 (bullseye) image this RKNN "failed to submit" error does not occur.

The following RKNPU drivers are present for each distribution.

Linux Version | Driver Version
5.10 | driver version: 0.8.2, API version: 1.6.0 (9a7b5d24c@2023-12-13T17:31:11)
6.1 | driver version: 0.9.3, API version: 1.6.0 (9a7b5d24c@2023-12-13T17:31:11)
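
For anyone comparing images, the version strings above can be parsed and compared as tuples. This is only a minimal sketch: the helper names are hypothetical, and the input is assumed to be a line in the same format as the ones quoted above. Note that a version comparison alone is not a verdict, since 0.8.2 on the 5.10 kernel did not show the failure.

```python
import re

def parse_driver_version(line):
    """Return (major, minor, patch) from a 'driver version: X.Y.Z' string."""
    match = re.search(r"driver version:\s*(\d+)\.(\d+)\.(\d+)", line)
    if match is None:
        raise ValueError("no 'driver version: X.Y.Z' found in: %r" % line)
    return tuple(int(part) for part in match.groups())

def at_least(line, minimum):
    """True if the driver version in `line` is >= `minimum` (a 3-tuple)."""
    return parse_driver_version(line) >= minimum

if __name__ == "__main__":
    # Strings copied from the table in this issue.
    reported = {
        "5.10": "driver version: 0.8.2, API version: 1.6.0 (9a7b5d24c@2023-12-13T17:31:11)",
        "6.1": "driver version: 0.9.3, API version: 1.6.0 (9a7b5d24c@2023-12-13T17:31:11)",
    }
    for kernel, line in reported.items():
        version = ".".join(str(n) for n in parse_driver_version(line))
        print(kernel, version, "has 0.9.6 or newer:", at_least(line, (0, 9, 6)))
```

Comparing integer tuples avoids the string-comparison trap where "0.9.10" would sort before "0.9.6".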

A newer driver, version 0.9.6, is available in the RKNN LLM repo. I rebuilt the 6.1 Linux image with this newer driver and patched the build problems with vm_flags_set and vm_flags_clear as referenced in this issue. The newer driver fixes the above problem, and the MobileNet demo runs as expected with the following output:

model input num: 1, output num: 1
input tensors:
  index=0, name=input, n_dims=4, dims=[1, 224, 224, 3], n_elems=150528, size=150528, fmt=NHWC, type=INT8, qnt_type=AFFINE, zp=0, scale=0.007812
output tensors:
  index=0, name=MobilenetV1/Predictions/Reshape_1, n_dims=2, dims=[1, 1001, 0, 0], n_elems=1001, size=1001, fmt=UNDEFINED, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003906
rknn_run
 --- Top5 ---
283: 0.468750
282: 0.242188
286: 0.105469
464: 0.089844
264: 0.019531
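
As a sanity check on the numbers above, the Top5 scores are consistent with the output tensor's AFFINE quantization parameters (zp=-128, scale=0.003906). The sketch below assumes the usual affine mapping, real = (q - zp) * scale, and assumes the logged scale is 1/256 rounded to six decimals. The raw INT8 values are back-computed from the printed scores; they do not come from the demo itself.

```python
def dequantize_affine(q, zero_point, scale):
    """Map a quantized integer back to a real value: (q - zp) * scale."""
    return (q - zero_point) * scale

ZERO_POINT = -128   # from the output tensor line in the log
SCALE = 1.0 / 256   # assumption: the log's 0.003906 is 1/256, rounded

# Class index -> raw INT8 value, back-computed from the printed Top5 scores.
assumed_raw = {283: -8, 282: -66, 286: -101, 464: -105, 264: -123}

if __name__ == "__main__":
    for class_id, q in assumed_raw.items():
        score = dequantize_affine(q, ZERO_POINT, SCALE)
        print("%d: %f" % (class_id, score))
```

With zp=-128 the full INT8 range maps onto roughly [0, 1), which fits a softmax-style prediction output.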

I note that the Armbian and Joshua-Riek Ubuntu images have recently upgraded to the 0.9.6 driver. I request that Radxa update their official images to the 0.9.6 driver so the NPU is usable again.

swdee commented 1 month ago

I see Radxa has now committed these updates to mainline.

0.9.6 Update commit and vm_flags patch.