mtx512 / rk3588-npu

Reverse engineering the rk3588 npu
GNU General Public License v3.0

run on rk3568 #3

Open rigdo opened 3 months ago

rigdo commented 3 months ago

Hi, I'm trying to run your code on an RK3568, but when I run ./matmul_fp16_test 1 32 16 I get "RKNPU_SUBMIT returned -1", and dmesg shows:

RKNPU: job: 000000004eb23a7b, iommu domain id: 0, wait_count: 1, continue wait: 0, commit elapse time: 6251358us, wait time: 6251362us, timeout: 6000000us
RKNPU: failed to wait job, task counter: 0, flags: 0x5, ret = 0, elapsed time: 6252728us
RKNPU: job timeout, flags: 0x0:
RKNPU:  core 0 irq status: 0x0, raw status: 0x0, require mask: 0x300, task counter: 0x0, elapsed time: 6359758us

I added a register dump to the rknpu.ko driver and ran rknn_matmul_api_demo (from rknn toolkit2) with ./rknn_matmul_api_demo 1 1 32 16 0 0 1 1 1. The dump is in rknpu_job.c, in the rknpu_job_subcore_commit_pc function, and prints the contents at first_task->regcmd_addr. It looks like this (full dump in rknn_matmul_api_demo_dump.txt):

[  302.972445]   0: 0x02010000000e1004 ( op: 0x0201 reg: 0x1004 value: 0x0000000e )
[  302.973188]   1: 0x04010000000e2004 ( op: 0x0401 reg: 0x2004 value: 0x0000000e )
[  302.973840]   2: 0x08010000000e3004 ( op: 0x0801 reg: 0x3004 value: 0x0000000e )
[  302.974668]   3: 0x0201000000711040 ( op: 0x0201 reg: 0x1040 value: 0x00000071 )
[  302.975319]   4: 0x10010000000e4004 ( op: 0x1001 reg: 0x4004 value: 0x0000000e )
[  302.975965]   5: 0x020100000220100c ( op: 0x0201 reg: 0x100c value: 0x00000220 )
[  302.976611]   6: 0x0201000000101010 ( op: 0x0201 reg: 0x1010 value: 0x00000010 )
[  302.977256]   7: 0x0201000000091014 ( op: 0x0201 reg: 0x1014 value: 0x00000009 )
[  302.977903]   8: 0x0201000000001018 ( op: 0x0201 reg: 0x1018 value: 0x00000000 )
...

As I understand from your source code, op 0x0201 is OP_REG_CNA, but what does 0x0401 mean? Is there a place where I can read about it?

mtx512 commented 3 months ago

Hi, the RK3568 NPU hardware is slightly different from the RK3588's. I suggest reviewing the RK3568 TRM (part 2); it should give you an idea of the register values. Then follow my code, and it should be similar.