airockchip / rknn-toolkit2


InputOperator, OutputOperator #179

Open hhd-shuai opened 1 month ago

hhd-shuai commented 1 month ago

Hi, I am deploying a model on the RK3588 and running inference. I found that the InputOperator and OutputOperator run on the CPU and consume a lot of resources and time. Could you tell me what processing is done inside these operators? Are they performing tensor dimension transformations? Is there a way to execute these operators on the NPU? (Two screenshots attached.) If so, could you provide some guidance?

yuyun2000 commented 1 month ago

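The input and output operators move data between the CPU and the NPU. That said, 6 seconds for yours seems excessive. Is it really that slow when you measure it? Which version are you using?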

hhd-shuai commented 1 month ago

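Thanks for the answer! The version is: RKNN Model Information: version: 6, toolkit version: 1.5.2+b642f30c (compiler version: 1.5.2 (c6b7b351a@2023-08-23T07:30:34)); RKNN Driver Information: version: 0.9.2. The speed is actually fine, since the times shown are in microseconds; the main problem is high CPU usage. My model has many heads, so the data movement adds up and the CPU usage becomes fairly high. One more question: for operators that the NPU does not support, is the data copied to the CPU internally, executed there, and then copied back to the NPU? And does the CPU usage inside the rknn_infer interface come mainly from the input and output operators moving data between the CPU and the NPU, or from these unsupported operators?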

yuyun2000 commented 1 month ago

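It comes from the unsupported operators. They force data to switch back and forth between the NPU and the CPU, which increases inference time.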
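
One way to check which source dominates is to total the per-layer times by execution target and list the CPU-side operators other than InputOperator and OutputOperator. The following is a minimal sketch, assuming the per-layer profile has already been exported as (operator, target, time in microseconds) rows. The row layout, the sample values, and the helper names `time_by_target` and `cpu_fallback_ops` are all hypothetical; this is not the toolkit's actual output format or API.

```python
from collections import defaultdict

# Hypothetical per-layer records: (operator type, target, time in microseconds).
# The values are made up for illustration and do not come from this thread.
LAYERS = [
    ("InputOperator", "CPU", 120),
    ("Conv", "NPU", 850),
    ("Softmax", "CPU", 300),
    ("Conv", "NPU", 640),
    ("OutputOperator", "CPU", 90),
]


def time_by_target(layers):
    """Sum the per-layer time (microseconds) for each execution target."""
    totals = defaultdict(int)
    for _op, target, micros in layers:
        totals[target] += micros
    return dict(totals)


def cpu_fallback_ops(layers, io_ops=("InputOperator", "OutputOperator")):
    """Return CPU-side operator types other than the input/output operators."""
    return sorted({op for op, target, _ in layers
                   if target == "CPU" and op not in io_ops})


totals = time_by_target(LAYERS)
for target, micros in sorted(totals.items()):
    # 1 ms = 1000 us, so a large-looking microsecond figure is easy to misread.
    print(f"{target}: {micros} us = {micros / 1000:.3f} ms")
print("CPU fallback operators:", cpu_fallback_ops(LAYERS))
```

If the fallback list is non-empty, those operators are the first candidates to replace or restructure before re-converting the model.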

hhd-shuai commented 1 month ago

OK, thanks!