Closed: Xinyu302 closed this 1 year ago
How about the performance of arm64 optimized kernel vs before?
LGTM
There was no arm64 fp16 rotate kernel before, so there is no earlier baseline to compare against. I only added a test in compiler/test/kernel/opr/arm/cv.cpp, and it passes the correctness test.
I would appreciate it if you could tell me how to measure a kernel's performance in megcc.
It seems that CI/lint failed, but I don't think the failure is caused by my changes.
@Xinyu302 There are some conflicts; please resolve them and re-run the CI.
OK.
@Li-Ming-xin
A function declaration for the rotate cv operator needs to be added to runtime/include/tinycv_c.h; otherwise compilation will fail when users actually use it.
https://github.com/MegEngine/MegCC/blob/eabc1d908eed3796cba6b92d8a7e36ddd2e990c6/compiler/tools/mgb-to-tinynn/mgb-to-tinynn.cpp#L73C5-L73C5 Some code seems to be missing here; you can build mgb-to-tinynn yourself and test it. @Xinyu302 ref: https://github.com/MegEngine/MegCC/blob/main/compiler/README.md#building https://github.com/MegEngine/MegCC/blob/main/doc/how-to-use-chinese.md#编写-json-文件
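To illustrate the kind of declaration being asked for, here is a minimal sketch in C. It is not copied from runtime/include/tinycv_c.h: the `SketchMat` struct, the `sketch_rotate_f16` name, and its signature are hypothetical and only assume a rows/cols/channels/data descriptor. The body is a plain scalar reference for a 90-degree rotation, not the arm64 kernel from this PR. Because rotation only moves elements, the fp16 values are handled as opaque 16-bit words.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical matrix descriptor (assumed layout, interleaved channels). */
typedef struct {
    int rows;
    int cols;
    int channels;
    void* data;
} SketchMat;

/* Hypothetical declaration: this is the part a public header would need
 * so that user code calling the operator compiles. */
void sketch_rotate_f16(const SketchMat* src, const SketchMat* dst, bool clockwise);

/* Scalar reference: rotate src by 90 degrees into dst.
 * dst must be pre-allocated with rows/cols swapped relative to src. */
void sketch_rotate_f16(const SketchMat* src, const SketchMat* dst, bool clockwise) {
    assert(dst->rows == src->cols && dst->cols == src->rows);
    assert(dst->channels == src->channels);

    /* fp16 elements are only copied, never computed on, so uint16_t suffices. */
    const uint16_t* in = (const uint16_t*)src->data;
    uint16_t* out = (uint16_t*)dst->data;
    const int rows = src->rows, cols = src->cols, ch = src->channels;

    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            /* clockwise:         src(r, c) -> dst(c, rows - 1 - r)
             * counter-clockwise: src(r, c) -> dst(cols - 1 - c, r) */
            const int dr = clockwise ? c : cols - 1 - c;
            const int dc = clockwise ? rows - 1 - r : r;
            for (int k = 0; k < ch; ++k) {
                out[(dr * rows + dc) * ch + k] = in[(r * cols + c) * ch + k];
            }
        }
    }
}
```

A scalar version like this can also serve as the ground truth when checking an optimized kernel for bit-exact output.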
The rotate function declaration has already been added to runtime/include/tinycv_c.h.
To test mgb-to-tinynn, I rewrote yolox_cv.json as:
{
    "dump_dir": "./kernel_yolox_s_arm/",
    "models": [
        {
            "model_name": "yolox_s",
            "model_path": "./yolox_s.mge",
            "input_shape_str": "data=(1,3,640,640)",
            "enable_nchw44": true
        }
    ],
    "cv": {
        "transpose": ["ui8"],
        "roicopy": ["ui8"],
        "rotate": ["f16"],
        "flip": ["ui8"],
        "resize_linear": ["ui8"],
        "warp_affine_replicate_linear": ["ui8"],
        "rgb2bgr": ["ui8"],
        "yuv2bgr_nv21": ["ui8"],
        "rgb2yuv": ["ui8"]
    }
}
I tested with the command
mgb-to-tinynn --json=yolox_cv.json --arm64 --dump yolox_gen
and tinycv_rotate_f16.c was generated in the yolox_gen folder.
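OK, I checked, and the rotate-related code was indeed already there.
Per the OSPP requirements, the email used to submit the PR must match the email registered with OSPP, so I am closing this PR and will reorganize and submit a new one. @Li-Ming-xin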
Add arm64 fp16 rotate and test. OSPP project implementation.