A problem after disabling use_packing_layout

hubin858130 commented 2 years ago

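When running the animeGanV2 model on Android, I use the following settings:

ncnn::Option opt;
opt.lightmode = true;
opt.use_vulkan_compute = false;
opt.use_packing_layout = false;

I'm seeing a strange problem. On arm32 everything runs normally, but on arm64 a single inference is several times slower than on the 32-bit build. For example, a Xiaomi 10 takes 3 seconds in 32-bit and 30 seconds in 64-bit. On 64-bit, weaker phones also outperform stronger ones: a Xiaomi 8 Lite (小米8青春版) takes 15 s per inference in 64-bit, while the Xiaomi 10 takes 30 s. I did try Vulkan acceleration, which brings the 64-bit build back to roughly 3 to 4 seconds, but on many devices inference stops halfway with a "hardware lost" error and fails, so compatibility is a problem. Could you tell me why, with Vulkan and the packed memory layout both disabled, performance on 64-bit is so much worse than on 32-bit? @nihui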

nihui commented 2 years ago

Try disabling fp16 and running inference in fp32:

opt.use_fp16_storage = false;
opt.use_fp16_arithmetic = false;
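
For reference, the settings from this thread can be collected in one place. The following is only a minimal sketch: `OptionSketch` is a hypothetical stand-in that mirrors the `ncnn::Option` field names used above so the snippet compiles without ncnn, its default values are placeholders rather than ncnn's documented defaults, and `make_fp32_cpu_options` is an illustrative helper name, not part of ncnn.

```cpp
// Hypothetical stand-in for the ncnn::Option fields discussed in this thread.
// In a real project, include ncnn's "net.h" and use ncnn::Option directly.
// The default values below are placeholders chosen for this sketch.
struct OptionSketch {
    bool lightmode = true;
    bool use_vulkan_compute = false;
    bool use_packing_layout = true;
    bool use_fp16_storage = true;
    bool use_fp16_arithmetic = true;
};

// Illustrative helper (not an ncnn API): CPU-only inference with the
// unpacked layout and fp32 used for both storage and arithmetic.
inline OptionSketch make_fp32_cpu_options() {
    OptionSketch opt;
    opt.lightmode = true;
    opt.use_vulkan_compute = false;   // Vulkan off, as in the original report
    opt.use_packing_layout = false;   // packed memory layout off
    opt.use_fp16_storage = false;     // suggested above: store data as fp32
    opt.use_fp16_arithmetic = false;  // suggested above: compute in fp32
    return opt;
}
```

With real ncnn, the same assignments would be made on an `ncnn::Option` and stored in `net.opt` before the model is loaded, so that they take effect when the network is created.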

hubin858130 commented 2 years ago

Try disabling fp16 and running inference in fp32:
opt.use_fp16_storage = false;
opt.use_fp16_arithmetic = false;

Thanks a lot, huihui!

Tencent / ncnn

A problem after disabling use_packing_layout #3482