Closed yy2yy closed 8 months ago
What exactly does your inference code look like?
How are you measuring performance? This code sample includes model initialization in the timing. You can refer to:
I only measure the average time of this one call: model.Predict(im, &res)
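To make clear what "average time of one call" can mean in practice, here is a minimal timing sketch in plain C++. The helper name `AverageLatencyMs` and its `warmup` and `runs` parameters are illustrative assumptions, not part of FastDeploy; the helper only wraps an arbitrary callable, so model initialization and the first slow calls stay outside the timed region.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper (not a FastDeploy API): returns the average wall-clock
// time of `call` in milliseconds over `runs` timed invocations.
// `warmup` untimed invocations run first so one-time costs are excluded.
double AverageLatencyMs(const std::function<void()>& call, int warmup, int runs) {
  assert(warmup >= 0 && runs > 0);
  for (int i = 0; i < warmup; ++i) {
    call();  // untimed warm-up call
  }
  const auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < runs; ++i) {
    call();  // timed call
  }
  const auto stop = std::chrono::steady_clock::now();
  const std::chrono::duration<double, std::milli> elapsed = stop - start;
  return elapsed.count() / runs;
}
```

Assuming `model`, `im` and `res` are already set up as in the poster's program, a call such as `AverageLatencyMs([&]() { model.Predict(im, &res); }, 3, 20)` would time only the `Predict` calls. Running it once per thread setting gives two numbers that can be compared directly.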
The key problem is that the inference time is the same whether or not I set 8 threads. Why is that? Have I missed something?
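I see the inference time going from 30000+ ms to 17000 ms, so there does appear to be a change. I would suggest using the paddle yolov8 nano model; models of size s or larger are not well suited to low-power or mobile scenarios. You can try the following 3 things: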
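Friendly reminder: according to informal community statistics, asking questions with the template speeds up replies and problem resolution.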
Environment
Setting up Android toolchain
ANDROID_ABI=armeabi-v7a # 'arm64-v8a', 'armeabi-v7a'
ANDROID_PLATFORM="android-21" # API >= 21
ANDROID_STL=c++_shared # 'c++_shared', 'c++_static'
ANDROID_TOOLCHAIN=clang # 'clang' only
TOOLCHAIN_FILE=${ANDROID_NDK}/build/cmake/android.toolchain.cmake
Create build directory
BUILD_ROOT=build/Android
BUILD_DIR=${BUILD_ROOT}/${ANDROID_ABI}-api-21
FASDEPLOY_INSTALL_DIR="./install"
mkdir build && mkdir ${BUILD_ROOT} && mkdir ${BUILD_DIR}
cd ${BUILD_DIR}
CMake configuration with Android toolchain
cmake -DCMAKE_TOOLCHAIN_FILE=${TOOLCHAIN_FILE} \
  -DCMAKE_BUILD_TYPE=Release \
  -DANDROID_ABI=${ANDROID_ABI} \
  -DANDROID_NDK=${ANDROID_NDK} \
  -DANDROID_STATIC_LIB=ON \
  -DANDROID_PLATFORM=${ANDROID_PLATFORM} \
  -DANDROID_STL=${ANDROID_STL} \
  -DANDROID_TOOLCHAIN=${ANDROID_TOOLCHAIN} \
  -DENABLE_LITE_BACKEND=ON \
  -DENABLE_VISION=ON \
  -DWITH_ANDROID_OPENMP=ON \
  -DWITH_LITE_STATIC=ON \
  -DCMAKE_INSTALL_PREFIX=${FASDEPLOY_INSTALL_DIR} \
  -Wno-dev ../../..
Build FastDeploy Android C++ SDK
make -j8
make install
Problem log and steps to reproduce
Attaching a detailed log helps locate and analyze the problem quickly.
[Performance issue] Describe clearly how the comparison was made
./infer_paddle_model_demo ./yolov7_infer/ ./images/test.jpg 0
WARNING: linker: /data/mnntest/install/infer_paddle_model_demo: unsupported flags DT_FLAGS_1=0x8000001
[I 1/ 1 12:18:23.854 ...oid/Paddle-Lite/lite/core/device_info.cc:1275 Setup] ARM multiprocessors name: MODEL NAME : ARMV7 PROCESSOR REV 4 (V7L) HARDWARE : QUALCOMM TECHNOLOGIES, INC SDM450 _QC_REFERENCE_PHONEMSM8953
[I 1/ 1 12:18:23.854 ...oid/Paddle-Lite/lite/core/device_info.cc:1276 Setup] ARM multiprocessors number: 8
[I 1/ 1 12:18:23.854 ...oid/Paddle-Lite/lite/core/device_info.cc:1278 Setup] ARM multiprocessors ID: 0, max freq: 1804, min freq: 1804, cluster ID: 0, CPU ARCH: A53
[I 1/ 1 12:18:23.854 ...oid/Paddle-Lite/lite/core/device_info.cc:1278 Setup] ARM multiprocessors ID: 1, max freq: 1804, min freq: 1804, cluster ID: 0, CPU ARCH: A53
[I 1/ 1 12:18:23.854 ...oid/Paddle-Lite/lite/core/device_info.cc:1278 Setup] ARM multiprocessors ID: 2, max freq: 1804, min freq: 1804, cluster ID: 0, CPU ARCH: A53
[I 1/ 1 12:18:23.855 ...oid/Paddle-Lite/lite/core/device_info.cc:1278 Setup] ARM multiprocessors ID: 3, max freq: 1804, min freq: 1804, cluster ID: 0, CPU ARCH: A53
[I 1/ 1 12:18:23.855 ...oid/Paddle-Lite/lite/core/device_info.cc:1278 Setup] ARM multiprocessors ID: 4, max freq: 1804, min freq: 1804, cluster ID: 0, CPU ARCH: A53
[I 1/ 1 12:18:23.855 ...oid/Paddle-Lite/lite/core/device_info.cc:1278 Setup] ARM multiprocessors ID: 5, max freq: 1804, min freq: 1804, cluster ID: 0, CPU ARCH: A53
[I 1/ 1 12:18:23.855 ...oid/Paddle-Lite/lite/core/device_info.cc:1278 Setup] ARM multiprocessors ID: 6, max freq: 1804, min freq: 1804, cluster ID: 0, CPU ARCH: A53
[I 1/ 1 12:18:23.855 ...oid/Paddle-Lite/lite/core/device_info.cc:1278 Setup] ARM multiprocessors ID: 7, max freq: 1804, min freq: 1804, cluster ID: 0, CPU ARCH: A53
[I 1/ 1 12:18:23.855 ...oid/Paddle-Lite/lite/core/device_info.cc:1284 Setup] L1 DataCache size is: 。。。。。。。。
[FastDeploy][INFO] fastdeploy/runtime/runtime.cc(321)::CreateLiteBackend Runtime initialized with Backend::PDLITE in Device::CPU.
frames inference time is : 36341ms
frames inference time is : 26378ms
frames inference time is : 23049.7ms
frames inference time is : 21373ms
frames inference time is : 20375.8ms
frames inference time is : 19711.3ms
frames inference time is : 19237.3ms
frames inference time is : 18874ms
frames inference time is : 18595.9ms
frames inference time is : 18371.2ms
frames inference time is : 18190.5ms
frames inference time is : 18036.7ms
frames inference time is : 17907.8ms
frames inference time is : 17798.1ms
frames inference time is : 17702.5ms
frames inference time is : 17617.5ms
frames inference time is : 17543.6ms
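The steadily shrinking numbers in the log above are consistent with a running average over all frames so far, not with each frame getting faster. This is an assumption about how the demo prints its timing; the thread does not show that code. Under that assumption, the time of frame n can be recovered as n * avg_n - (n - 1) * avg_(n-1). The sketch below uses illustrative helper names and the 17 logged values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Recovers per-frame times from running averages, assuming avg[i] is the mean
// of frames 1..i+1:  t_n = n * avg_n - (n - 1) * avg_(n-1).
std::vector<double> PerFrameFromRunningAverage(const std::vector<double>& avg) {
  std::vector<double> per_frame;
  per_frame.reserve(avg.size());
  for (std::size_t i = 0; i < avg.size(); ++i) {
    const double n = static_cast<double>(i + 1);
    const double previous_total = (i == 0) ? 0.0 : (n - 1.0) * avg[i - 1];
    per_frame.push_back(n * avg[i] - previous_total);
  }
  return per_frame;
}

// The 17 "frames inference time" values from the log, in milliseconds.
std::vector<double> LoggedAverages() {
  return {36341.0, 26378.0, 23049.7, 21373.0, 20375.8, 19711.3,
          19237.3, 18874.0, 18595.9, 18371.2, 18190.5, 18036.7,
          17907.8, 17798.1, 17702.5, 17617.5, 17543.6};
}
```

Applied to the logged values, the first frame comes out at 36341 ms and every later frame at roughly 16.3 to 16.4 s. Read this way, the drop from 30000+ ms toward 17000 ms reflects the slow first frame being averaged out, and the steady per-frame time is about 16.4 s.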
Checking CPU and memory usage, it does not look like multiple threads are being used:
PID USER PR NI CPU% S #THR VSS RSS PCY Name
4281 root 20 0 12% R 8 1495960K 1080856K fg ./infer_paddle_model_demo
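One way to read the CPU% figure above is to convert it into an approximate number of fully busy cores. The sketch assumes that this `top` reports CPU% as a share of all cores combined; if it reports per-core percentages instead, the conversion does not apply. The function name is illustrative.

```cpp
#include <cassert>

// Approximate number of fully busy cores, ASSUMING cpu_percent is expressed
// as a share of the total capacity of all num_cores cores.
double EquivalentBusyCores(double cpu_percent, int num_cores) {
  assert(cpu_percent >= 0.0 && num_cores > 0);
  return cpu_percent / 100.0 * num_cores;
}
```

With the values from the output, `EquivalentBusyCores(12.0, 8)` is about 0.96, i.e. roughly one core's worth of work. That would match the impression that only one of the 8 threads listed under #THR is actually computing.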
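The C++ program uses the Paddle Lite backend and sets option.SetCpuThreadNum(8), but the setting does not take effect during inference. How should I configure it so that multiple threads are used to run inference on a single image?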