xiangweizeng / darknet2ncnn

Darknet2ncnn converts the darknet model to the ncnn model
Do What The F*ck You Want To Public License

How to troubleshoot problem when benchdarknet program runs slow? #37

Closed thtfpcuser closed 4 years ago

thtfpcuser commented 4 years ago

I have already turned on the NEON option. The benchmark is running on an i.MX6Q dev board.

root@imx6qsabresd:/home/benchmark# ./benchdarknet
loop_count = 4
num_threads = 1
powersave = 0
gpu_device = -1
cifar    min = 3984.53  max = 3986.34  avg = 3985.29
alexnet  min = 6808.87  max = 6919.03  avg = 6862.98
darknet  min = 2491.22  max = 2502.07  avg = 2496.54

These are the file attributes of benchdarknet, from the readelf output:

File Attributes
  Tag_CPU_name: "Cortex-A9"
  Tag_CPU_arch: v7
  Tag_CPU_arch_profile: Application
  Tag_ARM_ISA_use: Yes
  Tag_THUMB_ISA_use: Thumb-2
  Tag_FP_arch: VFPv3
  Tag_Advanced_SIMD_arch: NEONv1
  Tag_ABI_PCS_wchar_t: 4
  Tag_ABI_FP_rounding: Needed
  Tag_ABI_FP_denormal: Needed
  Tag_ABI_FP_exceptions: Needed
  Tag_ABI_FP_number_model: IEEE 754
  Tag_ABI_align_needed: 8-byte
  Tag_ABI_align_preserved: 8-byte, except leaf SP
  Tag_ABI_enum_size: int
  Tag_ABI_VFP_args: VFP registers
  Tag_CPU_unaligned_access: v6
  Tag_MPextension_use: Allowed
  Tag_Virtualization_use: TrustZone

thtfpcuser commented 4 years ago

One troubleshooting point: when cross-compiling the ncnn library, CMAKE_SYSTEM_NAME must be defined. Otherwise CMake treats the build as native, and CMAKE_SYSTEM_PROCESSOR takes the build host's value instead of the target's.
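
For anyone hitting the same thing, here is a minimal sketch of how both variables might be pinned in a toolchain file. The file name, the arm-linux-gnueabihf- compiler prefix, and the -march/-mfpu/-mfloat-abi flags are assumptions for a hard-float Cortex-A9 target, not values taken from this thread or from the darknet2ncnn repository; adjust them to your own SDK.

```shell
#!/bin/sh
# Write a hypothetical CMake toolchain file for an ARMv7 + NEON target.
# Compiler names and flags below are assumptions; replace with your SDK's.
cat > armv7-toolchain.cmake <<'EOF'
# Setting CMAKE_SYSTEM_NAME switches CMake into cross-compiling mode.
set(CMAKE_SYSTEM_NAME Linux)
# Set explicitly so it describes the target, not the build host.
set(CMAKE_SYSTEM_PROCESSOR arm)

set(CMAKE_C_COMPILER arm-linux-gnueabihf-gcc)
set(CMAKE_CXX_COMPILER arm-linux-gnueabihf-g++)

set(CMAKE_C_FLAGS_INIT "-march=armv7-a -mfpu=neon -mfloat-abi=hard")
set(CMAKE_CXX_FLAGS_INIT "-march=armv7-a -mfpu=neon -mfloat-abi=hard")
EOF

# Then, from an empty build directory (not executed by this script):
#   cmake -DCMAKE_TOOLCHAIN_FILE=../armv7-toolchain.cmake ..
```

Use a fresh build directory when switching toolchain files, since CMake caches the compiler and system settings from the first configure run.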

xiangweizeng commented 4 years ago

Make sure that the inline assembly part is actually compiled into the binary. You can also set the number of threads to 4, since the i.MX6Q has four cores.
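
One rough way to check whether NEON code made it into the binary is to count disassembly lines that use NEON quad registers or vld1/vst1 loads and stores. The sketch below assumes GNU objdump-style ARM disassembly on standard input; the helper name count_neon_lines and the arm-linux-gnueabihf- prefix in the usage comment are hypothetical. A count near zero suggests the NEON paths were not compiled; a nonzero count does not prove that the hot loops use them.

```shell
#!/bin/sh
# Count disassembly lines that mention a NEON quad register (q0..q15)
# or a vld1/vst1 instruction. Reads the disassembly from stdin.
# Heuristic only: VFP scalar code uses s/d registers and is not counted.
count_neon_lines() {
    # grep -c prints 0 and exits non-zero when nothing matches,
    # so "|| true" keeps the function usable under "set -e".
    grep -c -w -E 'q[0-9]+|vld1\.[0-9]+|vst1\.[0-9]+' || true
}

# Usage (toolchain prefix is an assumption):
#   arm-linux-gnueabihf-objdump -d ./benchdarknet | count_neon_lines
```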