Open lovemory opened 4 years ago
<< Vulkan Device - Adreno (TM) 506 >>
Device Name: Adreno (TM) 506
Device Type: Integrated GPU
Device UUID: 25-E7-14-03-43-51-00-00-00-00-00-06-00-05-00-00
Device ID: 00005143-05000600
Memory Size: 2896740 KB
Max 1D Image Size: 16384
Max 2D Image Size: 16384 x 16384
Max 3D Image Size: 2048 x 2048 x 2048
Max Cube Image Size: 16384 x 16384
Max Image Layers: 2048
Max Texel Buffer Elements: 65536
Max Uniform Buffer Range: 65536
Max Storage Buffer Range: 2147483647
Max Push Constants Size: 128 bytes
Max Memory Allocation Count: 4096
Max Sampler Allocation Count: 4000
Buffer Image Granularity: 1 byte
Max Bound Descriptor Sets: 4
Max Per-Stage Descriptor Samplers: 16
Max Per-Stage Descriptor Uniform Buffers: 14
Max Per-Stage Descriptor Storage Buffers: 24
Max Per-Stage Descriptor Sampled Images: 128
Max Per-Stage Descriptor Storage Images: 4
Max Per-Stage Descriptor Input Attachments: 8
Max Per-Stage Resources: 158
Max Descriptor Set Samplers: 96
Max Descriptor Set Uniform Buffers: 84
Max Descriptor Set Dynamic Uniform Buffers: 8
Max Descriptor Set Storage Buffers: 24
Max Descriptor Set Dynamic Storage Buffers: 4
Max Descriptor Set Sampled Images: 768
Max Descriptor Set Storage Images: 24
Max Descriptor Set Input Attachments: 8
Max Vertex Input Attributes: 32
Max Vertex Input Bindings: 32
Max Vertex Input Attribute Offset: 4096
Max Vertex Input Binding Stride: 2048
Max Vertex Output Components: 128
Max Fragment Input Components: 128
Max Fragment Output Attachments: 8
Max Fragment DualSrc Attachments: 1
Max Fragment Combined Output Resources: 72
Max Compute Shared Memory Size: 32 KB
Max Compute Work Group Count: X: 65535, Y: 65535, Z: 65535
Max Compute Work Group Invocations: 512
Max Compute Work Group Size: X: 1024, Y: 1024, Z: 64
Subpixel Precision Bits: 4
Subtexel Precision Bits: 8
Mipmap Precision Bits: 8
Max Draw Indexed Index Value: 4294967295
Max Draw Indirect Count: 4294967295
Max Sampler LOD Bias: 15.996094
Max Sampler Anisotropy: 16.000000
Max Viewports: 1
Max Viewport Size: 16384 x 16384
Viewport Bounds Range: -32768.000000 ... 32767.000000
Min Memory Map Alignment: 64 bytes
Min Texel Buffer Offset Alignment: 64 bytes
Min Uniform Buffer Offset Alignment: 64 bytes
Min Storage Buffer Offset Alignment: 64 bytes
Min / Max Texel Offset: -8 / 7
Min / Max Texel Gather Offset: -32 / 31
Min / Max Interpolation Offset: -0.500000 / 0.437500
Subpixel Interpolation Offset Bits: 4
Max Framebuffer Size: 16384 x 16384
Max Framebuffer Layers: 2048
Framebuffer Color Sample Counts: 0x00000007
Framebuffer Depth Sample Counts: 0x00000007
Framebuffer Stencil Sample Counts: 0x00000007
Framebuffer No Attachments Sample Counts: 0x00000007
Max Color Attachments: 8
Sampled Image Color Sample Counts: 0x00000007
Sampled Image Integer Sample Counts: 0x00000007
Sampled Image Depth Sample Counts: 0x00000007
Sampled Image Stencil Sample Counts: 0x00000007
Storage Image Sample Counts: 0x00000001
Max Sample Mask Words: 1
Timestamp Period: 52.083332 ns
Max Clip Distances: 8
Max Cull Distances: 8
Max Combined Clip and Cull Distances: 8
Discrete Queue Priorities: 3
Point Size Range: 1.000000 ... 1023.000000
Line Width Range: 1.000000 ... 1.000000
Point Size Granularity: 0.062500
Optimal Buffer Copy Offset Alignment: 64 bytes
Optimal Buffer Copy Row Pitch Alignment: 64 bytes
Non-Coherent Atom Size: 1 byte
API Version: 1.1.66
Vulkan Library: /system/lib64/libvulkan.so
Alpha To One: Supported
Anisotropic Filtering: Supported
ASTC LDR Texture Compression: Supported
BC Texture Compression: Not Supported
Depth Bias Clamping: Supported
Depth Bounds Tests: Not Supported
Depth Clamping: Supported
Draw Indirect First Instance: Not Supported
Dual Source Blend Operations: Supported
ETC2 and EAC Texture Compression: Supported
Fragment Stores and Atomics: Supported
Full Draw Index Uint32: Supported
Geometry Shader: Not Supported
Image Cube Array: Supported
Independent Blend: Supported
Inherited Queries: Supported
Large Points: Supported
Logic Operations: Not Supported
Multi-Draw Indirect: Supported
Multi Viewport: Not Supported
Occlusion Query Precise: Not Supported
Pipeline Statistics Query: Not Supported
Point and Wireframe Fill Modes: Supported
Robust Buffer Access: Supported
Sample Rate Shading: Supported
Shader Clip Distance: Supported
Shader Cull Distance: Supported
Shader Float64: Not Supported
Shader Image Gather Extended: Supported
Shader Int16: Supported
Shader Int64: Not Supported
Shader Resource Min LOD: Not Supported
Shader Resource Residency: Not Supported
Shader Sampled Image Array Dynamic Indexing: Supported
Shader Storage Buffer Array Dynamic Indexing: Supported
Shader Storage Image Array Dynamic Indexing: Supported
Shader Storage Image Extended Formats: Not Supported
Shader Storage Image Multisample: Not Supported
Shader Storage Image Read Without Format: Not Supported
Shader Storage Image Write Without Format: Supported
Shader Tessellation and Geometry Point Size: Not Supported
Shader Uniform Buffer Array Dynamic Indexing: Supported
Sparse Binding: Not Supported
Sparse Residency 2 Samples: Not Supported
Sparse Residency 4 Samples: Not Supported
Sparse Residency 8 Samples: Not Supported
Sparse Residency 16 Samples: Not Supported
Sparse Residency Aliased: Not Supported
Sparse Residency Aligned Mip Size: No
Sparse Residency Buffer: Not Supported
Sparse Residency Image 2D: Not Supported
Sparse Residency Image 3D: Not Supported
Sparse Residency Non-Resident Strict: No
Sparse Residency Standard 2D Block Shape: No
Sparse Residency Standard 2D Multisample Block Shape: No
Sparse Residency Standard 3D Block Shape: No
Standard Sample Locations: Yes
Strict Line Rasterization: Yes
Tessellation Shader: Not Supported
Timestamps on All Graphics and Compute Queues: Supported
Variable Multisample Rate: Not Supported
Vertex Pipeline Stores and Atomics: Supported
Wide Lines: Not Supported

Device Extensions:
VK_KHR_incremental_present
VK_KHR_shared_presentable_image
VK_GOOGLE_display_timing
VK_KHR_swapchain
VK_KHR_maintenance1
VK_KHR_maintenance2
VK_KHR_maintenance3
VK_KHR_multiview
VK_KHR_variable_pointers
VK_KHR_storage_buffer_storage_class
VK_KHR_relaxed_block_layout
VK_KHR_get_memory_requirements2
VK_KHR_dedicated_allocation
VK_KHR_external_memory
VK_KHR_external_memory_fd
VK_KHR_external_semaphore
VK_KHR_external_semaphore_fd
VK_KHR_external_fence
VK_KHR_external_fence_fd
VK_KHR_sampler_ycbcr_conversion
VK_KHR_bind_memory2
VK_KHR_shader_draw_parameters
VK_KHR_push_descriptor
VK_KHR_descriptor_update_template
VK_KHR_sampler_mirror_clamp_to_edge
VK_ANDROID_external_memory_android_hardware_buffer
VK_KHR_device_group
VK_EXT_sampler_filter_minmax

Instance Extensions:
VK_KHR_surface
VK_KHR_android_surface
VK_EXT_swapchain_colorspace
VK_KHR_get_surface_capabilities2
VK_EXT_debug_report
VK_KHR_get_physical_device_properties2
VK_KHR_external_memory_capabilities
VK_KHR_external_semaphore_capabilities
VK_KHR_external_fence_capabilities
VK_KHR_device_group_creation
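The sample-count fields in the dump (for example `0x00000007` and `0x00000001`) are bitmasks, where each set bit stands for one supported samples-per-pixel value (bit 0 for 1 sample, bit 1 for 2 samples, and so on up to 64). The helper below is a small illustrative sketch for reading them; the function name `decode_sample_counts` is made up for this example and is not part of Vulkan or ncnn.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: expand a Vulkan sample-count bitmask into the list of
// supported sample counts. Bit i set means 2^i samples per pixel is supported.
// Vulkan defines seven such bits (1, 2, 4, 8, 16, 32, 64 samples).
std::vector<int> decode_sample_counts(std::uint32_t mask)
{
    std::vector<int> counts;
    for (int bit = 0; bit < 7; ++bit)
    {
        if (mask & (1u << bit))
            counts.push_back(1 << bit);
    }
    return counts;
}
```

Applied to the values in the dump, `decode_sample_counts(0x00000007)` gives 1, 2 and 4 samples for framebuffer and sampled-image attachments, and `decode_sample_counts(0x00000001)` gives only 1 sample for storage images.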
Issue description: With `yolov4->opt.use_vulkan_compute = true;` on line 47, detection gives wrong results. With `yolov4->opt.use_vulkan_compute = false;` on line 47, detection is OK.
The C++ code is copied from ncnn/examples/yolov4.cpp, with some lines modified to support *.img input. The code is:
`// Tencent is pleased to support the open source community by making ncnn available.
//
// Copyright (C) 2020 THL A29 Limited, a Tencent company. All rights reserved.
//
// Licensed under the BSD 3-Clause License (the "License"); you may not use this file except
// in compliance with the License. You may obtain a copy of the License at
//
// https://opensource.org/licenses/BSD-3-Clause
//
// Unless required by applicable law or agreed to in writing, software distributed
// under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
// CONDITIONS OF ANY KIND, either express or implied. See the License for the
// specific language governing permissions and limitations under the License.
#include "net.h"
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include
#include
#define NCNN_PROFILING
#define YOLOV4_TINY //Using yolov4_tiny, if undef, using original yolov4
#ifdef NCNN_PROFILING
#include "benchmark.h"
#endif
struct Object
{
    cv::Rect_<float> rect;
    int label;
    float prob;
};
static int init_yolov4(ncnn::Net* yolov4, int* target_size)
{
    /* --> Set the params you need for the ncnn inference <-- */
#ifdef YOLOV4_TINY
#else
#endif
}
static int detect_yolov4(const cv::Mat& bgr, std::vector
} static int frame_count_index= 0; static int draw_objects(const cv::Mat& bgr, const std::vector
}
int main(int argc, char** argv) { cv::Mat frame; std::vector
#ifdef NCNN_PROFILING
#endif
#ifdef NCNN_PROFILING
#endif
#ifdef NCNN_PROFILING
#endif
#ifdef NCNN_PROFILING
#endif
#ifdef NCNN_PROFILING
#endif
#ifdef NCNN_PROFILING
#endif
}
**Android.mk is:**
LOCAL_PATH := $(call my-dir)
$(warning "the value of LOCAL_PATH is $(LOCAL_PATH)")

include $(CLEAR_VARS)
LOCAL_MODULE := opencv-core-prebuilt
LOCAL_SRC_FILES := ../../../../extern/lib/opencv_4.4.0_opencl/arm64-v8a/libopencv_core.a
include $(PREBUILT_STATIC_LIBRARY)

include $(CLEAR_VARS)
LOCAL_MODULE := opencv-imgcodecs-prebuilt
LOCAL_SRC_FILES := ../../../../extern/lib/opencv_4.4.0_opencl/arm64-v8a/libopencv_imgcodecs.a
include $(PREBUILT_STATIC_LIBRARY)

include $(CLEAR_VARS)
LOCAL_MODULE := opencv-imgproc-prebuilt
LOCAL_SRC_FILES := ../../../../extern/lib/opencv_4.4.0_opencl/arm64-v8a/libopencv_imgproc.a
include $(PREBUILT_STATIC_LIBRARY)

include $(CLEAR_VARS)
LOCAL_MODULE := opencv-ittnotify-prebuilt
LOCAL_SRC_FILES := ../../../../extern/lib/opencv_4.4.0_opencl/3rdparty/libs/arm64-v8a/libittnotify.a
include $(PREBUILT_STATIC_LIBRARY)

include $(CLEAR_VARS)
LOCAL_MODULE := opencv-tbb-prebuilt
LOCAL_SRC_FILES := ../../../../extern/lib/opencv_4.4.0_opencl/3rdparty/libs/arm64-v8a/libtbb.a
include $(PREBUILT_STATIC_LIBRARY)

include $(CLEAR_VARS)
LOCAL_MODULE := webp-prebuilt
LOCAL_SRC_FILES := ../../../../extern/lib/opencv_4.4.0_opencl/3rdparty/libs/arm64-v8a/liblibwebp.a
include $(PREBUILT_STATIC_LIBRARY)

include $(CLEAR_VARS)
LOCAL_MODULE := opencv-IlmImf-prebuilt
LOCAL_SRC_FILES := ../../../../extern/lib/opencv_4.4.0_opencl/3rdparty/libs/arm64-v8a/libIlmImf.a
include $(PREBUILT_STATIC_LIBRARY)

include $(CLEAR_VARS)
LOCAL_MODULE := opencv-jpeg-prebuilt
LOCAL_SRC_FILES := ../../../../extern/lib/opencv_4.4.0_opencl/3rdparty/libs/arm64-v8a/liblibjpeg-turbo.a
include $(PREBUILT_STATIC_LIBRARY)

include $(CLEAR_VARS)
LOCAL_MODULE := opencv-jasper-prebuilt
LOCAL_SRC_FILES := ../../../../extern/lib/opencv_4.4.0_opencl/3rdparty/libs/arm64-v8a/liblibjasper.a
include $(PREBUILT_STATIC_LIBRARY)

include $(CLEAR_VARS)
LOCAL_MODULE := opencv-png-prebuilt
LOCAL_SRC_FILES := ../../../../extern/lib/opencv_4.4.0_opencl/3rdparty/libs/arm64-v8a/liblibpng.a
include $(PREBUILT_STATIC_LIBRARY)

include $(CLEAR_VARS)
LOCAL_MODULE := opencv-tiff-prebuilt
LOCAL_SRC_FILES := ../../../../extern/lib/opencv_4.4.0_opencl/3rdparty/libs/arm64-v8a/liblibtiff.a
include $(PREBUILT_STATIC_LIBRARY)

include $(CLEAR_VARS)
LOCAL_MODULE := opencv-tegra-prebuilt
LOCAL_SRC_FILES := ../../../../extern/lib/opencv_4.4.0_opencl/3rdparty/libs/arm64-v8a/libtegra_hal.a
include $(PREBUILT_STATIC_LIBRARY)

include $(CLEAR_VARS)
LOCAL_MODULE := opencv-dnn-prebuilt
LOCAL_SRC_FILES := ../../../../extern/lib/opencv_4.4.0_opencl/arm64-v8a/libopencv_dnn.a
include $(PREBUILT_STATIC_LIBRARY)

include $(CLEAR_VARS)
LOCAL_MODULE := opencv-protobuf-prebuilt
LOCAL_SRC_FILES := ../../../../extern/lib/opencv_4.4.0_opencl/3rdparty/libs/arm64-v8a/liblibprotobuf.a
include $(PREBUILT_STATIC_LIBRARY)

include $(CLEAR_VARS)
LOCAL_MODULE := opencv-ximgproc-prebuilt
LOCAL_SRC_FILES := ../../../../extern/lib/opencv_4.4.0_opencl/arm64-v8a/libopencv_ximgproc.a
include $(PREBUILT_STATIC_LIBRARY)

include $(CLEAR_VARS)
LOCAL_MODULE := libncnn-prebuilt
LOCAL_SRC_FILES := ../../../../extern/ncnn/arm64-v8a/libncnn.a
include $(PREBUILT_STATIC_LIBRARY)

include $(CLEAR_VARS)
LOCAL_MODULE := glslang-prebuilt
LOCAL_SRC_FILES := ../../../../extern/ncnn/arm64-v8a/libglslang.a
include $(PREBUILT_STATIC_LIBRARY)

include $(CLEAR_VARS)
LOCAL_MODULE := SPIRV-prebuilt
LOCAL_SRC_FILES := ../../../../extern/ncnn/arm64-v8a/libSPIRV.a
include $(PREBUILT_STATIC_LIBRARY)

include $(CLEAR_VARS)
LOCAL_MODULE := OGLCompiler-prebuilt
LOCAL_SRC_FILES := ../../../../extern/ncnn/arm64-v8a/libOGLCompiler.a
include $(PREBUILT_STATIC_LIBRARY)

include $(CLEAR_VARS)
LOCAL_MODULE := OSDependent-prebuilt
LOCAL_SRC_FILES := ../../../../extern/ncnn/arm64-v8a/libOSDependent.a
include $(PREBUILT_STATIC_LIBRARY)

include $(CLEAR_VARS)
LOCAL_MODULE := ncnn_yolov4tiny_test
LOCAL_CFLAGS := -Werror -Wno-write-strings -Dsupportneon -DDEBUG -D_DEBUGPROCTIME -DFCW_ENABLE_YOLO4TINY
LOCAL_LDFLAGS := -pie -fPIE

LOCAL_CFLAGS += -fopenmp
LOCAL_CPPFLAGS += -fopenmp
LOCAL_LDFLAGS += -fopenmp

LOCAL_SRC_FILES += ../adas_vehicle_det_ncnn_yolov4.cpp
LOCAL_C_INCLUDES += ${LOCAL_PATH}/../../../../extern/include/opencv_4.4.0
LOCAL_C_INCLUDES += ${LOCAL_PATH}/../../../../extern/ncnn/include/ncnn

LOCAL_LDLIBS := -lz -llog

LOCAL_LDLIBS := -lz -llog -ljnigraphics -lvulkan -landroid
CXXFLAGS := -D_GLIBCXX_DEBUG -O2
LOCAL_STATIC_LIBRARIES += opencv-dnn-prebuilt
LOCAL_STATIC_LIBRARIES += opencv-imgcodecs-prebuilt
LOCAL_STATIC_LIBRARIES += opencv-ximgproc-prebuilt
LOCAL_STATIC_LIBRARIES += opencv-imgproc-prebuilt
LOCAL_STATIC_LIBRARIES += opencv-core-prebuilt
LOCAL_STATIC_LIBRARIES += opencv-protobuf-prebuilt
LOCAL_STATIC_LIBRARIES += opencv-ittnotify-prebuilt
LOCAL_STATIC_LIBRARIES += opencv-tbb-prebuilt
LOCAL_STATIC_LIBRARIES += opencv-IlmImf-prebuilt
LOCAL_STATIC_LIBRARIES += opencv-jasper-prebuilt
LOCAL_STATIC_LIBRARIES += opencv-jpeg-prebuilt
LOCAL_STATIC_LIBRARIES += opencv-png-prebuilt
LOCAL_STATIC_LIBRARIES += opencv-tiff-prebuilt
LOCAL_STATIC_LIBRARIES += opencv-tegra-prebuilt
LOCAL_STATIC_LIBRARIES += webp-prebuilt
LOCAL_STATIC_LIBRARIES += libncnn-prebuilt
LOCAL_STATIC_LIBRARIES += glslang-prebuilt
LOCAL_STATIC_LIBRARIES += SPIRV-prebuilt
LOCAL_STATIC_LIBRARIES += OGLCompiler-prebuilt
LOCAL_STATIC_LIBRARIES += OSDependent-prebuilt
include $(BUILD_EXECUTABLE)
Application.mk is:
APP_ABI := arm64-v8a
APP_STL := c++_shared
APP_CPPFLAGS := -frtti -fexceptions
APP_PLATFORM := android-24
Test output:

Line 47: yolov4->opt.use_vulkan_compute = false; detection is OK.
mercury:/data/local/tmp/200811120236 $ ./ncnn_yolov4tiny_test input_image/input_img_00001.jpg
NCNN Init time 6174.19ms
file: input_image/input_img_00001.jpg i = 1 start
out.h = 1
NCNN detection time 650.85ms
8 = 0.41767 at 1249.40 19.08 651.86 x 740.22
NCNN OpenCV draw result time 308.95ms
file: input_image/input_img_00001.jpg i = 1 end

Line 47: yolov4->opt.use_vulkan_compute = true; detection gives wrong results.
mercury:/data/local/tmp/200811120236 $ ./ncnn_yolov4tiny_test input_image/input_img_00001.jpg
[0 Adreno (TM) 506] queueC=0[3] queueG=0[3] queueT=0[3]
[0 Adreno (TM) 506] bugsbn1=1 buglbia=0 bugcopc=0 bugihfa=0
[0 Adreno (TM) 506] fp16p=1 fp16s=0 fp16a=0 int8s=0 int8a=0
NCNN Init time 11329.38ms
file: input_image/input_img_00001.jpg i = 1 start
out.h = 1160
NCNN detection time 1696.75ms
1 = 0.33333 at 915.46 738.17 162.94 x 143.67
1 = 0.33333 at 841.61 738.17 162.94 x 143.67
1 = 0.33333 at 1063.15 738.17 162.94 x 143.67
1 = 0.33333 at 989.30 738.17 162.94 x 143.67
1 = 0.33333 at 767.76 738.17 162.94 x 143.67
1 = 0.33333 at 546.22 738.17 162.94 x 143.67
1 = 0.33333 at 472.38 738.17 162.94 x 143.67
1 = 0.33333 at 693.92 738.17 162.94 x 143.67
1 = 0.33333 at 620.07 738.17 162.94 x 143.67
1 = 0.33333 at 1136.99 738.17 162.94 x 143.67
1 = 0.33333 at 1653.92 738.17 162.94 x 143.67
1 = 0.33333 at 1580.07 738.17 162.94 x 143.67
1 = 0.33333 at 1801.61 738.17 162.94 x 143.67
1 = 0.33333 at 1727.76 738.17 162.94 x 143.67
1 = 0.33333 at 1506.22 738.17 162.94 x 143.67
1 = 0.33333 at 1284.69 738.17 162.94 x 143.67
1 = 0.33333 at 1210.84 738.17 162.94 x 143.67
1 = 0.33333 at 1432.38 738.17 162.94 x 143.67
1 = 0.33333 at 1358.53 738.17 162.94 x 143.67
`
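In the Vulkan run every detection shares the same label, score (0.33333) and box size (162.94 x 143.67), while the CPU run reports a single distinct box. To flag that pattern automatically when comparing the two modes, one option is to parse the printed lines and test for it. The sketch below is only an illustration under assumptions: the `Detection` struct and the functions `parse_detection` and `looks_degenerate` are hypothetical names, not part of ncnn, and the line format is inferred from the logs above (`<label> = <prob> at <x> <y> <w> x <h>`).

```cpp
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical record for one printed detection line.
struct Detection
{
    int label;
    float prob;
    float x, y, w, h;
};

// Parse a line of the form "<label> = <prob> at <x> <y> <w> x <h>".
// Returns true only if all six fields were read.
bool parse_detection(const std::string& line, Detection& d)
{
    return std::sscanf(line.c_str(), "%d = %f at %f %f %f x %f",
                       &d.label, &d.prob, &d.x, &d.y, &d.w, &d.h) == 6;
}

// Heuristic check: several detections that all share the same score and the
// same box size are unlikely to be real objects, so treat them as suspect.
// min_count and eps are arbitrary defaults chosen for this example.
bool looks_degenerate(const std::vector<Detection>& dets,
                      std::size_t min_count = 3, float eps = 1e-3f)
{
    if (dets.size() < min_count)
        return false;
    const Detection& first = dets.front();
    for (const Detection& d : dets)
    {
        if (std::fabs(d.prob - first.prob) > eps ||
            std::fabs(d.w - first.w) > eps ||
            std::fabs(d.h - first.h) > eps)
            return false;
    }
    return true;
}
```

Feeding the 19 lines from the Vulkan log through `parse_detection` and then `looks_degenerate` would mark that run as suspect, whereas the single CPU detection would not be flagged. This only detects the symptom; it does not say which layer or option causes the mismatch.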