alibaba / MNN

MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba
http://www.mnn.zone/

How can I tell whether OpenCL is actually being used on Windows? #1463

Closed jnulzl closed 3 years ago

jnulzl commented 3 years ago

Problem description:

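On Windows, I set MNNForwardType to MNN_FORWARD_OPENCL and to MNN_FORWARD_CPU, and the speed is basically the same in both cases.
With MNN_FORWARD_OPENCL I also do not get an error like the one below: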

Can't Find type=7 backend, use 0 instead // this error is reported when set to MNN_FORWARD_VULKAN

Could it be that even with MNNForwardType set to MNN_FORWARD_OPENCL, inference is still running on the CPU?
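One rough way to check for a silent fallback is to capture the program's console output and scan it for the message quoted above. The sketch below is a hypothetical helper (`parseBackendFallback` and `BackendFallback` are made-up names, not part of MNN) and assumes the message keeps exactly the format shown in this issue, where 7 is the requested type (MNN_FORWARD_VULKAN here) and 0 is the type used instead. If the message is absent, that only shows that no fallback was logged; it does not prove the GPU path is faster.

```cpp
#include <cstdio>
#include <string>

// Result of scanning one log line for the backend-fallback message.
struct BackendFallback {
    bool found;       // true if the line contains the fallback message
    int requested;    // forward type that was requested (7 in the issue's example)
    int substituted;  // forward type used instead (0 in the issue's example)
};

// Hypothetical helper (not an MNN API): looks for
// "Can't Find type=<N> backend, use <M> instead" anywhere in `line`.
BackendFallback parseBackendFallback(const std::string& line) {
    BackendFallback result = {false, -1, -1};
    const std::string marker = "Can't Find type=";
    const std::string::size_type pos = line.find(marker);
    if (pos == std::string::npos) {
        return result;  // message not present in this line
    }
    int requested = 0;
    int substituted = 0;
    // Parse the two integers starting at the marker position.
    if (std::sscanf(line.c_str() + pos,
                    "Can't Find type=%d backend, use %d instead",
                    &requested, &substituted) == 2) {
        result.found = true;
        result.requested = requested;
        result.substituted = substituted;
    }
    return result;
}
```

Fed the line quoted above, the helper reports `requested == 7` and `substituted == 0`; a line without the message leaves `found` set to `false`.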

MNN 1.1.6, built on Win10 with VS2019 version 16.9.3

For the build steps, I followed the guide here

The GPU information reported by GPU Caps Viewer is as follows:

===================================================
GPU Caps Viewer v1.50.1.0 report
http://www.geeks3d.com
http://www.ozone3d.net/gpu_caps_viewer/
===================================================

===================================[ System / CPU ]
- CPU Name: Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
- CPU Core Speed: 3600 MHz
- CPU logical cores: 8
- Family: 6 - Model: 14 - Stepping: 9
- Physical Memory Size: 16384 MB
- Operating System: Windows 10 64-bit build 17763
- PhysX Version: drivers not installed

===================================[ Graphics Adapters / GPUs ]
- Current Display Mode: 1920x1080 @ 60 Hz - 32 bpp
- Num GPUs: 1

- GPU 1
  - Name: AMD Radeon R7 200 Series
  - GPU codename: Oland
  - Device ID: 1002-6611
  - Subdevice ID: 1462-3371
  - Revision ID:   87
  - Driver: Adrenalin 2020 21.4.1 (27.20.21002.112)
  - Bus Id: 1
  - Shader cores: 384
  - Texture units: 24
  - ROP units: 8
  - TDP: 50W
  - BIOS version: 015.049.000.017/113-EXT19246-001 (2017/11/21 05:17)
  - Memory size: 2048MB
  - Memory type: GDDR5
  - Memory bus width: 0-bit
  - GPU clock: 300 MHz, memory: 300 MHz, VDDC: 0.800V
  - GPU clock: 780 MHz, memory: 1100 MHz, VDDC: 1.150V
  - Radeon PowerTune info:
    - GPU power min: 80 % TDP
    - GPU power max: 120 % TDP

===================================[ OpenGL GPU Capabilities ]
- GL_VENDOR: ATI Technologies Inc.
- GL_RENDERER: AMD Radeon R7 430
- GL_VERSION: 4.6.14830 Compatibility Profile/Debug Context 21.4.1 27.20.21002.112
- GL_SHADING_LANGUAGE_VERSION: 4.60
- ARB Texture Units: 8
- Vertex Shader Texture Units: 32
- Pixel Shader Texture Units: 32
- Geometry Shader Texture Units: 32
- Max Texture Size: 16384x16384
- Max Anisotropic Filtering Value: X16.0
- Max Point Sprite Size: 8192.0
- Max Dynamic Lights: 8
- Max Viewport Size: 16384x16384
- Max Vertex Uniform Components: 16384
- Max Fragment Uniform Components: 16384
- Max Geometry Uniform Components: 16384
- Max Varying Float: 128
- Max Vertex Bindable Uniforms: 15
- Max Fragment Bindable Uniforms: 15
- Max Geometry Bindable Uniforms: 15
- Frame Buffer Objects (FBO) Support:[yes]
- Multiple Render Targets / Max draw buffers: 8
- Pixel Buffer Objects (PBO) Support:[yes]
- S3TC Texture Compression Support:[yes]
- ATI 3Dc Texture Compression Support:[yes]
- Texture Rectangle Support:[yes]
- Floating Point Textures Support:[yes]
- MSAA: 2X
- MSAA: 4X
- MSAA: 8X
- OpenGL Extensions: 324 extensions (GL=299 and WGL=25)
  - GL_AMDX_debug_output
  - GL_AMD_blend_minmax_factor
  - GL_AMD_conservative_depth
  - GL_AMD_debug_output
  - GL_AMD_depth_clamp_separate
  - GL_AMD_draw_buffers_blend
  - GL_AMD_framebuffer_sample_positions
  - GL_AMD_gcn_shader
  - GL_AMD_gpu_shader_int64
  - GL_AMD_interleaved_elements
  - GL_AMD_multi_draw_indirect
  - GL_AMD_name_gen_delete
  - GL_AMD_performance_monitor
  - GL_AMD_pinned_memory
  - GL_AMD_query_buffer_object
  - GL_AMD_sample_positions
  - GL_AMD_seamless_cubemap_per_texture
  - GL_AMD_shader_atomic_counter_ops
  - GL_AMD_shader_stencil_export
  - GL_AMD_shader_stencil_value_export
  - GL_AMD_shader_trace
  - GL_AMD_shader_trinary_minmax
  - GL_AMD_sparse_texture
  - GL_AMD_stencil_operation_extended
  - GL_AMD_texture_cube_map_array
  - GL_AMD_texture_texture4
  - GL_AMD_transform_feedback3_lines_triangles
  - GL_AMD_transform_feedback4
  - GL_AMD_vertex_shader_layer
  - GL_AMD_vertex_shader_viewport_index
  - GL_ARB_ES2_compatibility
  - GL_ARB_ES3_1_compatibility
  - GL_ARB_ES3_compatibility
  - GL_ARB_arrays_of_arrays
  - GL_ARB_base_instance
  - GL_ARB_bindless_texture
  - GL_ARB_blend_func_extended
  - GL_ARB_buffer_storage
  - GL_ARB_clear_buffer_object
  - GL_ARB_clear_texture
  - GL_ARB_clip_control
  - GL_ARB_color_buffer_float
  - GL_ARB_compatibility
  - GL_ARB_compressed_texture_pixel_storage
  - GL_ARB_compute_shader
  - GL_ARB_conditional_render_inverted
  - GL_ARB_conservative_depth
  - GL_ARB_copy_buffer
  - GL_ARB_copy_image
  - GL_ARB_cull_distance
  - GL_ARB_debug_output
  - GL_ARB_depth_buffer_float
  - GL_ARB_depth_clamp
  - GL_ARB_depth_texture
  - GL_ARB_derivative_control
  - GL_ARB_direct_state_access
  - GL_ARB_draw_buffers
  - GL_ARB_draw_buffers_blend
  - GL_ARB_draw_elements_base_vertex
  - GL_ARB_draw_indirect
  - GL_ARB_draw_instanced
  - GL_ARB_enhanced_layouts
  - GL_ARB_explicit_attrib_location
  - GL_ARB_explicit_uniform_location
  - GL_ARB_fragment_coord_conventions
  - GL_ARB_fragment_layer_viewport
  - GL_ARB_fragment_program
  - GL_ARB_fragment_program_shadow
  - GL_ARB_fragment_shader
  - GL_ARB_framebuffer_no_attachments
  - GL_ARB_framebuffer_object
  - GL_ARB_framebuffer_sRGB
  - GL_ARB_geometry_shader4
  - GL_ARB_get_program_binary
  - GL_ARB_get_texture_sub_image
  - GL_ARB_gl_spirv
  - GL_ARB_gpu_shader5
  - GL_ARB_gpu_shader_fp64
  - GL_ARB_half_float_pixel
  - GL_ARB_half_float_vertex
  - GL_ARB_imaging
  - GL_ARB_indirect_parameters
  - GL_ARB_instanced_arrays
  - GL_ARB_internalformat_query
  - GL_ARB_internalformat_query2
  - GL_ARB_invalidate_subdata
  - GL_ARB_map_buffer_alignment
  - GL_ARB_map_buffer_range
  - GL_ARB_multi_bind
  - GL_ARB_multi_draw_indirect
  - GL_ARB_multisample
  - GL_ARB_multitexture
  - GL_ARB_occlusion_query
  - GL_ARB_occlusion_query2
  - GL_ARB_parallel_shader_compile
  - GL_ARB_pipeline_statistics_query
  - GL_ARB_pixel_buffer_object
  - GL_ARB_point_parameters
  - GL_ARB_point_sprite
  - GL_ARB_polygon_offset_clamp
  - GL_ARB_program_interface_query
  - GL_ARB_provoking_vertex
  - GL_ARB_query_buffer_object
  - GL_ARB_robust_buffer_access_behavior
  - GL_ARB_sample_shading
  - GL_ARB_sampler_objects
  - GL_ARB_seamless_cube_map
  - GL_ARB_seamless_cubemap_per_texture
  - GL_ARB_separate_shader_objects
  - GL_ARB_shader_atomic_counter_ops
  - GL_ARB_shader_atomic_counters
  - GL_ARB_shader_ballot
  - GL_ARB_shader_bit_encoding
  - GL_ARB_shader_draw_parameters
  - GL_ARB_shader_group_vote
  - GL_ARB_shader_image_load_store
  - GL_ARB_shader_image_size
  - GL_ARB_shader_objects
  - GL_ARB_shader_precision
  - GL_ARB_shader_stencil_export
  - GL_ARB_shader_storage_buffer_object
  - GL_ARB_shader_subroutine
  - GL_ARB_shader_texture_image_samples
  - GL_ARB_shader_texture_lod
  - GL_ARB_shader_viewport_layer_array
  - GL_ARB_shading_language_100
  - GL_ARB_shading_language_420pack
  - GL_ARB_shading_language_include
  - GL_ARB_shading_language_packing
  - GL_ARB_shadow
  - GL_ARB_shadow_ambient
  - GL_ARB_sparse_buffer
  - GL_ARB_sparse_texture
  - GL_ARB_spirv_extensions
  - GL_ARB_stencil_texturing
  - GL_ARB_sync
  - GL_ARB_tessellation_shader
  - GL_ARB_texture_barrier
  - GL_ARB_texture_border_clamp
  - GL_ARB_texture_buffer_object
  - GL_ARB_texture_buffer_object_rgb32
  - GL_ARB_texture_buffer_range
  - GL_ARB_texture_compression
  - GL_ARB_texture_compression_bptc
  - GL_ARB_texture_compression_rgtc
  - GL_ARB_texture_cube_map
  - GL_ARB_texture_cube_map_array
  - GL_ARB_texture_env_add
  - GL_ARB_texture_env_combine
  - GL_ARB_texture_env_crossbar
  - GL_ARB_texture_env_dot3
  - GL_ARB_texture_float
  - GL_ARB_texture_gather
  - GL_ARB_texture_mirror_clamp_to_edge
  - GL_ARB_texture_mirrored_repeat
  - GL_ARB_texture_multisample
  - GL_ARB_texture_non_power_of_two
  - GL_ARB_texture_query_levels
  - GL_ARB_texture_query_lod
  - GL_ARB_texture_rectangle
  - GL_ARB_texture_rg
  - GL_ARB_texture_rgb10_a2ui
  - GL_ARB_texture_snorm
  - GL_ARB_texture_stencil8
  - GL_ARB_texture_storage
  - GL_ARB_texture_storage_multisample
  - GL_ARB_texture_swizzle
  - GL_ARB_texture_view
  - GL_ARB_timer_query
  - GL_ARB_transform_feedback2
  - GL_ARB_transform_feedback3
  - GL_ARB_transform_feedback_instanced
  - GL_ARB_transform_feedback_overflow_query
  - GL_ARB_transpose_matrix
  - GL_ARB_uniform_buffer_object
  - GL_ARB_vertex_array_bgra
  - GL_ARB_vertex_array_object
  - GL_ARB_vertex_attrib_64bit
  - GL_ARB_vertex_attrib_binding
  - GL_ARB_vertex_buffer_object
  - GL_ARB_vertex_program
  - GL_ARB_vertex_shader
  - GL_ARB_vertex_type_10f_11f_11f_rev
  - GL_ARB_vertex_type_2_10_10_10_rev
  - GL_ARB_viewport_array
  - GL_ARB_window_pos
  - GL_ATI_draw_buffers
  - GL_ATI_envmap_bumpmap
  - GL_ATI_fragment_shader
  - GL_ATI_separate_stencil
  - GL_ATI_texture_compression_3dc
  - GL_ATI_texture_env_combine3
  - GL_ATI_texture_float
  - GL_ATI_texture_mirror_once
  - GL_EXT_abgr
  - GL_EXT_bgra
  - GL_EXT_bindable_uniform
  - GL_EXT_blend_color
  - GL_EXT_blend_equation_separate
  - GL_EXT_blend_func_separate
  - GL_EXT_blend_minmax
  - GL_EXT_blend_subtract
  - GL_EXT_compiled_vertex_array
  - GL_EXT_copy_buffer
  - GL_EXT_copy_texture
  - GL_EXT_depth_bounds_test
  - GL_EXT_direct_state_access
  - GL_EXT_draw_buffers2
  - GL_EXT_draw_instanced
  - GL_EXT_draw_range_elements
  - GL_EXT_fog_coord
  - GL_EXT_framebuffer_blit
  - GL_EXT_framebuffer_multisample
  - GL_EXT_framebuffer_object
  - GL_EXT_framebuffer_sRGB
  - GL_EXT_geometry_shader4
  - GL_EXT_gpu_program_parameters
  - GL_EXT_gpu_shader4
  - GL_EXT_histogram
  - GL_EXT_memory_object
  - GL_EXT_memory_object_win32
  - GL_EXT_multi_draw_arrays
  - GL_EXT_packed_depth_stencil
  - GL_EXT_packed_float
  - GL_EXT_packed_pixels
  - GL_EXT_pixel_buffer_object
  - GL_EXT_point_parameters
  - GL_EXT_polygon_offset_clamp
  - GL_EXT_provoking_vertex
  - GL_EXT_rescale_normal
  - GL_EXT_secondary_color
  - GL_EXT_semaphore
  - GL_EXT_semaphore_win32
  - GL_EXT_separate_specular_color
  - GL_EXT_shader_image_load_store
  - GL_EXT_shader_integer_mix
  - GL_EXT_shadow_funcs
  - GL_EXT_stencil_wrap
  - GL_EXT_subtexture
  - GL_EXT_texgen_reflection
  - GL_EXT_texture3D
  - GL_EXT_texture_array
  - GL_EXT_texture_buffer_object
  - GL_EXT_texture_compression_bptc
  - GL_EXT_texture_compression_latc
  - GL_EXT_texture_compression_rgtc
  - GL_EXT_texture_compression_s3tc
  - GL_EXT_texture_cube_map
  - GL_EXT_texture_edge_clamp
  - GL_EXT_texture_env_add
  - GL_EXT_texture_env_combine
  - GL_EXT_texture_env_dot3
  - GL_EXT_texture_filter_anisotropic
  - GL_EXT_texture_integer
  - GL_EXT_texture_lod
  - GL_EXT_texture_lod_bias
  - GL_EXT_texture_mirror_clamp
  - GL_EXT_texture_object
  - GL_EXT_texture_rectangle
  - GL_EXT_texture_sRGB
  - GL_EXT_texture_sRGB_R8
  - GL_EXT_texture_sRGB_RG8
  - GL_EXT_texture_sRGB_decode
  - GL_EXT_texture_shared_exponent
  - GL_EXT_texture_snorm
  - GL_EXT_texture_storage
  - GL_EXT_texture_swizzle
  - GL_EXT_timer_query
  - GL_EXT_transform_feedback
  - GL_EXT_vertex_array
  - GL_EXT_vertex_array_bgra
  - GL_EXT_vertex_attrib_64bit
  - GL_IBM_texture_mirrored_repeat
  - GL_KHR_context_flush_control
  - GL_KHR_debug
  - GL_KHR_no_error
  - GL_KHR_parallel_shader_compile
  - GL_KHR_robust_buffer_access_behavior
  - GL_KHR_robustness
  - GL_KTX_buffer_region
  - GL_NV_alpha_to_coverage_dither_control
  - GL_NV_blend_square
  - GL_NV_conditional_render
  - GL_NV_copy_depth_to_color
  - GL_NV_copy_image
  - GL_NV_depth_buffer_float
  - GL_NV_explicit_multisample
  - GL_NV_float_buffer
  - GL_NV_half_float
  - GL_NV_primitive_restart
  - GL_NV_shader_atomic_int64
  - GL_NV_texgen_reflection
  - GL_NV_texture_barrier
  - GL_OES_EGL_image
  - GL_SGIS_generate_mipmap
  - GL_SGIS_texture_edge_clamp
  - GL_SGIS_texture_lod
  - GL_SUN_multi_draw_arrays
  - GL_WIN_swap_hint
  - WGL_EXT_swap_control
  - WGL_ARB_extensions_string
  - WGL_ARB_pixel_format
  - WGL_ATI_pixel_format_float
  - WGL_ARB_pixel_format_float
  - WGL_ARB_multisample
  - WGL_EXT_swap_control_tear
  - WGL_ARB_pbuffer
  - WGL_ARB_render_texture
  - WGL_ARB_make_current_read
  - WGL_EXT_extensions_string
  - WGL_ARB_buffer_region
  - WGL_EXT_framebuffer_sRGB
  - WGL_EXT_colorspace
  - WGL_ATI_render_texture_rectangle
  - WGL_EXT_pixel_format_packed_float
  - WGL_I3D_genlock
  - WGL_NV_swap_group
  - WGL_ARB_create_context
  - WGL_AMD_gpu_association
  - WGL_ARB_create_context_profile
  - WGL_ARB_context_flush_control
  - WGL_NV_DX_interop
  - WGL_ARB_create_context_no_error
  - WGL_NV_DX_interop2
- OpenGL SPIR-V Extensions: 17
  - SPV_AMD_shader_explicit_vertex_parameter
  - SPV_AMD_shader_trinary_minmax
  - SPV_AMD_gcn_shader
  - SPV_ARB_shader_ballot
  - SPV_ARB_shader_ballot
  - SPV_AMD_gpu_shader_half_float
  - SPV_ARB_shader_draw_parameters
  - SPV_ARB_shader_group_vote
  - SPV_AMD_texture_gather_bias_lod
  - SPV_ARB_shader_storage_buffer_object
  - SPV_AMD_gpu_shader_int16
  - SPV_ARB_post_depth_coverage
  - SPV_ARB_shader_atomic_counter_ops
  - SPV_ARB_shader_stencil_export
  - SPV_AMD_shader_stencil_export
  - SPV_AMD_vertex_shader_viewport_index
  - SPV_ARB_shader_image_load_store
- OpenGL core capabilities: 179 caps listed
  - GL_MAX_LIST_NESTING: 64
  - GL_MAX_EVAL_ORDER: 40
  - GL_MAX_LIGHTS: 8
  - GL_MAX_CLIP_PLANES: 8
  - GL_MAX_TEXTURE_SIZE: 16384
  - GL_MAX_PIXEL_MAP_TABLE: 256
  - GL_MAX_ATTRIB_STACK_DEPTH: 16
  - GL_MAX_MODELVIEW_STACK_DEPTH: 32
  - GL_MAX_NAME_STACK_DEPTH: 64
  - GL_MAX_PROJECTION_STACK_DEPTH: 10
  - GL_MAX_TEXTURE_STACK_DEPTH: 10
  - GL_MAX_VIEWPORT_DIMS: 16384
  - GL_MAX_CLIENT_ATTRIB_STACK_DEPTH: 16
  - GL_MAX_3D_TEXTURE_SIZE: 2048
  - GL_MAX_ELEMENTS_VERTICES: 536870911
  - GL_MAX_ELEMENTS_INDICES: 536870911
  - GL_MAX_TEXTURE_UNITS: 8
  - GL_MAX_CUBE_MAP_TEXTURE_SIZE: 16384
  - GL_MAX_TEXTURE_LOD_BIAS: 16
  - GL_MAX_DRAW_BUFFERS: 8
  - GL_MAX_VERTEX_ATTRIBS: 29
  - GL_MAX_TEXTURE_COORDS: 16
  - GL_MAX_TEXTURE_IMAGE_UNITS: 32
  - GL_MAX_FRAGMENT_UNIFORM_COMPONENTS: 16384
  - GL_MAX_VERTEX_UNIFORM_COMPONENTS: 16384
  - GL_MAX_VARYING_FLOATS: 128
  - GL_MAX_VERTEX_TEXTURE_IMAGE_UNITS: 32
  - GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS: 160
  - GL_MAX_CLIP_DISTANCES: 8
  - GL_MAX_ARRAY_TEXTURE_LAYERS: 2048
  - GL_MAX_VARYING_COMPONENTS: 128
  - GL_MIN_PROGRAM_TEXEL_OFFSET: -8
  - GL_MAX_PROGRAM_TEXEL_OFFSET: 7
  - GL_MAX_TRANSFORM_FEEDBACK_SEPARATE_COMPONENTS: 4
  - GL_MAX_TRANSFORM_FEEDBACK_INTERLEAVED_COMPONENTS: 128
  - GL_MAX_TRANSFORM_FEEDBACK_SEPARATE_ATTRIBS: 4
  - GL_MAX_RENDERBUFFER_SIZE: 16384
  - GL_MAX_COLOR_ATTACHMENTS: 8
  - GL_MAX_SAMPLES: 8
  - GL_MIN_PROGRAM_TEXEL_OFFSET_EXT: -8
  - GL_MAX_PROGRAM_TEXEL_OFFSET_EXT: 7
  - GL_RGBA_FLOAT_MODE_ARB: 0
  - GL_MAX_COLOR_ATTACHMENTS_EXT: 8
  - GL_MAX_RENDERBUFFER_SIZE_EXT: 16384
  - GL_MAX_SAMPLES_EXT: 8
  - GL_RGBA_INTEGER_MODE_EXT: 0
  - GL_MAX_ARRAY_TEXTURE_LAYERS_EXT: 2048
  - GL_MAX_TRANSFORM_FEEDBACK_INTERLEAVED_COMPONENTS_EXT: 128
  - GL_MAX_TRANSFORM_FEEDBACK_SEPARATE_ATTRIBS_EXT: 4
  - GL_MAX_TRANSFORM_FEEDBACK_SEPARATE_COMPONENTS_EXT: 4
  - GL_FRAMEBUFFER_SRGB_CAPABLE_EXT: 1
  - GL_MAX_RECTANGLE_TEXTURE_SIZE: 16384
  - GL_MAX_TEXTURE_BUFFER_SIZE: 268435456
  - GL_MAX_TEXTURE_BUFFER_SIZE_ARB: 268435456
  - GL_MAX_VERTEX_UNIFORM_BLOCKS: 15
  - GL_MAX_GEOMETRY_UNIFORM_BLOCKS: 15
  - GL_MAX_FRAGMENT_UNIFORM_BLOCKS: 15
  - GL_MAX_COMBINED_UNIFORM_BLOCKS: 90
  - GL_MAX_UNIFORM_BUFFER_BINDINGS: 90
  - GL_MAX_UNIFORM_BLOCK_SIZE: 572657868
  - GL_MAX_COMBINED_VERTEX_UNIFORM_COMPONENTS: 2147483389
  - GL_MAX_COMBINED_GEOMETRY_UNIFORM_COMPONENTS: 2147483389
  - GL_MAX_COMBINED_FRAGMENT_UNIFORM_COMPONENTS: 2147483389
  - GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT: 4
  - GL_MAX_GEOMETRY_TEXTURE_IMAGE_UNITS: 32
  - GL_MAX_GEOMETRY_UNIFORM_COMPONENTS: 16384
  - GL_MAX_GEOMETRY_OUTPUT_VERTICES: 1023
  - GL_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS: 4095
  - GL_MAX_VERTEX_OUTPUT_COMPONENTS: 128
  - GL_MAX_GEOMETRY_INPUT_COMPONENTS: 128
  - GL_MAX_GEOMETRY_OUTPUT_COMPONENTS: 128
  - GL_MAX_FRAGMENT_INPUT_COMPONENTS: 128
  - GL_MAX_SERVER_WAIT_TIMEOUT: 2147483647
  - GL_MAX_SAMPLE_MASK_WORDS: 1
  - GL_MAX_COLOR_TEXTURE_SAMPLES: 8
  - GL_MAX_DEPTH_TEXTURE_SAMPLES: 8
  - GL_MAX_INTEGER_SAMPLES: 8
  - GL_PROVOKING_VERTEX: 36430
  - GL_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION: 1
  - GL_MAX_GEOMETRY_TEXTURE_IMAGE_UNITS_ARB: 32
  - GL_MAX_GEOMETRY_VARYING_COMPONENTS_ARB: 128
  - GL_MAX_VERTEX_VARYING_COMPONENTS_ARB: 128
  - GL_MAX_GEOMETRY_UNIFORM_COMPONENTS_ARB: 16384
  - GL_MAX_GEOMETRY_OUTPUT_VERTICES_ARB: 1023
  - GL_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS_ARB: 4095
  - GL_MAX_DUAL_SOURCE_DRAW_BUFFERS: 1
  - GL_FRAGMENT_INTERPOLATION_OFFSET_BITS: 4
  - GL_MIN_SAMPLE_SHADING_VALUE: 0
  - GL_MAX_GEOMETRY_SHADER_INVOCATIONS: 127
  - GL_MIN_FRAGMENT_INTERPOLATION_OFFSET: -1
  - GL_MAX_FRAGMENT_INTERPOLATION_OFFSET: 1
  - GL_MIN_PROGRAM_TEXTURE_GATHER_OFFSET: -32
  - GL_MAX_PROGRAM_TEXTURE_GATHER_OFFSET: 31
  - GL_MAX_PROGRAM_TEXTURE_GATHER_COMPONENTS: 4
  - GL_MAX_SUBROUTINES: 4096
  - GL_MAX_SUBROUTINE_UNIFORM_LOCATIONS: 4096
  - GL_MAX_PATCH_VERTICES: 32
  - GL_MAX_TESS_GEN_LEVEL: 64
  - GL_MAX_TESS_CONTROL_UNIFORM_COMPONENTS: 16384
  - GL_MAX_TESS_EVALUATION_UNIFORM_COMPONENTS: 16384
  - GL_MAX_TESS_CONTROL_TEXTURE_IMAGE_UNITS: 32
  - GL_MAX_TESS_EVALUATION_TEXTURE_IMAGE_UNITS: 32
  - GL_MAX_TESS_CONTROL_OUTPUT_COMPONENTS: 128
  - GL_MAX_TESS_PATCH_COMPONENTS: 120
  - GL_MAX_TESS_CONTROL_TOTAL_OUTPUT_COMPONENTS: 4096
  - GL_MAX_TESS_EVALUATION_OUTPUT_COMPONENTS: 128
  - GL_MAX_TESS_CONTROL_UNIFORM_BLOCKS: 15
  - GL_MAX_TESS_EVALUATION_UNIFORM_BLOCKS: 15
  - GL_MAX_TESS_CONTROL_INPUT_COMPONENTS: 128
  - GL_MAX_TESS_EVALUATION_INPUT_COMPONENTS: 128
  - GL_MAX_COMBINED_TESS_CONTROL_UNIFORM_COMPONENTS: 2147483389
  - GL_MAX_COMBINED_TESS_EVALUATION_UNIFORM_COMPONENTS: 2147483389
  - GL_MAX_TRANSFORM_FEEDBACK_BUFFERS: 4
  - GL_MAX_VERTEX_STREAMS: 4
  - GL_NUM_PROGRAM_BINARY_FORMATS: 1
  - GL_MAX_VERTEX_UNIFORM_VECTORS: 4096
  - GL_MAX_VARYING_VECTORS: 32
  - GL_MAX_FRAGMENT_UNIFORM_VECTORS: 4096
  - GL_MAX_VIEWPORTS: 16
  - GL_MAX_TESS_CONTROL_ATOMIC_COUNTER_BUFFERS: 8
  - GL_MAX_TESS_EVALUATION_ATOMIC_COUNTER_BUFFERS: 8
  - GL_MAX_VERTEX_ATOMIC_COUNTER_BUFFERS: 8
  - GL_MAX_GEOMETRY_ATOMIC_COUNTER_BUFFERS: 8
  - GL_MAX_FRAGMENT_ATOMIC_COUNTER_BUFFERS: 8
  - GL_MAX_COMBINED_ATOMIC_COUNTER_BUFFERS: 8
  - GL_MAX_VERTEX_ATOMIC_COUNTERS: 8
  - GL_MAX_TESS_CONTROL_ATOMIC_COUNTERS: 8
  - GL_MAX_TESS_EVALUATION_ATOMIC_COUNTERS: 8
  - GL_MAX_GEOMETRY_ATOMIC_COUNTERS: 8
  - GL_MAX_FRAGMENT_ATOMIC_COUNTERS: 8
  - GL_MAX_COMBINED_ATOMIC_COUNTERS: 8
  - GL_MAX_ATOMIC_COUNTER_BUFFER_SIZE: 32
  - GL_MAX_ATOMIC_COUNTER_BUFFER_BINDINGS: 8
  - GL_MAX_IMAGE_UNITS: 64
  - GL_MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS: 72
  - GL_MAX_IMAGE_SAMPLES: 8
  - GL_MAX_VERTEX_IMAGE_UNIFORMS: 32
  - GL_MAX_TESS_CONTROL_IMAGE_UNIFORMS: 32
  - GL_MAX_TESS_EVALUATION_IMAGE_UNIFORMS: 32
  - GL_MAX_GEOMETRY_IMAGE_UNIFORMS: 32
  - GL_MAX_FRAGMENT_IMAGE_UNIFORMS: 32
  - GL_MAX_COMBINED_IMAGE_UNIFORMS: 64
  - GL_MIN_MAP_BUFFER_ALIGNMENT: 64
  - GL_UNPACK_COMPRESSED_BLOCK_WIDTH: 0
  - GL_UNPACK_COMPRESSED_BLOCK_HEIGHT: 0
  - GL_UNPACK_COMPRESSED_BLOCK_DEPTH: 0
  - GL_UNPACK_COMPRESSED_BLOCK_SIZE: 0
  - GL_PACK_COMPRESSED_BLOCK_WIDTH: 0
  - GL_PACK_COMPRESSED_BLOCK_HEIGHT: 0
  - GL_PACK_COMPRESSED_BLOCK_DEPTH: 0
  - GL_PACK_COMPRESSED_BLOCK_SIZE: 0
  - GL_MAX_COMPUTE_UNIFORM_BLOCKS: 15
  - GL_MAX_COMPUTE_TEXTURE_IMAGE_UNITS: 32
  - GL_MAX_COMPUTE_IMAGE_UNIFORMS: 32
  - GL_MAX_COMPUTE_SHARED_MEMORY_SIZE: 32768
  - GL_MAX_COMPUTE_UNIFORM_COMPONENTS: 16384
  - GL_MAX_COMPUTE_ATOMIC_COUNTER_BUFFERS: 8
  - GL_MAX_COMPUTE_ATOMIC_COUNTERS: 8
  - GL_MAX_COMBINED_COMPUTE_UNIFORM_COMPONENTS: 2147483389
  - GL_MAX_COMPUTE_WORK_GROUP_INVOCATIONS: 1024
  - GL_MAX_COMPUTE_WORK_GROUP_COUNT: 65535/65535/65535
  - GL_MAX_COMPUTE_WORK_GROUP_SIZE: 1024/1024/1024
  - GL_MAX_VERTEX_ATTRIB_RELATIVE_OFFSET: 2047
  - GL_MAX_VERTEX_ATTRIB_BINDINGS: 2047
  - GL_MAX_UNIFORM_LOCATIONS: 4096
  - GL_MAX_FRAMEBUFFER_WIDTH: 16384
  - GL_MAX_FRAMEBUFFER_HEIGHT: 16384
  - GL_MAX_FRAMEBUFFER_LAYERS: 8192
  - GL_MAX_FRAMEBUFFER_SAMPLES: 16
  - GL_MAX_COMPUTE_VARIABLE_GROUP_INVOCATIONS_ARB: 0
  - GL_MAX_COMPUTE_FIXED_GROUP_INVOCATIONS_ARB: 1024
  - GL_MAX_COMPUTE_VARIABLE_GROUP_SIZE_ARB: 0
  - GL_MAX_COMPUTE_FIXED_GROUP_SIZE_ARB: 0
  - GL_MAX_SPARSE_TEXTURE_SIZE_ARB: 16384
  - GL_MAX_SPARSE_3D_TEXTURE_SIZE_ARB: 2048
  - GL_MAX_SPARSE_ARRAY_TEXTURE_LAYERS_ARB: 2048
  - GL_SPARSE_TEXTURE_FULL_ARRAY_CUBE_MIPMAPS_ARB: 1
  - GL_MAX_CULL_DISTANCES: 8
  - GL_MAX_COMBINED_CLIP_AND_CULL_DISTANCES: 8
- OpenGL extension capabilities: 163 caps listed
  - GL_RGBA_FLOAT_MODE_ARB: 0 (GL_ARB_color_buffer_float)
  - GL_MAX_COLOR_ATTACHMENTS_EXT: 8 (GL_EXT_framebuffer_object)
  - GL_MAX_RENDERBUFFER_SIZE_EXT: 16384 (GL_EXT_framebuffer_object)
  - GL_MAX_SAMPLES_EXT: 8 (GL_EXT_framebuffer_multisample)
  - GL_RGBA_INTEGER_MODE_EXT: 0 (GL_EXT_texture_integer)
  - GL_MAX_ARRAY_TEXTURE_LAYERS_EXT: 2048 (GL_EXT_texture_array)
  - GL_MAX_TRANSFORM_FEEDBACK_INTERLEAVED_COMPONENTS_EXT: 128 (GL_EXT_transform_feedback)
  - GL_MAX_TRANSFORM_FEEDBACK_SEPARATE_ATTRIBS_EXT: 4 (GL_EXT_transform_feedback)
  - GL_MAX_TRANSFORM_FEEDBACK_SEPARATE_COMPONENTS_EXT: 4 (GL_EXT_transform_feedback)
  - GL_FRAMEBUFFER_SRGB_CAPABLE_EXT: 1 (GL_EXT_framebuffer_sRGB)
  - GL_MAX_TEXTURE_BUFFER_SIZE_ARB: 268435456 (GL_ARB_texture_buffer_object)
  - GL_MAX_VERTEX_UNIFORM_BLOCKS: 15 (GL_ARB_uniform_buffer_object)
  - GL_MAX_GEOMETRY_UNIFORM_BLOCKS: 15 (GL_ARB_uniform_buffer_object)
  - GL_MAX_FRAGMENT_UNIFORM_BLOCKS: 15 (GL_ARB_uniform_buffer_object)
  - GL_MAX_COMBINED_UNIFORM_BLOCKS: 90 (GL_ARB_uniform_buffer_object)
  - GL_MAX_UNIFORM_BUFFER_BINDINGS: 90 (GL_ARB_uniform_buffer_object)
  - GL_MAX_UNIFORM_BLOCK_SIZE: 572657868 (GL_ARB_uniform_buffer_object)
  - GL_MAX_COMBINED_VERTEX_UNIFORM_COMPONENTS: 2147483389 (GL_ARB_uniform_buffer_object)
  - GL_MAX_COMBINED_GEOMETRY_UNIFORM_COMPONENTS: 2147483389 (GL_ARB_uniform_buffer_object)
  - GL_MAX_COMBINED_FRAGMENT_UNIFORM_COMPONENTS: 2147483389 (GL_ARB_uniform_buffer_object)
  - GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT: 4 (GL_ARB_uniform_buffer_object)
  - GL_MAX_RECTANGLE_TEXTURE_SIZE: 16384 (GL_ARB_texture_rectangle)
  - GL_PROVOKING_VERTEX: 36430 (GL_ARB_provoking_vertex)
  - GL_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION: 1 (GL_ARB_provoking_vertex)
  - GL_MAX_SAMPLE_MASK_WORDS: 1 (GL_ARB_texture_multisample)
  - GL_MAX_COLOR_TEXTURE_SAMPLES: 8 (GL_ARB_texture_multisample)
  - GL_MAX_DEPTH_TEXTURE_SAMPLES: 8 (GL_ARB_texture_multisample)
  - GL_MAX_INTEGER_SAMPLES: 8 (GL_ARB_texture_multisample)
  - GL_MAX_GEOMETRY_TEXTURE_IMAGE_UNITS_ARB: 32 (GL_ARB_geometry_shader4)
  - GL_MAX_GEOMETRY_VARYING_COMPONENTS_ARB: 128 (GL_ARB_geometry_shader4)
  - GL_MAX_VERTEX_VARYING_COMPONENTS_ARB: 128 (GL_ARB_geometry_shader4)
  - GL_MAX_GEOMETRY_UNIFORM_COMPONENTS_ARB: 16384 (GL_ARB_geometry_shader4)
  - GL_MAX_GEOMETRY_OUTPUT_VERTICES_ARB: 1023 (GL_ARB_geometry_shader4)
  - GL_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS_ARB: 4095 (GL_ARB_geometry_shader4)
  - GL_MAX_SERVER_WAIT_TIMEOUT: 9223372036854775807 (GL_ARB_sync)
  - GL_MAX_DUAL_SOURCE_DRAW_BUFFERS: 1 (GL_ARB_blend_func_extended)
  - GL_MAX_GEOMETRY_SHADER_INVOCATIONS: 127 (GL_ARB_gpu_shader5)
  - GL_MIN_FRAGMENT_INTERPOLATION_OFFSET: -1 (GL_ARB_gpu_shader5)
  - GL_MAX_FRAGMENT_INTERPOLATION_OFFSET: 1 (GL_ARB_gpu_shader5)
  - GL_FRAGMENT_INTERPOLATION_OFFSET_BITS: 4 (GL_ARB_gpu_shader5)
  - GL_MAX_VERTEX_STREAMS: 4 (GL_ARB_gpu_shader5)
  - GL_MIN_SAMPLE_SHADING_VALUE: 0 (GL_ARB_sample_shading)
  - GL_MAX_SUBROUTINES: 4096 (GL_ARB_shader_subroutine)
  - GL_MAX_SUBROUTINE_UNIFORM_LOCATIONS: 4096 (GL_ARB_shader_subroutine)
  - GL_MAX_TESS_CONTROL_TEXTURE_IMAGE_UNITS: 32 (GL_ARB_tessellation_shader)
  - GL_MAX_TESS_EVALUATION_TEXTURE_IMAGE_UNITS: 32 (GL_ARB_tessellation_shader)
  - GL_MAX_TESS_GEN_LEVEL: 64 (GL_ARB_tessellation_shader)
  - GL_MAX_TESS_CONTROL_UNIFORM_COMPONENTS: 16384 (GL_ARB_tessellation_shader)
  - GL_MAX_TESS_EVALUATION_UNIFORM_COMPONENTS: 16384 (GL_ARB_tessellation_shader)
  - GL_MAX_TESS_CONTROL_INPUT_COMPONENTS: 128 (GL_ARB_tessellation_shader)
  - GL_MAX_TESS_EVALUATION_INPUT_COMPONENTS: 128 (GL_ARB_tessellation_shader)
  - GL_MAX_COMBINED_TESS_CONTROL_UNIFORM_COMPONENTS: 2147483389 (GL_ARB_tessellation_shader)
  - GL_MAX_COMBINED_TESS_EVALUATION_UNIFORM_COMPONENTS: 2147483389 (GL_ARB_tessellation_shader)
  - GL_MAX_PATCH_VERTICES: 32 (GL_ARB_tessellation_shader)
  - GL_MAX_TESS_CONTROL_OUTPUT_COMPONENTS: 128 (GL_ARB_tessellation_shader)
  - GL_MAX_TESS_PATCH_COMPONENTS: 120 (GL_ARB_tessellation_shader)
  - GL_MAX_TESS_CONTROL_TOTAL_OUTPUT_COMPONENTS: 4096 (GL_ARB_tessellation_shader)
  - GL_MAX_TESS_EVALUATION_OUTPUT_COMPONENTS: 128 (GL_ARB_tessellation_shader)
  - GL_MAX_TESS_CONTROL_UNIFORM_BLOCKS: 15 (GL_ARB_tessellation_shader)
  - GL_MAX_TESS_EVALUATION_UNIFORM_BLOCKS: 15 (GL_ARB_tessellation_shader)
  - GL_MIN_PROGRAM_TEXTURE_GATHER_OFFSET: -32 (GL_ARB_texture_gather)
  - GL_MAX_PROGRAM_TEXTURE_GATHER_OFFSET: 31 (GL_ARB_texture_gather)
  - GL_MAX_PROGRAM_TEXTURE_GATHER_COMPONENTS: 4 (GL_ARB_texture_gather)
  - GL_MAX_TRANSFORM_FEEDBACK_BUFFERS: 4 (GL_ARB_transform_feedback3)
  - GL_NUM_PROGRAM_BINARY_FORMATS: 1 (GL_ARB_get_program_binary)
  - GL_MAX_VIEWPORTS: 16 (GL_ARB_viewport_array)
  - GL_UNPACK_COMPRESSED_BLOCK_WIDTH: 0 (GL_ARB_compressed_texture_pixel_storage)
  - GL_UNPACK_COMPRESSED_BLOCK_HEIGHT: 0 (GL_ARB_compressed_texture_pixel_storage)
  - GL_UNPACK_COMPRESSED_BLOCK_DEPTH: 0 (GL_ARB_compressed_texture_pixel_storage)
  - GL_UNPACK_COMPRESSED_BLOCK_SIZE: 0 (GL_ARB_compressed_texture_pixel_storage)
  - GL_PACK_COMPRESSED_BLOCK_WIDTH: 0 (GL_ARB_compressed_texture_pixel_storage)
  - GL_PACK_COMPRESSED_BLOCK_HEIGHT: 0 (GL_ARB_compressed_texture_pixel_storage)
  - GL_PACK_COMPRESSED_BLOCK_DEPTH: 0 (GL_ARB_compressed_texture_pixel_storage)
  - GL_PACK_COMPRESSED_BLOCK_SIZE: 0 (GL_ARB_compressed_texture_pixel_storage)
  - GL_MAX_VERTEX_ATOMIC_COUNTER_BUFFERS: 8 (GL_ARB_shader_atomic_counters)
  - GL_MAX_TESS_CONTROL_ATOMIC_COUNTER_BUFFERS: 8 (GL_ARB_shader_atomic_counters)
  - GL_MAX_TESS_EVALUATION_ATOMIC_COUNTER_BUFFERS: 8 (GL_ARB_shader_atomic_counters)
  - GL_MAX_GEOMETRY_ATOMIC_COUNTER_BUFFERS: 8 (GL_ARB_shader_atomic_counters)
  - GL_MAX_FRAGMENT_ATOMIC_COUNTER_BUFFERS: 8 (GL_ARB_shader_atomic_counters)
  - GL_MAX_COMBINED_ATOMIC_COUNTER_BUFFERS: 8 (GL_ARB_shader_atomic_counters)
  - GL_MAX_VERTEX_ATOMIC_COUNTERS: 8 (GL_ARB_shader_atomic_counters)
  - GL_MAX_TESS_CONTROL_ATOMIC_COUNTERS: 8 (GL_ARB_shader_atomic_counters)
  - GL_MAX_TESS_EVALUATION_ATOMIC_COUNTERS: 8 (GL_ARB_shader_atomic_counters)
  - GL_MAX_GEOMETRY_ATOMIC_COUNTERS: 8 (GL_ARB_shader_atomic_counters)
  - GL_MAX_FRAGMENT_ATOMIC_COUNTERS: 8 (GL_ARB_shader_atomic_counters)
  - GL_MAX_COMBINED_ATOMIC_COUNTERS: 8 (GL_ARB_shader_atomic_counters)
  - GL_MAX_ATOMIC_COUNTER_BUFFER_SIZE: 32 (GL_ARB_shader_atomic_counters)
  - GL_MAX_ATOMIC_COUNTER_BUFFER_BINDINGS: 8 (GL_ARB_shader_atomic_counters)
  - GL_MAX_IMAGE_UNITS: 64 (GL_ARB_shader_image_load_store)
  - GL_MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS: 72 (GL_ARB_shader_image_load_store)
  - GL_MAX_IMAGE_SAMPLES: 8 (GL_ARB_shader_image_load_store)
  - GL_MAX_VERTEX_IMAGE_UNIFORMS: 32 (GL_ARB_shader_image_load_store)
  - GL_MAX_TESS_CONTROL_IMAGE_UNIFORMS: 32 (GL_ARB_shader_image_load_store)
  - GL_MAX_TESS_EVALUATION_IMAGE_UNIFORMS: 32 (GL_ARB_shader_image_load_store)
  - GL_MAX_GEOMETRY_IMAGE_UNIFORMS: 32 (GL_ARB_shader_image_load_store)
  - GL_MAX_FRAGMENT_IMAGE_UNIFORMS: 32 (GL_ARB_shader_image_load_store)
  - GL_MAX_COMBINED_IMAGE_UNIFORMS: 64 (GL_ARB_shader_image_load_store)
  - GL_MIN_MAP_BUFFER_ALIGNMENT: 64 (GL_ARB_map_buffer_alignment)
  - GL_MAX_COMPUTE_UNIFORM_BLOCKS: 15 (GL_ARB_compute_shader)
  - GL_MAX_COMPUTE_TEXTURE_IMAGE_UNITS: 32 (GL_ARB_compute_shader)
  - GL_MAX_COMPUTE_IMAGE_UNIFORMS: 32 (GL_ARB_compute_shader)
  - GL_MAX_COMPUTE_SHARED_MEMORY_SIZE: 32768 (GL_ARB_compute_shader)
  - GL_MAX_COMPUTE_UNIFORM_COMPONENTS: 16384 (GL_ARB_compute_shader)
  - GL_MAX_COMPUTE_ATOMIC_COUNTER_BUFFERS: 8 (GL_ARB_compute_shader)
  - GL_MAX_COMPUTE_ATOMIC_COUNTERS: 8 (GL_ARB_compute_shader)
  - GL_MAX_COMBINED_COMPUTE_UNIFORM_COMPONENTS: 2147483389 (GL_ARB_compute_shader)
  - GL_MAX_COMPUTE_WORK_GROUP_INVOCATIONS: 1024 (GL_ARB_compute_shader)
  - GL_MAX_COMPUTE_WORK_GROUP_COUNT: 65535/65535/65535 (GL_ARB_compute_shader)
  - GL_MAX_COMPUTE_WORK_GROUP_SIZE: 1024/1024/1024 (GL_ARB_compute_shader)
  - GL_MAX_VERTEX_ATTRIB_RELATIVE_OFFSET: 2047 (GL_ARB_vertex_attrib_binding)
  - GL_MAX_VERTEX_ATTRIB_BINDINGS: 2047 (GL_ARB_vertex_attrib_binding)
  - GL_MAX_UNIFORM_LOCATIONS: 4096 (GL_ARB_explicit_uniform_location)
  - GL_MAX_FRAMEBUFFER_WIDTH: 16384 (GL_ARB_framebuffer_no_attachments)
  - GL_MAX_FRAMEBUFFER_HEIGHT: 16384 (GL_ARB_framebuffer_no_attachments)
  - GL_MAX_FRAMEBUFFER_LAYERS: 8192 (GL_ARB_framebuffer_no_attachments)
  - GL_MAX_FRAMEBUFFER_SAMPLES: 16 (GL_ARB_framebuffer_no_attachments)
  - GL_MIN_PROGRAM_TEXEL_OFFSET_EXT: -8 (GL_EXT_gpu_shader4)
  - GL_MAX_PROGRAM_TEXEL_OFFSET_EXT: 7 (GL_EXT_gpu_shader4)
  - GL_MAX_TEXTURE_UNITS_ARB: 8 (GL_ARB_multitexture)
  - GL_MAX_VERTEX_TEXTURE_IMAGE_UNITS_ARB: 32 (GL_ARB_multitexture)
  - GL_MAX_TEXTURE_IMAGE_UNITS_ARB: 32 (GL_ARB_multitexture)
  - GL_MAX_CUBE_MAP_TEXTURE_SIZE_ARB: 16384 (GL_ARB_texture_cube_map)
  - GL_NUM_COMPRESSED_TEXTURE_FORMATS: 18 (GL_ARB_texture_compression)
  - GL_MAX_TEXTURE_MAX_ANISOTROPY_EXT: 16.000000 (GL_EXT_texture_filter_anisotropic)
  - GL_MAX_VERTEX_UNIFORM_COMPONENTS: 16384 (GL_ARB_vertex_shader)
  - GL_MAX_VARYING_FLOATS: 128 (GL_ARB_vertex_shader)
  - GL_MAX_VERTEX_ATTRIBS: 29 (GL_ARB_vertex_shader)
  - GL_MAX_TEXTURE_IMAGE_UNITS: 32 (GL_ARB_vertex_shader)
  - GL_MAX_VERTEX_TEXTURE_IMAGE_UNITS: 32 (GL_ARB_vertex_shader)
  - GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS: 160 (GL_ARB_vertex_shader)
  - GL_MAX_TEXTURE_COORDS: 16 (GL_ARB_vertex_shader)
  - GL_MAX_FRAGMENT_UNIFORM_COMPONENTS_ARB: 16384 (GL_ARB_fragment_shader)
  - GL_MAX_VERTEX_ATTRIBS_ARB: 29 (GL_ARB_vertex_program)
  - GL_MAX_PROGRAM_MATRICES_ARB: 32 (GL_ARB_vertex_program)
  - GL_MAX_PROGRAM_MATRIX_STACK_DEPTH_ARB: 32 (GL_ARB_vertex_program)
  - GL_MAX_TEXTURE_COORDS_ARB: 16 (GL_ARB_fragment_program)
  - GL_MAX_FRAGMENT_UNIFORM_COMPONENTS: 16384 (GL_ARB_shading_language_100)
  - GL_MAX_GEOMETRY_UNIFORM_COMPONENTS_EXT: 16384 (GL_ARB_shading_language_100)
  - GL_MAX_DRAW_BUFFERS_ARB: 8 (GL_ARB_draw_buffers)
  - GL_MAX_COLOR_ATTACHMENTS: 8 (GL_ARB_framebuffer_object)
  - GL_MAX_RENDERBUFFER_SIZE: 16384 (GL_ARB_framebuffer_object)
  - GL_MAX_SAMPLES: 8 (GL_ARB_framebuffer_object)
  - GL_MAX_CONVOLUTION_WIDTH: 0 (GL_ARB_imaging)
  - GL_MAX_CONVOLUTION_HEIGHT: 0 (GL_ARB_imaging)
  - GL_MAX_COLOR_MATRIX_STACK_DEPTH: 10 (GL_ARB_imaging)
  - GL_POINT_SIZE_MIN_ARB: 0.000000 (GL_ARB_point_parameters)
  - GL_POINT_SIZE_MAX_ARB: 8192.000000 (GL_ARB_point_parameters)
  - GL_MAX_VERTEX_UNIFORM_VECTORS: 4096 (GL_ARB_ES2_compatibility)
  - GL_MAX_VARYING_VECTORS: 32 (GL_ARB_ES2_compatibility)
  - GL_MAX_FRAGMENT_UNIFORM_VECTORS: 4096 (GL_ARB_ES2_compatibility)
  - GL_MAX_DEBUG_MESSAGE_LENGTH: 1024 (GL_ARB_debug_output)
  - GL_MAX_DEBUG_LOGGED_MESSAGES_ARB: 256 (GL_ARB_debug_output)
  - GL_MAX_DEBUG_MESSAGE_LENGTH_AMD: 1024 (GL_AMD_debug_output)
  - GL_MAX_DEBUG_LOGGED_MESSAGES_AMD: 256 (GL_AMD_debug_output)
  - GL_MAX_VERTEX_BINDABLE_UNIFORMS_EXT: 15 (GL_EXT_bindable_uniform)
  - GL_MAX_FRAGMENT_BINDABLE_UNIFORMS_EXT: 15 (GL_EXT_bindable_uniform)
  - GL_MAX_GEOMETRY_BINDABLE_UNIFORMS_EXT: 15 (GL_EXT_bindable_uniform)
  - GL_MAX_BINDABLE_UNIFORM_SIZE_EXT: 65536 (GL_EXT_bindable_uniform)
  - GL_MAX_GEOMETRY_TEXTURE_IMAGE_UNITS_EXT: 32 (GL_EXT_geometry_shader4)
  - GL_MAX_GEOMETRY_OUTPUT_VERTICES_EXT: 1023 (GL_EXT_geometry_shader4)
  - GL_MAX_TEXTURE_BUFFER_SIZE_EXT: 268435456 (GL_EXT_texture_buffer_object)
  - GL_MAX_SAMPLE_MASK_WORDS_NV: 1 (GL_NV_explicit_multisample)

===================================[ Vulkan Capabilities ]
- Instance extensions: 11
  - VK_KHR_device_group_creation (version: 1)
  - VK_KHR_external_fence_capabilities (version: 1)
  - VK_KHR_external_memory_capabilities (version: 1)
  - VK_KHR_external_semaphore_capabilities (version: 1)
  - VK_KHR_get_physical_device_properties2 (version: 2)
  - VK_KHR_get_surface_capabilities2 (version: 1)
  - VK_KHR_surface (version: 25)
  - VK_KHR_win32_surface (version: 6)
  - VK_EXT_debug_report (version: 9)
  - VK_EXT_debug_utils (version: 2)
  - VK_EXT_swapchain_colorspace (version: 4)
- Instance layers: 7
  - VK_LAYER_AMD_switchable_graphics (version: 1.2.170, impl: 1)
  - VK_LAYER_LUNARG_api_dump (version: 1.2.135, impl: 2)
  - VK_LAYER_LUNARG_device_simulation (version: 1.2.135, impl: 1)
  - VK_LAYER_KHRONOS_validation (version: 1.2.135, impl: 1)
  - VK_LAYER_LUNARG_monitor (version: 1.2.135, impl: 1)
  - VK_LAYER_LUNARG_screenshot (version: 1.2.135, impl: 1)
  - VK_LAYER_LUNARG_vktrace (version: 1.2.135, impl: 1)
- Physical devices: 1
  - [Vulkan device 0]: AMD Radeon R7 430 ------------------
    - API version: 1.2.170
    - vendorID: 4098
    - deviceID: 26129
    - driver version: 8388787
  - memory heap count: 3
    - heap1: 1792MB
    - heap2: 7887MB
    - heap3: 256MB
  - memory type count: 4
    - mem type 0 - heap index : 0 - property flag : 1
      > mem property: VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
    - mem type 1 - heap index : 1 - property flag : 6
      > mem property: VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
      > mem property: VK_MEMORY_PROPERTY_HOST_COHERENT_BIT
    - mem type 2 - heap index : 2 - property flag : 7
      > mem property: VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
      > mem property: VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
      > mem property: VK_MEMORY_PROPERTY_HOST_COHERENT_BIT
    - mem type 3 - heap index : 1 - property flag : 14
      > mem property: VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
      > mem property: VK_MEMORY_PROPERTY_HOST_COHERENT_BIT
      > mem property: VK_MEMORY_PROPERTY_HOST_CACHED_BIT
  - extensions: 106
    - VK_KHR_16bit_storage (version: 1)
    - VK_KHR_8bit_storage (version: 1)
    - VK_KHR_bind_memory2 (version: 1)
    - VK_KHR_buffer_device_address (version: 1)
    - VK_KHR_create_renderpass2 (version: 1)
    - VK_KHR_dedicated_allocation (version: 3)
    - VK_KHR_depth_stencil_resolve (version: 1)
    - VK_KHR_descriptor_update_template (version: 1)
    - VK_KHR_device_group (version: 4)
    - VK_KHR_draw_indirect_count (version: 1)
    - VK_KHR_driver_properties (version: 1)
    - VK_KHR_external_fence (version: 1)
    - VK_KHR_external_fence_win32 (version: 1)
    - VK_KHR_external_memory (version: 1)
    - VK_KHR_external_memory_win32 (version: 1)
    - VK_KHR_external_semaphore (version: 1)
    - VK_KHR_external_semaphore_win32 (version: 1)
    - VK_KHR_get_memory_requirements2 (version: 1)
    - VK_KHR_imageless_framebuffer (version: 1)
    - VK_KHR_image_format_list (version: 1)
    - VK_KHR_maintenance1 (version: 2)
    - VK_KHR_maintenance2 (version: 1)
    - VK_KHR_maintenance3 (version: 1)
    - VK_KHR_multiview (version: 1)
    - VK_KHR_pipeline_executable_properties (version: 1)
    - VK_KHR_relaxed_block_layout (version: 1)
    - VK_KHR_sampler_mirror_clamp_to_edge (version: 3)
    - VK_KHR_sampler_ycbcr_conversion (version: 14)
    - VK_KHR_separate_depth_stencil_layouts (version: 1)
    - VK_KHR_shader_atomic_int64 (version: 1)
    - VK_KHR_shader_clock (version: 1)
    - VK_KHR_shader_draw_parameters (version: 1)
    - VK_KHR_shader_float16_int8 (version: 1)
    - VK_KHR_shader_float_controls (version: 4)
    - VK_KHR_shader_non_semantic_info (version: 1)
    - VK_KHR_shader_subgroup_extended_types (version: 1)
    - VK_KHR_shader_terminate_invocation (version: 1)
    - VK_KHR_spirv_1_4 (version: 1)
    - VK_KHR_storage_buffer_storage_class (version: 1)
    - VK_KHR_swapchain (version: 70)
    - VK_KHR_swapchain_mutable_format (version: 1)
    - VK_KHR_synchronization2 (version: 1)
    - VK_KHR_timeline_semaphore (version: 2)
    - VK_KHR_uniform_buffer_standard_layout (version: 1)
    - VK_KHR_variable_pointers (version: 1)
    - VK_KHR_vulkan_memory_model (version: 3)
    - VK_KHR_win32_keyed_mutex (version: 1)
    - VK_EXT_4444_formats (version: 1)
    - VK_EXT_calibrated_timestamps (version: 1)
    - VK_EXT_depth_clip_enable (version: 1)
    - VK_EXT_depth_range_unrestricted (version: 1)
    - VK_EXT_descriptor_indexing (version: 2)
    - VK_EXT_extended_dynamic_state (version: 1)
    - VK_EXT_external_memory_host (version: 1)
    - VK_EXT_full_screen_exclusive (version: 4)
    - VK_EXT_global_priority (version: 2)
    - VK_EXT_hdr_metadata (version: 2)
    - VK_EXT_host_query_reset (version: 1)
    - VK_EXT_image_robustness (version: 1)
    - VK_EXT_inline_uniform_block (version: 1)
    - VK_EXT_line_rasterization (version: 1)
    - VK_EXT_memory_budget (version: 1)
    - VK_EXT_memory_priority (version: 1)
    - VK_EXT_pipeline_creation_cache_control (version: 3)
    - VK_EXT_pipeline_creation_feedback (version: 1)
    - VK_EXT_private_data (version: 1)
    - VK_EXT_queue_family_foreign (version: 1)
    - VK_EXT_robustness2 (version: 1)
    - VK_EXT_sampler_filter_minmax (version: 2)
    - VK_EXT_sample_locations (version: 1)
    - VK_EXT_scalar_block_layout (version: 1)
    - VK_EXT_separate_stencil_usage (version: 1)
    - VK_EXT_shader_demote_to_helper_invocation (version: 1)
    - VK_EXT_shader_image_atomic_int64 (version: 1)
    - VK_EXT_shader_stencil_export (version: 1)
    - VK_EXT_shader_subgroup_ballot (version: 1)
    - VK_EXT_shader_subgroup_vote (version: 1)
    - VK_EXT_shader_viewport_index_layer (version: 1)
    - VK_EXT_subgroup_size_control (version: 2)
    - VK_EXT_texel_buffer_alignment (version: 1)
    - VK_EXT_tooling_info (version: 1)
    - VK_EXT_transform_feedback (version: 1)
    - VK_EXT_vertex_attribute_divisor (version: 3)
    - VK_AMD_buffer_marker (version: 1)
    - VK_AMD_calibrated_timestamps (version: 1)
    - VK_AMD_display_native_hdr (version: 1)
    - VK_AMD_draw_indirect_count (version: 2)
    - VK_AMD_gcn_shader (version: 1)
    - VK_AMD_gpa_interface (version: 1)
    - VK_AMD_memory_overallocation_behavior (version: 1)
    - VK_AMD_mixed_attachment_samples (version: 1)
    - VK_AMD_negative_viewport_height (version: 1)
    - VK_AMD_pipeline_compiler_control (version: 1)
    - VK_AMD_shader_ballot (version: 1)
    - VK_AMD_shader_core_properties (version: 2)
    - VK_AMD_shader_core_properties2 (version: 1)
    - VK_AMD_shader_explicit_vertex_parameter (version: 1)
    - VK_AMD_shader_fragment_mask (version: 1)
    - VK_AMD_shader_image_load_store_lod (version: 1)
    - VK_AMD_shader_info (version: 1)
    - VK_AMD_shader_trinary_minmax (version: 1)
    - VK_AMD_texture_gather_bias_lod (version: 1)
    - VK_AMD_wave_limits (version: 1)
    - VK_GOOGLE_decorate_string (version: 1)
    - VK_GOOGLE_hlsl_functionality1 (version: 1)
    - VK_GOOGLE_user_type (version: 1)
  - device layers: 1
    - VK_LAYER_AMD_switchable_graphics (version: 1.2.170, impl: 1)
  - device features:
    - robustBufferAccess: true
    - fullDrawIndexUint32: true
    - imageCubeArray: true
    - independentBlend: true
    - geometryShader: true
    - tessellationShader: true
    - sampleRateShading: true
    - dualSrcBlend: true
    - logicOp: true
    - multiDrawIndirect: true
    - drawIndirectFirstInstance: true
    - depthClamp: true
    - depthBiasClamp: true
    - fillModeNonSolid: true
    - depthBounds: true
    - wideLines: true
    - largePoints: true
    - alphaToOne: false
    - multiViewport: true
    - samplerAnisotropy: true
    - textureCompressionETC2: false
    - textureCompressionASTC_LDR: false
    - textureCompressionBC: true
    - occlusionQueryPrecise: true
    - pipelineStatisticsQuery: true
    - vertexPipelineStoresAndAtomics: true
    - fragmentStoresAndAtomics: true
    - shaderTessellationAndGeometryPointSize: true
    - shaderImageGatherExtended: true
    - shaderStorageImageExtendedFormats: true
    - shaderStorageImageMultisample: true
    - shaderStorageImageReadWithoutFormat: true
    - shaderStorageImageWriteWithoutFormat: true
    - shaderUniformBufferArrayDynamicIndexing: true
    - shaderSampledImageArrayDynamicIndexing: true
    - shaderStorageBufferArrayDynamicIndexing: true
    - shaderStorageImageArrayDynamicIndexing: true
    - shaderClipDistance: true
    - shaderCullDistance: true
    - shaderFloat64: true
    - shaderInt64: true
    - shaderInt16: false
    - shaderResourceResidency: true
    - shaderResourceMinLod: true
    - sparseBinding: true
    - sparseResidencyBuffer: true
    - sparseResidencyImage2D: true
    - sparseResidencyImage3D: false
    - sparseResidency2Samples: false
    - sparseResidency4Samples: false
    - sparseResidency8Samples: false
    - sparseResidency16Samples: false
    - sparseResidencyAliased: false
    - variableMultisampleRate: true
    - inheritedQueries: true
  - device limits
    - maxImageDimension1D: 16384
    - maxImageDimension2D: 16384
    - maxImageDimension3D: 2048
    - maxImageDimensionCube: 16384
    - maxImageArrayLayers: 2048
    - maxTexelBufferElements: 4294967295
    - maxUniformBufferRange: 4294967295
    - maxStorageBufferRange: 4294967295
    - maxPushConstantsSize: 128
    - maxMemoryAllocationCount: 4096
    - maxSamplerAllocationCount: 1048576
    - bufferImageGranularity: 1
    - sparseAddressSpaceSize: 1086626725888
    - maxBoundDescriptorSets: 32
    - maxPerStageDescriptorSamplers: 4294967295
    - maxPerStageDescriptorUniformBuffers: 4294967295
    - maxPerStageDescriptorSampledImages: 4294967295
    - maxPerStageDescriptorStorageImages: 4294967295
    - maxPerStageDescriptorInputAttachments: 4294967295
    - maxPerStageResources: 4294967295
    - maxDescriptorSetSamplers: 4294967295
    - maxDescriptorSetUniformBuffers: 4294967295
    - maxDescriptorSetUniformBuffersDynamic: 8
    - maxDescriptorSetStorageBuffers: 4294967295
    - maxDescriptorSetStorageBuffersDynamic: 8
    - maxDescriptorSetSampledImages: 4294967295
    - maxDescriptorSetStorageImages: 4294967295
    - maxDescriptorSetInputAttachments: 4294967295
    - maxVertexInputAttributes: 64
    - maxVertexInputBindings: 32
    - maxVertexInputAttributeOffset: 4294967295
    - maxVertexInputBindingStride: 16383
    - maxVertexOutputComponents: 128
    - maxTessellationGenerationLevel: 64
    - maxTessellationPatchSize: 32
    - maxTessellationControlPerVertexInputComponents: 128
    - maxTessellationControlPerVertexOutputComponents: 128
    - maxTessellationControlPerPatchOutputComponents: 120
    - maxTessellationControlTotalOutputComponents: 4096
    - maxTessellationEvaluationInputComponents: 128
    - maxTessellationEvaluationOutputComponents: 128
    - maxGeometryShaderInvocations: 127
    - maxGeometryInputComponents: 128
    - maxGeometryOutputComponents: 128
    - maxGeometryOutputVertices: 1024
    - maxGeometryTotalOutputComponents: 16384
    - maxFragmentInputComponents: 128
    - maxFragmentOutputAttachments: 8
    - maxFragmentDualSrcAttachments: 1
    - maxFragmentCombinedOutputResources: 4294967295
    - maxComputeSharedMemorySize: 32768
    - maxComputeWorkGroupCount: [65535; 65535; 65535]
    - maxComputeWorkGroupInvocations: 1024
    - maxComputeWorkGroupSize: [1024; 1024; 1024]
    - subPixelPrecisionBits: 8
    - subTexelPrecisionBits: 8
    - mipmapPrecisionBits: 8
    - maxDrawIndexedIndexValue: 4294967295
    - maxDrawIndirectCount: 4294967295
    - maxSamplerLodBias: 15.996094
    - maxSamplerAnisotropy: 16.000000
    - maxViewports: 16
    - maxViewportDimensions: [16384; 16384]
    - viewportBoundsRange: [-32768.000000 ; 32767.000000]
    - viewportSubPixelBits: 8
    - minMemoryMapAlignment: 64
    - minTexelBufferOffsetAlignment: 4
    - minUniformBufferOffsetAlignment: 16
    - minStorageBufferOffsetAlignment: 4
    - minTexelOffset: 4294967232
    - maxTexelOffset: 63
    - minTexelGatherOffset: 4294967264
    - maxTexelGatherOffset: 31
    - minInterpolationOffset: -2.000000
    - maxInterpolationOffset: 1.000000
    - subPixelInterpolationOffsetBits: 8
    - maxFramebufferWidth: 16384
    - maxFramebufferHeight: 16384
    - maxFramebufferLayers: 2048
    - framebufferColorSampleCounts: 15
    - framebufferDepthSampleCounts: 15
    - framebufferStencilSampleCounts: 15
    - framebufferNoAttachmentsSampleCounts: 15
    - maxColorAttachments: 8
    - sampledImageColorSampleCounts: 15
    - sampledImageIntegerSampleCounts: 15
    - sampledImageDepthSampleCounts: 15
    - sampledImageStencilSampleCounts: 15
    - storageImageSampleCounts: 15
    - maxSampleMaskWords: 1
    - timestampComputeAndGraphics: 1
    - timestampPeriod: 37.037037
    - maxClipDistances: 8
    - maxCullDistances: 8
    - maxCombinedClipAndCullDistances: 8
    - discreteQueuePriorities: 2
    - pointSizeRange: [0.000000 ; 8191.875000]
    - lineWidthRange: [0.000000 ; 8191.875000]
    - pointSizeGranularity: 0.125000
    - lineWidthGranularity: 0.125000
    - strictLines: 0
    - standardSampleLocations: 1
    - optimalBufferCopyOffsetAlignment: 1
    - optimalBufferCopyRowPitchAlignment: 1
    - nonCoherentAtomSize: 128

===================================[ OpenCL Capabilities ]
- Num OpenCL platforms: 1
- CL_PLATFORM_NAME: AMD Accelerated Parallel Processing
- CL_PLATFORM_VENDOR: Advanced Micro Devices, Inc.
- CL_PLATFORM_VERSION: OpenCL 2.1 AMD-APP (3240.6)
- CL_PLATFORM_PROFILE: FULL_PROFILE
- Num devices: 1

  - CL_DEVICE_NAME: Oland
  - CL_DEVICE_VENDOR: Advanced Micro Devices, Inc.
  - CL_DRIVER_VERSION: 3240.6
  - CL_DEVICE_PROFILE: FULL_PROFILE
  - CL_DEVICE_VERSION: OpenCL 1.2 AMD-APP (3240.6)
  - CL_DEVICE_TYPE: GPU
  - CL_DEVICE_VENDOR_ID: 0x1002
  - CL_DEVICE_MAX_COMPUTE_UNITS: 6
  - CL_DEVICE_MAX_CLOCK_FREQUENCY: 780MHz
  - CL_DEVICE_ADDRESS_BITS: 32
  - CL_DEVICE_MAX_MEM_ALLOC_SIZE: 1559756KB
  - CL_DEVICE_GLOBAL_MEM_SIZE: 2048MB
  - CL_DEVICE_MAX_PARAMETER_SIZE: 1024
  - CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE: 64 Bytes
  - CL_DEVICE_GLOBAL_MEM_CACHE_SIZE: 16KB
  - CL_DEVICE_ERROR_CORRECTION_SUPPORT: NO
  - CL_DEVICE_LOCAL_MEM_TYPE: Local (scratchpad)
  - CL_DEVICE_LOCAL_MEM_SIZE: 32KB
  - CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE: 64KB
  - CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS: 3
  - CL_DEVICE_MAX_WORK_ITEM_SIZES: [1024 ; 1024 ; 1024]
  - CL_DEVICE_MAX_WORK_GROUP_SIZE: 256
  - CL_EXEC_NATIVE_KERNEL: 9806572
  - CL_DEVICE_IMAGE_SUPPORT: YES
  - CL_DEVICE_MAX_READ_IMAGE_ARGS: 128
  - CL_DEVICE_MAX_WRITE_IMAGE_ARGS: 8
  - CL_DEVICE_IMAGE2D_MAX_WIDTH: 16384
  - CL_DEVICE_IMAGE2D_MAX_HEIGHT: 16384
  - CL_DEVICE_IMAGE3D_MAX_WIDTH: 2048
  - CL_DEVICE_IMAGE3D_MAX_HEIGHT: 2048
  - CL_DEVICE_IMAGE3D_MAX_DEPTH: 2048
  - CL_DEVICE_MAX_SAMPLERS: 16
  - CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR: 4
  - CL_DEVICE_PREFERRED_VECTOR_WIDTH_SHORT: 2
  - CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT: 1
  - CL_DEVICE_PREFERRED_VECTOR_WIDTH_LONG: 1
  - CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT: 1
  - CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE: 1
  - CL_DEVICE_EXTENSIONS: 24
  - Extensions:
    - cl_khr_fp64
    - cl_amd_fp64
    - cl_khr_global_int32_base_atomics
    - cl_khr_global_int32_extended_atomics
    - cl_khr_local_int32_base_atomics
    - cl_khr_local_int32_extended_atomics
    - cl_khr_int64_base_atomics
    - cl_khr_int64_extended_atomics
    - cl_khr_3d_image_writes
    - cl_khr_byte_addressable_store
    - cl_khr_gl_sharing
    - cl_amd_device_attribute_query
    - cl_amd_vec3
    - cl_amd_printf
    - cl_amd_media_ops
    - cl_amd_media_ops2
    - cl_amd_popcnt
    - cl_khr_d3d10_sharing
    - cl_khr_d3d11_sharing
    - cl_khr_dx9_media_sharing
    - cl_khr_image2d_from_buffer
    - cl_khr_spir
    - cl_khr_gl_event
    - cl_amd_liquid_flash

janicevidal commented 3 years ago

Check whether setCacheFile produced a cache file, or enable the MNN_OPENCL_PROFILE macro when compiling.

jnulzl commented 3 years ago

> Check whether setCacheFile produced a cache file, or enable the MNN_OPENCL_PROFILE macro when compiling.

After adding MNN_OPENCL_PROFILE and recompiling, setting MNNForwardType to MNN_FORWARD_OPENCL produces the output below, whereas MNN_FORWARD_CPU produces no such output:

......
kernel cost:76    us Conv2D
kernel cost:25    us Raster0
kernel cost:215    us Raster1
kernel cost:13    us Softmax
kernel cost:11    us Unary
......
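
To compare these per-kernel figures with end-to-end latency, the lines can be summed. The sketch below assumes every profile line looks like the five shown here, that is, `kernel cost:<integer> us <OpName>`. That format is inferred from this excerpt only, and the function names are hypothetical. Lines that do not match, such as the elided `......` rows, are skipped.

```cpp
#include <map>
#include <sstream>
#include <string>

// Sums the microseconds reported on lines shaped like
// "kernel cost:76    us Conv2D", grouped by the trailing op name.
// The line format is an assumption based on the excerpt above.
std::map<std::string, long long> sumKernelCosts(const std::string& log) {
    static const std::string kPrefix = "kernel cost:";
    std::map<std::string, long long> totals;
    std::istringstream lines(log);
    std::string line;
    while (std::getline(lines, line)) {
        const auto pos = line.find(kPrefix);
        if (pos == std::string::npos) continue;  // not a profile line
        std::istringstream fields(line.substr(pos + kPrefix.size()));
        long long us = 0;
        std::string unit, op;
        if (fields >> us >> unit >> op && unit == "us") {
            totals[op] += us;
        }
    }
    return totals;
}

// Adds up the per-op totals into one figure, in microseconds.
long long totalKernelCost(const std::map<std::string, long long>& totals) {
    long long sum = 0;
    for (const auto& entry : totals) sum += entry.second;
    return sum;
}
```

For the five lines visible above, the total is 340 us. The excerpt is truncated, so this is not the full kernel time. If the complete sum turns out to be much smaller than the measured wall-clock time per inference, the remaining time is being spent outside the kernels.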

So it looks like setting MNNForwardType to MNN_FORWARD_OPENCL really does run on OpenCL, but the speed is basically the same as on the CPU.
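
The thread does not say how the two speeds were measured. One thing worth ruling out is one-time setup cost, since a GPU backend's first runs often include work such as kernel compilation. The sketch below is a generic timing helper, not MNN code, and the name `meanMillis` is hypothetical. It discards a number of warm-up calls and averages the rest. The callable passed in is assumed to run one full inference and read the output back to the host, so that the GPU work has finished before the clock stops.

```cpp
#include <chrono>
#include <functional>

// Hypothetical helper, not part of MNN.
// Runs `fn` `warmup` times without timing, then `iters` times with timing,
// and returns the mean wall-clock milliseconds per timed call.
// Returns 0.0 when `iters` is not positive.
double meanMillis(const std::function<void()>& fn, int warmup, int iters) {
    for (int i = 0; i < warmup; ++i) fn();
    if (iters <= 0) return 0.0;
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iters; ++i) fn();
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / iters;
}
```

Measuring both MNN_FORWARD_CPU and MNN_FORWARD_OPENCL with the same warm-up and iteration counts gives a fairer comparison than timing a single run of each.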