kidrigger / godot-videodecoder

GDNative Video Decoder libraries for Godot Game Engine, using FFmpeg library for codecs. A Google Summer of Code Project, 2018
MIT License
84 stars 22 forks

Send YUV textures to godot #37

Open jamie-pate opened 3 years ago

jamie-pate commented 3 years ago

While doing research for https://github.com/kidrigger/godot-videodecoder/issues/36 I found that the YUV->RGB conversion seems to be a huge performance bottleneck. It would be nice to be able to send YUV textures to Godot and convert them in the fragment shader instead.

e.g. https://gist.github.com/jamie-pate/c30b49e10238ebcf844fee04c0b2b832

or libyuv which has been discussed here https://github.com/godotengine/godot/issues/5886#issuecomment-235726139

see https://github.com/godotengine/godot/issues/9420
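For illustration, the per-pixel arithmetic that a YUV->RGB fragment shader would perform can be sketched in plain Python. This is only a rough sketch, not the code from the gist or from godot-videodecoder: it assumes 8-bit limited-range BT.601 coefficients (an HD video may well need BT.709 instead), tightly packed yuv420p planes with an even width, and the function names are made up for the example.

```python
def clamp8(x):
    """Round to the nearest integer and clamp to the 0..255 byte range."""
    return max(0, min(255, int(round(x))))


def yuv_to_rgb_bt601(y, u, v):
    """Convert one limited-range BT.601 Y'CbCr sample to 8-bit RGB.

    Assumption: BT.601 limited range; other matrices use other coefficients.
    """
    c = y - 16    # remove the luma offset of limited ("video") range
    d = u - 128   # centre Cb on zero
    e = v - 128   # centre Cr on zero
    r = 1.164 * c + 1.596 * e
    g = 1.164 * c - 0.392 * d - 0.813 * e
    b = 1.164 * c + 2.017 * d
    return clamp8(r), clamp8(g), clamp8(b)


def yuv420p_pixel(y_plane, u_plane, v_plane, width, col, row):
    """Fetch pixel (col, row) from tightly packed yuv420p planes as RGB.

    The chroma planes are subsampled 2x horizontally and vertically, so a
    2x2 block of luma samples shares one U and one V sample.
    """
    chroma_width = width // 2
    luma = y_plane[row * width + col]
    chroma_index = (row // 2) * chroma_width + (col // 2)
    return yuv_to_rgb_bt601(luma, u_plane[chroma_index], v_plane[chroma_index])
```

In a shader the three planes would be bound as separate single-channel textures and the same arithmetic would run once per fragment, which is why the conversion could move off the CPU.
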

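| Hardware | OS | Cmdline | hwaccel | FPS | Speed | Video Decode GPU (from taskmanager) | Video |
| --- | --- | --- | --- | --- | --- | --- | --- |
| i5-5200U Intel HD Graphics 5500 | Windows | ffmpeg -y -i 8a41a2c3....webm -f rawvideo NUL | | 234 | 7.8x | | |
| | | | qsv | 237 | 7.91x | | |
| | | -pix_fmt rgb32 -f rawvideo | | 87 | 2.9x | | |
| | | -pix_fmt rgb32 -f rawvideo | qsv | 87 | 2.9x | | |
| | | -pix_fmt yuv420p -f rawvideo | qsv | 245 | 8.18x | | |
| | | -pix_fmt yuv420p -f rawvideo | | 248 | 8.26x | | |
| | | -pix_fmt yuv422p -f rawvideo | | 111 | 3.71x | | |
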
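As a side note on the pixel formats in the benchmark above, the amount of data per frame differs between them. The sketch below computes the size of one tightly packed frame. The benchmark does not state the video's resolution, so the 1280x720 used in the usage note is a placeholder, and `frame_bytes` is a hypothetical helper, not an FFmpeg function.

```python
# Average bits per pixel for the three output formats used in the benchmark.
BITS_PER_PIXEL = {
    "rgb32": 32,    # packed, 4 bytes per pixel
    "yuv420p": 12,  # full-size Y plane plus two quarter-size chroma planes
    "yuv422p": 16,  # full-size Y plane plus two half-width chroma planes
}


def frame_bytes(pix_fmt, width, height):
    """Bytes in one tightly packed frame (no row padding, even dimensions)."""
    return width * height * BITS_PER_PIXEL[pix_fmt] // 8
```

For a hypothetical 1280x720 frame, `frame_bytes("rgb32", 1280, 720)` is 3686400 bytes and `frame_bytes("yuv420p", 1280, 720)` is 1382400 bytes. Frame size alone does not account for the measurements, though: yuv422p output is also much slower than yuv420p, which fits the observation that the pixel-format conversion itself is the costly step.
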
jamie-pate commented 3 years ago

https://www.khronos.org/registry/OpenGL/extensions/EXT/EXT_YUV_target.txt: this extension looks like the ideal way to fix it, but it's only available on mobile afaict. Check for GL_EXT_YUV_target at chrome://gpu.
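When the extension list has been copied out of chrome://gpu (the GL_EXTENSIONS row), the check can be done mechanically. A minimal sketch, assuming the list is a single space-separated string; `has_gl_extension` is a hypothetical helper and the sample string is just a short illustrative list:

```python
def has_gl_extension(extensions, name):
    """Return True if `name` is a whole token in a space-separated GL_EXTENSIONS string."""
    return name in extensions.split()


# Short illustrative extension list (not a full driver report).
sample = "GL_EXT_blend_minmax GL_EXT_texture_rg GL_OES_EGL_image_external GL_EXT_sRGB"
```

Here `has_gl_extension(sample, "GL_EXT_YUV_target")` returns False. Matching whole tokens rather than substrings matters because some extension names are prefixes of others, e.g. GL_OES_EGL_image and GL_OES_EGL_image_external.
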

JezSonic commented 3 years ago

Now I'm on my Lenovo Yoga 2 tablet with Android 5.0 on it. Its output for chrome://gpu doesn't include GL_EXT_YUV_target.

It looks like this:


Graphics Feature Status
Canvas: Hardware accelerated
Compositing: Hardware accelerated
Multiple Raster Threads: Disabled
Out-of-process Rasterization: Hardware accelerated
OpenGL: Enabled
Hardware Protected Video Decode: Unavailable
Rasterization: Hardware accelerated
Skia Renderer: Disabled
Surface Control: Disabled
Video Decode: Hardware accelerated
Vulkan: Disabled
WebGL: Hardware accelerated
WebGL2: Hardware accelerated
Driver Bug Workarounds
clear_uniforms_before_first_program_use
disable_chromium_framebuffer_multisample
dont_delete_source_texture_for_egl_image
dont_disable_webgl_when_compositor_context_lost
max_msaa_sample_count_4
max_texture_size_limit_4096
msaa_is_slow
remove_dynamic_indexing_of_swizzled_vector
scalarize_vec_and_mat_constructor_args
disabled_extension_GL_KHR_blend_equation_advanced
disabled_extension_GL_KHR_blend_equation_advanced_coherent
disabled_webgl_extension_EXT_disjoint_timer_query
disabled_webgl_extension_EXT_disjoint_timer_query_webgl2
Problems Detected
Protected video decoding with swap chain is for certain Intel and AMD GPUs on Windows: 1093625
Disabled Features: protected_video_decode
Clear uniforms before first program use on all platforms: 124764, 349137
Applied Workarounds: clear_uniforms_before_first_program_use
Always rewrite vec/mat constructors to be consistent: 398694
Applied Workarounds: scalarize_vec_and_mat_constructor_args
Multisampling has poor performance in Intel BayTrail: 443517
Applied Workarounds: disable_chromium_framebuffer_multisample
Limit max texure size to 4096 on all of Android
Applied Workarounds: max_texture_size_limit_4096
Disable KHR_blend_equation_advanced until cc shaders are updated: 661715
Applied Workarounds: disable(GL_KHR_blend_equation_advanced), disable(GL_KHR_blend_equation_advanced_coherent)
eglSwapBuffers intermittently fails on Android when app goes to background: 744678
Applied Workarounds: dont_disable_webgl_when_compositor_context_lost
On Intel GPUs MSAA performance is not acceptable for GPU rasterization. Duplicate of 132 for Android: 759471
Applied Workarounds: msaa_is_slow
Don't expose disjoint_timer_query extensions to WebGL unless site isolation is enabled: 808744
Limit MSAA to 4x on Android devices: 797243
Applied Workarounds: max_msaa_sample_count_4
Remove dynamic indexing for swizzled vectors on Android: 709351
Applied Workarounds: remove_dynamic_indexing_of_swizzled_vector
Some drivers seem to require as to use original texture whenever possible: 1052114, 1117370
Applied Workarounds: dont_delete_source_texture_for_egl_image
Raster is using a single thread.
Disabled Features: multiple_raster_threads
Surface Control has been disabled by Finch trial or command line.
Disabled Features: surface_control
Version Information
Data exported   2020-12-09T00:43:08.388Z
Chrome version  Chrome/86.0.4240.198
Operating system    Android 5.0.1
Software rendering list URL https://chromium.googlesource.com/chromium/src/+/d8a506935fc2273cfbac5e5b629d74917d9119c7/gpu/config/software_rendering_list.json
Driver bug list URL https://chromium.googlesource.com/chromium/src/+/d8a506935fc2273cfbac5e5b629d74917d9119c7/gpu/config/gpu_driver_bug_list.json
ANGLE commit id fee4fc126724
2D graphics backend Skia/86 b939c288f3d6479d88d1444fafbe7441d11348aa
Command Line    --top-controls-show-threshold=0.27 --top-controls-hide-threshold=0.17 --enable-viewport --validate-input-event-stream --main-frame-resizes-are-orientation-changes --disable-composited-antialiasing --enable-dom-distiller --flag-switches-begin --flag-switches-end
Driver Information
Initialization time 64
In-process GPU  false
Passthrough Command Decoder false
Sandboxed   false
GPU0    VENDOR= 0x0000 [Intel], DEVICE=0x0000 [Intel(R) HD Graphics for BayTrail] *ACTIVE*
Optimus false
AMD switchable  false
Driver vendor   
Driver version  1.0.0
GPU CUDA compute capability major version   0
Pixel shader version    3.10
Vertex shader version   3.10
Max. MSAA samples   4
Machine model name  YOGA Tablet 2-1050L
Machine model version   
GL_VENDOR   Intel
GL_RENDERER Intel(R) HD Graphics for BayTrail
GL_VERSION  OpenGL ES 3.1 - Build 1.0.0-R
GL_EXTENSIONS   GL_EXT_blend_minmax GL_EXT_multi_draw_arrays GL_EXT_texture_filter_anisotropic GL_EXT_texture_compression_s3tc GL_EXT_draw_buffers GL_EXT_color_buffer_float GL_EXT_draw_instanced GL_EXT_texture_rg GL_EXT_texture_buffer GL_INTEL_performance_queries GL_INTEL_performance_query GL_EXT_instanced_arrays GL_INTEL_fragment_shader_ordering GL_EXT_texture_storage GL_KHR_debug GL_OES_EGL_image GL_OES_depth24 GL_OES_packed_depth_stencil GL_OES_rgb8_rgba8 GL_OES_depth_texture GL_EXT_color_buffer_half_float GL_OES_vertex_half_float GL_EXT_shadow_samplers GL_OES_standard_derivatives GL_OES_mapbuffer GL_EXT_discard_framebuffer GL_EXT_texture_format_BGRA8888 GL_OES_compressed_paletted_texture GL_OES_EGL_image_external GL_OES_compressed_ETC1_RGB8_texture GL_OES_vertex_array_object GL_OES_get_program_binary GL_OES_texture_3D GL_OES_fbo_render_mipmap GL_OES_texture_float GL_OES_texture_float_linear GL_OES_texture_half_float GL_OES_texture_half_float_linear GL_OES_element_index_uint GL_OES_texture_npot GL_EXT_sRGB GL_EXT_sRGB_write_control GL_EXT_frag_depth GL_APPLE_texture_max_level GL_EXT_occlusion_query_boolean GL_EXT_texture_compression_dxt1 GL_OES_required_internalformat GL_EXT_separate_shader_objects GL_OES_surfaceless_context GL_OES_EGL_sync GL_EXT_robustness GL_EXT_texture_sRGB_decode GL_EXT_shader_texture_lod GL_EXT_unpack_subimage GL_EXT_read_format_bgra GL_EXT_debug_marker GL_KHR_blend_equation_advanced GL_OES_sample_variables GL_OES_shader_multisample_interpolation GL_OES_texture_stencil8 GL_OES_shader_image_atomic GL_OES_texture_storage_multisample_2d_array GL_INTEL_tessellation GL_INTEL_geometry_shader GL_EXT_shader_integer_mix GL_EXT_disjoint_timer_query
Disabled Extensions GL_KHR_blend_equation_advanced GL_KHR_blend_equation_advanced_coherent
Disabled WebGL Extensions   EXT_disjoint_timer_query EXT_disjoint_timer_query_webgl2
Window system binding vendor    
Window system binding version   
Window system binding extensions    
Direct rendering version    unknown
Reset notification strategy 0x8252
GPU process crash count 0
gfx::BufferFormats supported for allocation and texturing   R_8: not supported, R_16: not supported, RG_88: not supported, BGR_565: not supported, RGBA_4444: not supported, RGBX_8888: not supported, RGBA_8888: not supported, BGRX_8888: not supported, BGRA_1010102: not supported, RGBA_1010102: not supported, BGRA_8888: not supported, RGBA_F16: not supported, YVU_420: not supported, YUV_420_BIPLANAR: not supported, P010: not supported
Compositor Information
Tile Update Mode    One-copy
Partial Raster  Enabled
GpuMemoryBuffers Status
R_8 Software only
R_16    Software only
RG_88   Software only
BGR_565 Software only
RGBA_4444   Software only
RGBX_8888   Software only
RGBA_8888   Software only
BGRX_8888   Software only
BGRA_1010102    Software only
RGBA_1010102    Software only
BGRA_8888   Software only
RGBA_F16    Software only
YVU_420 Software only
YUV_420_BIPLANAR    Software only
P010    Software only
Display(s) Information
Info    Display[0] bounds=[0,0 1280x800], workarea=[0,0 1280x800], scale=1.5, rotation=0, panel_rotation=0 external.
Color space (all)   {primaries:BT709, transfer:IEC61966_2_1, matrix:RGB, range:FULL}
Buffer format (all) RGBA_8888
SDR white level in nits 100
Bits per color component    8
Bits per pixel  24
Video Acceleration Information
Encode vp8  0x0 to 1280x720 pixels, and/or 30.000 fps
Encode h264 baseline    0x0 to 1280x720 pixels, and/or 30.000 fps
Vulkan Information
Device Performance Information
JezSonic commented 3 years ago

I'll try later on other devices etc