kidrigger / godot-videodecoder

GDNative Video Decoder libraries for Godot Game Engine, using FFmpeg library for codecs. A Google Summer of Code Project, 2018
MIT License

Hardware Accelerated decoding #36

Open jamie-pate opened 3 years ago

jamie-pate commented 3 years ago

I'd love to add this, but unless we can figure out a zero-copy mechanism to get the texture into Godot in a native format, it does not look like it would improve performance appreciably. Only QSV decoding showed a small benefit when dumping raw frames to a null device (NUL, /dev/null) on slower CPUs. Please contribute any findings to this thread if you can find a case where rawvideo output is improved by hardware device decoding.

Additionally, we should test with -pix_fmt rgb32 before -f rawvideo, as that performs the rgb32 conversion required to show the frames in Godot.
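To put the copy cost in perspective, here is a rough sketch of how many bytes each decoded 1080p frame occupies in the pixel formats tested in this thread. The function names are made up for illustration, and the numbers assume 8-bit formats with no row padding or alignment, which real decoders may add.

```python
from fractions import Fraction

# Bytes per pixel for 8-bit formats, ignoring any row padding/alignment.
# rgb32:   packed, 4 bytes per pixel.
# yuv420p: full-size luma plane + two quarter-size chroma planes = 1.5 bytes.
# yuv422p: full-size luma plane + two half-size chroma planes = 2 bytes.
BYTES_PER_PIXEL = {
    "rgb32": Fraction(4),
    "yuv420p": Fraction(3, 2),
    "yuv422p": Fraction(2),
}


def frame_bytes(width, height, pix_fmt):
    """Size in bytes of one unpadded frame (hypothetical helper)."""
    return int(width * height * BYTES_PER_PIXEL[pix_fmt])


def copy_bandwidth(width, height, pix_fmt, fps):
    """Bytes per second that must be moved if every frame is copied once."""
    return frame_bytes(width, height, pix_fmt) * fps


if __name__ == "__main__":
    for fmt in ("yuv420p", "yuv422p", "rgb32"):
        size = frame_bytes(1920, 1080, fmt)
        print(f"{fmt}: {size} bytes/frame, "
              f"{copy_bandwidth(1920, 1080, fmt, 30)} bytes/s at 30 fps")
```

An rgb32 frame is roughly 2.7 times the size of the yuv420p frame the decoder produces, so any path that converts and copies frames through system memory pays that cost regardless of where decoding happens.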

The -pix_fmt rgb32 rows in this table show the expected result: 173 fps with qsv vs 169 fps without, with CPU load reduced from 37% to 33%.

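| Hardware | OS | Cmdline | hwaccel | FPS | Speed | Video Decode GPU (from Task Manager) | Video |
|---|---|---|---|---|---|---|---|
| i7-8750H GeForce GTX 1070 with Max-Q Design/PCIe/SSE2 | Windows | ffmpeg -y -i out9-fhd.webm -f rawvideo NUL | (none) | 743 | 29.7x | | Video: vp9 (Profile 0), yuv420p(tv, progressive), 1920x1080, SAR 1:1 DAR 16:9, 25 fps 250kb/s 49.60s |
| | | ffmpeg -y -hwaccel cuda -i out9-fhd.webm -f rawvideo NUL | cuda | 342 | 13.7x | nvidia | |
| | | | dxva2 | 269 | 10.8x | Intel | |
| | | | qsv | 582 | 23.3x | | |
| | | | d3d11va | 167 | 6x | | |
| i7-8650U UHD Graphics 620 (KBL GT2) | Linux | | (none) | 578 | 23.1x | | |
| | | | vdpau | 514 | 20.6x | | |
| | | | vaapi | 45 | 1.8x | | |
| | | | drm | | | Device creation failed: -14. | |
| | | | opencl | 588 | 23.5x | | |
| | | | cuvid | | | need nvidia | |
| | | | cuda | | | need nvidia | |
| | Windows | | (none) | 409 | 16.3x | | |
| | | | cuda | | | need nvidia | |
| | | | dxva2 | 189 | 7.5x | Intel | |
| | | | qsv | 406 | 16.3x | | |
| | | | d3d11va | 99 | 3.94x | Intel | |
| | | | opencl | 414 | 16.6x | | |
| | | | vulkan | 405 | 16.2x | | |
| | | 8a41a2c3....webm | | | | | |
| | | | (none) | 397 | 13.2x | | |
| | | | dxva2 | 209 | 6.95x | | |
| | | | qsv | 430 | 14.3x | | |
| | | | d3d11va | 100 | 3.32x | | |
| | | | opencl | 382 | 12.7x | | |
| | | | vulkan | 411 | 13.7x | | |
| | | -pix_fmt rgb32 -f rawvideo | (none) | 169 | 5.64x | | |
| | | | qsv | 173 | 5.76x | | |
| i7-8750H GeForce GTX 1070 with Max-Q Design/PCIe/SSE2 | Windows | 8a41a2c3....webm | | | | | vp9 (Profile 0), yuv420p(tv), 1920x1080, SAR 229:252 DAR 916:567, 30 fps 68 kb/s 14m 15.94s |
| | | | (none) | 729 | 24.2x | | |
| | | | d3d11va | 142 | 4.75x | | |
| | | | qsv | 769 | 25.6x | | |
| | | | dxva2 | 352 | 11.7x | | |
| i5-5200U Intel HD Graphics 5500 | Windows | ffmpeg -y -i 8a41a2c3....webm -f rawvideo NUL | (none) | 234 | 7.8x | | |
| | | | qsv | 237 | 7.91x | | |
| | | -pix_fmt rgb32 -f rawvideo | (none) | 87 | 2.9x | | |
| | | -pix_fmt rgb32 -f rawvideo | qsv | 87 | 2.9x | | |
| | | -pix_fmt yuv420p -f rawvideo | qsv | 245 | 8.18x | | |
| | | -pix_fmt yuv420p -f rawvideo | (none) | 248 | 8.26x | | |
| | | -pix_fmt yuv422p -f rawvideo | (none) | 111 | 3.71x | | |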

The Windows build was from https://github.com/BtbN/FFmpeg-Builds/releases, dated 2020-11-23 (shared-vulkan when possible).
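For anyone who wants to reproduce or extend these measurements, here is a minimal sketch of a helper that assembles the same style of ffmpeg command line and pulls the final fps and speed figures out of ffmpeg's progress output. The function names are hypothetical, and the parser assumes the usual `fps=... speed=...x` status line that ffmpeg writes to stderr; it is not the script used to produce the numbers above.

```python
import os
import re


def build_benchmark_cmd(video, hwaccel=None, pix_fmt=None):
    """Build an ffmpeg decode benchmark command (hypothetical helper).

    -hwaccel is an input option, so it goes before -i.
    -pix_fmt is an output option, placed before -f rawvideo.
    os.devnull is the platform's null device (NUL or /dev/null).
    """
    cmd = ["ffmpeg", "-y"]
    if hwaccel:
        cmd += ["-hwaccel", hwaccel]
    cmd += ["-i", video]
    if pix_fmt:
        cmd += ["-pix_fmt", pix_fmt]
    cmd += ["-f", "rawvideo", os.devnull]
    return cmd


# Matches one ffmpeg status line, e.g. "frame= 100 fps= 25 ... speed=1.0x".
_STATS = re.compile(r"fps=\s*([\d.]+).*?speed=\s*([\d.]+)x")


def parse_last_stats(stderr_text):
    """Return (fps, speed) from the last status line, or None if absent."""
    matches = _STATS.findall(stderr_text)
    if not matches:
        return None
    fps, speed = matches[-1]
    return float(fps), float(speed)


if __name__ == "__main__":
    # Print the command only; run it with subprocess.run(cmd,
    # capture_output=True, text=True) and pass .stderr to parse_last_stats.
    print(" ".join(build_benchmark_cmd("out9-fhd.webm", "qsv", "rgb32")))
```

Results depend heavily on the FFmpeg build and drivers, so it is worth recording the build alongside any numbers posted here.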

jamie-pate commented 3 years ago

out9-fhd.zip

jamie-pate commented 3 years ago

One possible option is to copy the texture using EGL or similar: http://wiki.100ask.org/EGL_texture_0-copy

mpv player looks like a good example using ffmpeg

https://github.com/mpv-player/mpv/blob/80c4aaa2a4e7ada6530ad4f16172283cd82fcc1d/libmpv/render_gl.h#L133

Seems like mpv can render directly to hw contexts:

https://github.com/mpv-player/mpv/blob/802f594a857c703ac88e946d14b69cd3b6eb6006/video/out/opengl/hwdec_dxva2egl.c#L320

https://github.com/mpv-player/mpv/blob/172146e9f7a231b5de21921d883612d18b13a717/video/decode/vd_lavc.c

Something something framebuffers https://learnopengl.com/Advanced-OpenGL/Framebuffers

jamie-pate commented 3 years ago

For Linux, it looks like we'd need both VA-API and VDPAU: https://wiki.debian.org/HardwareVideoAcceleration (prefer VA-API except with NVIDIA proprietary drivers)

VA-API - Supported on Intel, AMD, and NVIDIA (only via the open-source Nouveau drivers). Widely supported by software, including Kodi, VLC, MPV, Chromium, and Firefox. Main limitation is lacking any support in the proprietary NVIDIA drivers.

VDPAU - Supported fully on AMD and NVIDIA (both proprietary and Nouveau). Supported by most desktop applications like Kodi, VLC, and MPV, but has no support at all in Chromium or Firefox. Main limitations are poor and incomplete Intel support and not working with browsers for web video acceleration.
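The rule of thumb from the Debian wiki could be sketched as a small selection function. This is only an illustration of the policy quoted above: `pick_linux_hwaccel` and its parameters are hypothetical names, and detecting the GPU vendor and driver type is left out entirely.

```python
def pick_linux_hwaccel(gpu_vendor, proprietary_driver=False):
    """Pick an ffmpeg -hwaccel name on Linux (hypothetical helper).

    Policy from the Debian wiki summary: prefer VA-API everywhere except
    on NVIDIA with the proprietary driver, which lacks VA-API support,
    so fall back to VDPAU there.
    """
    vendor = gpu_vendor.strip().lower()
    if vendor == "nvidia" and proprietary_driver:
        return "vdpau"
    return "vaapi"


if __name__ == "__main__":
    print(pick_linux_hwaccel("Intel"))
    print(pick_linux_hwaccel("NVIDIA", proprietary_driver=True))
```

Note that the benchmark table in this thread shows vaapi performing poorly on the i7-8650U test machine, so a real implementation would probably want to measure rather than rely on this preference alone.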

jamie-pate commented 3 years ago

There are licensing considerations with libva, which includes binary blobs (this shouldn't be an issue as long as we avoid GPL licensing): https://github.com/intel/libva/issues/118