intel / libva

Libva is an implementation for VA-API (Video Acceleration API)
http://intel.github.io/libva/

2.17 breaks hardware video decoding in Chromium #677

Open gladykov opened 1 year ago

gladykov commented 1 year ago

libva 2.17 breaks hw video decode with or without vulkan on Chromium

See https://bbs.archlinux.org/viewtopic.php?pid=2080054#p2080054

According to the author, this commit broke it: https://github.com/intel/libva/commit/ef1df02f3ad45ac98b1fa57c373176d7c14dcc57

shtirlic commented 1 year ago

Also here https://bugs.chromium.org/p/chromium/issues/detail?id=1407223#c10

mtak-misc commented 1 year ago

The output of vainfo differs in my environment.

With https://github.com/intel/libva/commit/ef1df02f3ad45ac98b1fa57c373176d7c14dcc57 applied, "libva error: /usr/lib/dri/iHD_drv_video.so init failed" is displayed, as shown here:

Trying display: wayland
Trying display: x11
libva error: /usr/lib/dri/iHD_drv_video.so init failed
vainfo: VA-API version: 1.17 (libva 2.17.1)
vainfo: Driver version: Intel i965 driver for Intel(R) Sandybridge Mobile - 2.4.1
vainfo: Supported profile and entrypoints
      VAProfileMPEG2Simple            : VAEntrypointVLD
      VAProfileMPEG2Main              : VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointEncSlice
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointEncSlice
      VAProfileH264High               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointEncSlice
      VAProfileH264StereoHigh         : VAEntrypointVLD
      VAProfileVC1Simple              : VAEntrypointVLD
      VAProfileVC1Main                : VAEntrypointVLD
      VAProfileVC1Advanced            : VAEntrypointVLD
      VAProfileNone                   : VAEntrypointVideoProc

With https://github.com/intel/libva/commit/ef1df02f3ad45ac98b1fa57c373176d7c14dcc57 reverted, "libva error: /usr/lib/dri/iHD_drv_video.so init failed" is not displayed, as shown here. The message does not appear with libva 2.16.0 either.

Trying display: wayland
Trying display: x11
vainfo: VA-API version: 1.17 (libva 2.17.1)
vainfo: Driver version: Intel i965 driver for Intel(R) Sandybridge Mobile - 2.4.1
vainfo: Supported profile and entrypoints
      VAProfileMPEG2Simple            : VAEntrypointVLD
      VAProfileMPEG2Main              : VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointEncSlice
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointEncSlice
      VAProfileH264High               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointEncSlice
      VAProfileH264StereoHigh         : VAEntrypointVLD
      VAProfileVC1Simple              : VAEntrypointVLD
      VAProfileVC1Main                : VAEntrypointVLD
      VAProfileVC1Advanced            : VAEntrypointVLD
      VAProfileNone                   : VAEntrypointVideoProc

chromer030 commented 1 year ago

Same issue on Arch Linux, tested with Chromium and Google Chrome.

guiodic commented 1 year ago

I can confirm the issue on Arch Linux with a TGL Intel iGPU. After reverting the commit, everything works fine in Chrome/Chromium (vainfo shows no errors with or without the commit).

rkolbaskin commented 1 year ago

Vivaldi outputs these error messages whenever I try to play any video.

[2273:2273:0122/130638.210519:ERROR:vaapi_wrapper.cc(2684)] vaPutSurface failed, VA error: invalid parameter
[2273:2273:0122/130638.210775:ERROR:vaapi_video_decode_accelerator.cc(286)] Failed putting surface into pixmap

These errors don't show up if I recompile libva with the changes that the aforementioned commit made to va/x11/va_x11.c reverted.

adi8900 commented 1 year ago

The same happens for me on EndeavourOS (Arch-based): building the package without the mentioned commit makes hw video decode work again in Chromium-based browsers (tested Chrome). But I didn't get any errors in vainfo with or without that commit.

evgeniy-harchenko commented 1 year ago

Same for Manjaro, tested on Google Chrome, Yandex Browser and Chromium.

mtak-misc commented 1 year ago

Alpine Linux v3.17.1 also has the same problem in Chromium, but not in Firefox. So the problem may not be Linux distribution specific, but web browser specific.

shtirlic commented 1 year ago

Also might be related https://bugs.chromium.org/p/chromium/issues/detail?id=1399897 (Issue 1399897: Vaapi On GL (don't rely on vulkan))

evelikov commented 1 year ago

I wonder why I don't hit this issue with either Firefox or Chromium on my Arch box. Both --use-gl=egl and --use-gl=desktop work for me... never really tried Vulkan.

Does Chromium work without any flags (be that command line or chromium-flags.conf)? Does disabling the sandbox help? It would be unlikely, but it's worth a shot.

evelikov commented 1 year ago

@mtak-misc if you can trace what exactly causes the iHD driver to fail, that would be greatly appreciated. You might need to either a) start a debugger or b) add some printf() calls to the driver code base to find your way around.

My gut feeling is that the driver isn't gracefully falling back to DRI2... Considering the iHD complexity, it'll be better if @dvrogozh, @XinfengZhang or anyone else on the Intel team fixes that.

Will also test the old i965 driver tomorrow + open a PR if needed - bear in mind that getting MRs to land in there is like pulling teeth :sweat_smile:

evelikov commented 1 year ago

Based on a quick look - this seems to be a bug in iHD, since it doesn't handle DRI3. It could be fixed by reworking the DeviceName handling although that's quite a lot of work - should have a PR by the end of the week.

tmn505 commented 1 year ago

This bug doesn't only affect Chromium and the iHD driver. I tested the Xine player on CherryView (i965) and it also fails to decode when launched with -V vaapi. Reverting the mentioned commit fixes the issue. When tested with mpv, everything is OK with or without the revert, even when running with --hwdec=vaapi.

evelikov commented 1 year ago

@tmn505 off the top of my head - does Xine use vaPutSurface by any chance?

Edit: Should have looked at the linked bug I guess - yes it does. Xine should probably migrate to using vaExportSurfaceHandle instead. Either that, or someone needs to add DRI3 support to iHD and i965 - Mesa has had a code path for 3+ years.

tmn505 commented 1 year ago

I think so, since grepping the code turns it up:

src/video_out/vaapi/xine_va_display.h:#define XINE_VA_DISPLAY_X11    0x0002  /* Require X11 interop (vaPutSurface)     */
src/video_out/video_out_vaapi.c:        vaStatus = vaPutSurface(va_context->va_display, va_surface_id, this->gl_image_pixmap,
src/video_out/video_out_vaapi.c:        msg = "vaPutSurface()";
src/video_out/video_out_vaapi.c:      vaStatus = vaPutSurface(va_context->va_display, va_surface_id, this->window,
src/video_out/video_out_vaapi.c:      if(!vaapi_check_status(this, vaStatus, "vaPutSurface()"))

I would need to try xine's verbose logging to be sure (I don't know if that'll actually say anything). I can't do it right now; I will do that tomorrow. Aside from that, the project https://github.com/ua0lnj/vdr-plugin-softhddevice/issues/52 definitely uses it.

XinfengZhang commented 1 year ago

vaPutSurface is used to copy surfaces to X11, and there should be different implementations for DRI2 and DRI3. But media_driver currently supports only DRI2, and it looks like this breaks existing applications. Implementing it in media_driver would resolve only part of the issue, and only for Intel platforms. How about a patch inside libva that wraps the dri_vtable with different implementations: detect the DRI version and check it in va_dri_get_drawable and va_dri_get_rendering_buffer to select a different code path? The backend driver would then call these functions to get a buffer and copy the surface data into it.

tmn505 commented 1 year ago

@evelikov to follow up on yesterday's request, here's the log from xine with verbose output: xine-1675267494.txt. It does indeed use vaPutSurface (video_out_vaapi Error : vaPutSurface(): unknown libva error).

Also, to clear up a small misunderstanding: the project I linked in this issue is unrelated to Xine in any way. I first spotted the lack of video output in ua0lnj/vdr-plugin-softhddevice (it uses ffmpeg); then, to test whether it occurs elsewhere, I took the Xine player (for simplicity of the test case).

evelikov commented 1 year ago

@XinfengZhang looks like a driver-specific hack, but doable - will open a PR tomorrow. The more important question is how long it will take to release fixed iHD and i965 drivers which honour that logic. The issues and PR sections of the latter look like a graveyard :cry:

XinfengZhang commented 1 year ago

@evelikov, thanks. I agree that this is something like a workaround: if a backend driver does not call va_dri_get_rendering_buffer and implements this itself, it will certainly still fail. But at least we avoid having to fix it in i965 and iHD :)

Actually, I don't like vaPutSurface; I prefer to retrieve the surface fd with vaExportSurfaceHandle and then import it for display ...

evelikov commented 1 year ago

@XinfengZhang your suggestion doesn't quite work - so I suggest merging https://github.com/intel/libva/pull/679 and rolling 2.17.1 ASAP.

In particular: your suggestion boils down to the driver init function erroring out, and the libva frontend then falling back to DRI2. That is currently not possible with libva due to the way DRI2/DRI3 initialisation is handled and driver names are retrieved. When init fails for a given driver (name), we try another one; we do not re-check for DRI2.
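The flow described above can be modelled in a few lines. This is a hypothetical Python sketch, not libva's actual code: the function `open_driver` and its parameters are invented for illustration, and the only behaviour it encodes is what this comment states (the DRI backend is settled when driver names are retrieved, and a failed init only moves on to the next name).

```python
def open_driver(dri3_names, dri2_names, init):
    """Hypothetical model of the described flow; not libva's real code.

    dri3_names / dri2_names: driver names each backend would report.
    init: callable(name, backend) -> bool, True if the driver initialises.
    """
    # The backend is decided once, when the driver names are retrieved.
    if dri3_names:
        backend, names = "DRI3", dri3_names
    else:
        backend, names = "DRI2", dri2_names

    for name in names:
        if init(name, backend):
            return name, backend
        # init failed: try the next name; the backend is never re-checked.
    return None
```

With a driver whose init rejects DRI3, `open_driver(["iHD", "i965"], ["iHD", "i965"], lambda name, backend: backend == "DRI2")` returns `None` in this model: every name is tried under DRI3 and DRI2 is never revisited, which is why an erroring init alone would not give the fallback.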

I have a WIP to fix that, but it's a lot of code churn and not something suitable for a bugfix release.

adi8900 commented 1 year ago

Yeah, it works here; I just messed up and set the env flag in the wrong file... meh

evelikov commented 1 year ago

We've done all we can on libva side. I would suggest opening a bug with the problematic driver(s) and helping the devs add DRI3 support.

gladykov commented 1 year ago

I can confirm it works with export LIBVA_DRI3_DISABLE=1 , intel_media_driver 23.1.0 and libva 2.18 using Opera with Chromium 111.0.5536.111
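The same workaround can be applied to a single process instead of the whole shell session. Below is a minimal Python sketch; only the LIBVA_DRI3_DISABLE variable comes from this thread, while the helper names `dri2_env` and `launch` and the example command are assumptions made for illustration.

```python
import os
import subprocess


def dri2_env(base=None):
    """Return a copy of the environment with libva's DRI3 path disabled."""
    env = dict(os.environ if base is None else base)
    env["LIBVA_DRI3_DISABLE"] = "1"
    return env


def launch(command):
    """Start `command` (a list such as ["chromium"]) with the workaround set."""
    return subprocess.Popen(command, env=dri2_env())
```

For example, `launch(["chromium"])` would start the browser with the variable set only for that process, assuming a `chromium` binary is on the PATH. According to the reports in this thread, the variable takes effect with libva 2.18.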

XinfengZhang commented 1 year ago

@evelikov, maybe we should try DRI2 first, then DRI3; that would resolve this problem.

koshikas commented 1 year ago

> We've done all we can on libva side. I would suggest opening a bug with the problematic driver(s) and helping the devs add DRI3 support.

I might be able to help if you can tell me what you want done. My setup:

vainfo:
Trying display: x11
vainfo: VA-API version: 1.18 (libva 2.17.1)
vainfo: Driver version: Intel i965 driver for Intel(R) Ivybridge Mobile - 2.4.1

Graphics:
  Device-1: Intel 3rd Gen Core processor Graphics vendor: Dell driver: i915 v: kernel arch: Gen-7 process: Intel 22nm built: 2012-13 ports: active: LVDS-1 empty: DP-1,HDMI-A-1,VGA-1 bus-ID: 00:02.0 chip-ID: 8086:0166 class-ID: 0300
  Display: x11 server: X.Org v: 21.1.8 compositor: kwin_x11 driver: X: loaded: modesetting,radeon alternate: fbdev,vesa dri: crocus,r600 gpu: i915 display-ID: :0 screens: 1
  API: OpenGL v: 4.2 Mesa 23.0.1 renderer: Mesa Intel HD Graphics 4000 (IVB GT2) direct-render: Yes

VAAPI works all right in its stock state in media players, but fails in Chrome with these errors:

[26733:26733:0221/191510.033476:ERROR:vaapi_wrapper.cc(2699)] : vaPutSurface failed, VA error: unknown libva error
[26733:26733:0221/191510.033613:ERROR:vaapi_video_decode_accelerator.cc(286)] : Failed putting surface into pixmap

Functionality in Chrome resumes after setting the environment variable LIBVA_DRI3_DISABLE=1 (since libva 2.18).

evelikov commented 1 year ago

@koshikas you want to prod the Intel developers to add DRI3 support to the driver, which seems borderline abandoned ... even trivial fixes have been waiting for months with zero feedback :frowning_face:

evelikov commented 1 year ago

@XinfengZhang you really do not want to do that for DRI3 capable devices.

XinfengZhang commented 1 year ago

  1. What is the purpose of the DRI3 enabling patch? I think it resolves the problem that the driver name cannot be obtained through the DRI2 interfaces when the environment supports only DRI3 and not DRI2, such as XWayland.
  2. What is the current problem? There is no implementation of vaPutSurface based on DRI3.

From this perspective, trying DRI2 first could cover both scenarios. In a DRI3-only environment, the driver can be detected through the DRI3 interfaces. In an environment with both DRI2 and DRI3, DRI2 is tried first, so vaPutSurface works.
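The proposed probe order can be illustrated with a small model. This is a hypothetical Python sketch of the proposal only, not libva code: `pick_backend` and its parameters are invented names, and the sketch says nothing about the objection raised earlier in the thread for DRI3-capable devices.

```python
def pick_backend(has_dri2, has_dri3, order=("DRI2", "DRI3")):
    """Return the first backend in `order` that the environment supports.

    Hypothetical model: has_dri2 / has_dri3 describe the environment, and
    `order` is the probe order (DRI2 first, as proposed in this comment).
    """
    available = {"DRI2": has_dri2, "DRI3": has_dri3}
    for backend in order:
        if available[backend]:
            return backend
    return None
```

In this model a DRI3-only environment (`pick_backend(False, True)`) still resolves to DRI3, while an environment with both resolves to DRI2, where vaPutSurface is implemented. Passing `order=("DRI3", "DRI2")` reproduces the DRI3-first ordering for comparison.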

terr72 commented 1 year ago

> 2. What is the current problem? There is no implementation of vaPutSurface based on DRI3.

I'm experiencing similar problems with xine using libva2 (2.14.0-1, Ubuntu Jammy) and iHD 22.3.1 drivers. I get video_out_vaapi Error : vaPutSurface(): unknown libva error as verbose output. Desktop is Xfce, so no Wayland, just X11.

Do I understand this right:

  • libva2 versions up to 2.17 have no possibility to disable DRI3 at all in order to fall back to DRI2, and therefore put out these errors?
  • libva2 versions >= 2.18 can fall back to DRI2 via LIBVA_DRI3_DISABLE=1 to get vaPutSurface() support?

Or would my problem persist in libva2 >= 2.18 since the iHD driver with missing vaPutSurface() support is the actual reason for the problem?

XinfengZhang commented 1 year ago

> > 2. What is the current problem? There is no implementation of vaPutSurface based on DRI3.
>
> I'm experiencing similar problems with xine using libva2 (2.14.0-1, Ubuntu Jammy) and iHD 22.3.1 drivers. I get video_out_vaapi Error : vaPutSurface(): unknown libva error as verbose output. Desktop is Xfce, so no Wayland, just X11.
>
> Do I understand this right:
>
>   • libva2 versions up to 2.17 have no possibility to disable DRI3 at all in order to fall back to DRI2, and therefore put out these errors?
>   • libva2 versions >= 2.18 can fall back to DRI2 via LIBVA_DRI3_DISABLE=1 to get vaPutSurface() support?

Both of these are correct.

Also, it could be fixed by #716; for 2.18, see https://github.com/intel/libva/pull/716#issuecomment-1623002713

cfoch commented 10 months ago

> vaPutSurface is used to copy surfaces to X11, and there should be different implementations for DRI2 and DRI3. But media_driver currently supports only DRI2, and it looks like this breaks existing applications. Implementing it in media_driver would resolve only part of the issue, and only for Intel platforms. How about a patch inside libva that wraps the dri_vtable with different implementations: detect the DRI version and check it in va_dri_get_drawable and va_dri_get_rendering_buffer to select a different code path? The backend driver would then call these functions to get a buffer and copy the surface data into it.

Why isn't there a function analogous to dri2CreateDrawable for DRI3? How could it be implemented?