intel / libva

Libva is an implementation for VA-API (Video Acceleration API)
http://intel.github.io/libva/
Other
664 stars 303 forks source link

regression caused by #682 #722

Closed XinfengZhang closed 1 year ago

XinfengZhang commented 1 year ago

there are one regression caused by #682 , current finding: libva info: VA-API version 1.19.0 libva error: vaGetDriverNames() failed with operation failed libva error: va_new_opendriver:715: Out of memory

from app side vaInitialize, err=0x2

xdpyinfo | grep DRI Windows-DRI

it maybe not a libva issue, maybe just this change expose potential issues. I will paste more information after some investigation

evelikov commented 1 year ago

Seems unlikely to be caused by the PR but if you can provide meaningful details I can take a look.

XinfengZhang commented 1 year ago

just have a quick check: the num_drivers is initialized as 20 then after vaStatus = pDisplayContext->vaGetDriverNames(pDisplayContext, drivers, &num_drivers); it is 4 looks some error happened in vaGetDriverNames,

after const char const va_drivers = map[i].va_driver; https://github.com/intel/libva/blob/master/va/drm/va_drm_utils.c#L90

the va_drivers include 4 strings, "iHD" "i965" "pvrsrvkm" "pvr" strange part is map[0].va_driver still only have first two;

XinfengZhang commented 1 year ago

please ignore previous comments, previous debug info looks a env issue.

let re-start the issue debug:

the issue actually is: there are two vaInitialize call, 1st success. 2nd failed with libva error: vaGetDriverNames() failed with operation failed libva error: va_new_opendriver:715: Out of memory

the second time, va_drm_authenticate failed with the ctx->display_type = VA_DISPLAY_DRM drmGetMagic get magic num : magic = 2

so, the call failed , and the num_drivers is still 20 and there are no env set.

after revert #682, 2 initialization both success.

XinfengZhang commented 1 year ago

and before #682, the ctx->display_type also is 48 VA_DISPLAY_DRM but the magic = 0x7fff; the authentication still failed but there are such statement if (vaStatus != VA_STATUS_SUCCESS) { num_candidates = 1; }

then , it still work....

evelikov commented 1 year ago

Not sure I follow - the formatting is all over the place making it almost impossible to understand. Do you have a simple reproducer to share? Ideally we should add it to the libva test suite.

The latter living in libva-utils, which makes zero sense and should be moved, but I digress.

XinfengZhang commented 1 year ago

there are 2 problems 1st. problem: could be reproduced with "vainfo --display x11" there are no DRI2 & DRI3 support, then both va_DRI3_GetDriverNames & va_DRI2_GetDriverNames will failed. then https://github.com/intel/libva/blob/master/va/va.c#L686 , error message shows. but the num_drivers is still 20 then it will failed again in https://github.com/intel/libva/blob/master/va/va.c#L715

I suppose we should set the num_drivers to be zero , if https://github.com/intel/libva/blob/master/va/va.c#L685 failed.

XinfengZhang commented 1 year ago

2nd problem: could be reproduced with "vainfo --display drm" with this patch:

diff --git a/va/drm/va_drm_utils.c b/va/drm/va_drm_utils.c index 5f378d5..7846a6c 100644 --- a/va/drm/va_drm_utils.c +++ b/va/drm/va_drm_utils.c @@ -91,13 +91,18 @@ VA_DRM_GetDriverNames(VADriverContextP ctx, char *drivers, unsigned num_driver

         while (va_drivers[count]) {
             if (count < MAX_NAMES && count < *num_drivers)

using above patch: we could see: driver candidate 0, iHD, iHD driver candidate1 0, iHD, iHD driver candidate 1, i965, i965 driver candidate1 1, i965, i965 driver candidate1 2, (null), pvrsrvkm driver candidate1 3, (null), pvr

a simple fix is move the count++ under the protection of if (count < MAX_NAMES

XinfengZhang commented 1 year ago

about the issue I mentioned in previous comments:

and before #682, the ctx->display_type also is 48 VA_DISPLAY_DRM but the magic = 0x7fff; the authentication still failed but there are such statement if (vaStatus != VA_STATUS_SUCCESS) { num_candidates = 1; }

then , it still work....

it is very hard to reproduce, because it actually about such case: drm authentication failed in va_DisplayContextConnect https://github.com/intel/libva/blob/master/va/drm/va_drm.c#L62

and then initialize failed

but before #682 , even the authentication failed, the code still could run under this protection: if (vaStatus != VA_STATUS_SUCCESS) { num_candidates = 1; }

so, it will still get the first candidate name.

it maybe is a application issue, need to understand why the authentication failed. I guess in below function:

bool va_drm_authenticate(int fd, uint32_t magic) { / XXX: try to authenticate through Wayland, etc. /

ifdef HAVE_VA_X11

if (va_drm_authenticate_x11(fd, magic))
    return true;

endif

/* Default: root + master privs are needed for the following call */
return drmAuthMagic(fd, magic) == 0;

}

suppose it is because of the second vaInitialize.

XinfengZhang commented 1 year ago

a simple sample to reproduce the issue, mentioned here https://github.com/intel/libva/issues/722#issuecomment-1632087963 :

include

include

include <sys/types.h>

include <sys/stat.h>

include

include <va/va.h>

include <va/va_drm.h>

void main() {

int fd0,fd1; VADisplay va_dpy0, va_dpy1; int major_version, minor_version; fd0 = open("/dev/dri/card0", O_RDWR); fd1 = open("/dev/dri/card0", O_RDWR);

va_dpy0 = vaGetDisplayDRM(fd0); vaInitialize(va_dpy0, &major_version, &minor_version);

va_dpy1 = vaGetDisplayDRM(fd1); vaInitialize(va_dpy1, &major_version, &minor_version);

vaTerminate(va_dpy0); vaTerminate(va_dpy1);

close(fd0); close(fd1); }

second vaInitialize will cause drmAuthMagic failure with errno=13

evelikov commented 1 year ago

Thanks for the example. Off the top of my head this behaviour is normal and expected.

In more detail: the first client (regardless if that's libva or Xorg or Weston or...) is always master and authenticated. Since the first client has not a) closed the fd or b) explicitly dropped the master - any other client who tries to issue ioctls that requires master will fail with EACCESS aka 13.

This is part of the DRM fundamentals - trying to relax that this will open a massive security hole. I would strongly recommend using the render node, unless you absolutely need to use the card node.

What program/project is that? If open-source I can send a fix their way.

XinfengZhang commented 1 year ago

the 3rd issue should be expected. I will sync with that app owners but I suppose 1st and 2nd issue still need to be addressed. even these 2 issues does not impact use case.

evelikov commented 1 year ago

1st. problem: could be reproduced with "vainfo --display x11" there are no DRI2 & DRI3 support, then both va_DRI3_GetDriverNames & va_DRI2_GetDriverNames will failed.

What do you mean there are no DRI2 & DRI3 support? Can you provide an example of what hardware/software I need?

evelikov commented 1 year ago

2nd problem: could be reproduced with "vainfo --display drm" with this patch:

The automatic markdown conversion has rendered the patch useless. Can you wrap it in backticks as per the documentation

XinfengZhang commented 1 year ago

please help to review PR #731 as one fix