Closed eero-t closed 2 years ago
I've tried stracing the syscalls used in both cases, but I still do not fully understand how VPL messes this up. There are no failing file system checks from which libvpl could deduce that the backend is missing:
$ strace -f -e file vpl-inspect
execve("/usr/local/bin/vpl-inspect", ["vpl-inspect"], 0x7ffcf41e8cf8 /* 10 vars */) = 0
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=16809, ...}, AT_EMPTY_PATH) = 0
openat(AT_FDCWD, "/usr/local/lib/libvpl.so.2", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0755, st_size=249400, ...}, AT_EMPTY_PATH) = 0
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libstdc++.so.6", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=2260320, ...}, AT_EMPTY_PATH) = 0
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libgcc_s.so.1", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=125488, ...}, AT_EMPTY_PATH) = 0
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=2216304, ...}, AT_EMPTY_PATH) = 0
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libm.so.6", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=940560, ...}, AT_EMPTY_PATH) = 0
openat(AT_FDCWD, "/usr/lib/x86_64-linux-gnu", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
newfstatat(3, "", {st_mode=S_IFDIR|0755, st_size=16384, ...}, AT_EMPTY_PATH) = 0
openat(AT_FDCWD, "/lib", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
newfstatat(3, "", {st_mode=S_IFDIR|0755, st_size=4096, ...}, AT_EMPTY_PATH) = 0
openat(AT_FDCWD, "/usr/lib", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
newfstatat(3, "", {st_mode=S_IFDIR|0755, st_size=4096, ...}, AT_EMPTY_PATH) = 0
openat(AT_FDCWD, "/lib64", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
newfstatat(3, "", {st_mode=S_IFDIR|0755, st_size=4096, ...}, AT_EMPTY_PATH) = 0
openat(AT_FDCWD, "/usr/lib64", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
newfstatat(3, "", {st_mode=S_IFDIR|0755, st_size=4096, ...}, AT_EMPTY_PATH) = 0
getcwd("/home/nobody", 4096) = 13
openat(AT_FDCWD, "/home/nobody", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
newfstatat(3, "", {st_mode=S_IFDIR|0755, st_size=4096, ...}, AT_EMPTY_PATH) = 0
openat(AT_FDCWD, "/opt/intel/mediasdk/lib", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/opt/intel/mediasdk/lib64", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = -1 ENOENT (No such file or directory)
newfstatat(1, "", {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0), ...}, AT_EMPTY_PATH) = 0
Warning - no implementations found by MFXEnumImplementations()
+++ exited with 0 +++
Whereas with LD_LIBRARY_PATH, it just "magically" decides to try loading the relevant libraries, and finds them (to reduce output, only successfully opened files are shown):
$ LD_LIBRARY_PATH=/usr/local/lib strace -f -e openat,newfstatat vpl-inspect 2>&1 | grep -v ENOENT | grep -e openat -e newfstat | head -30
openat(AT_FDCWD, "/usr/local/lib/libvpl.so.2", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0755, st_size=249400, ...}, AT_EMPTY_PATH) = 0
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=16809, ...}, AT_EMPTY_PATH) = 0
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libstdc++.so.6", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=2260320, ...}, AT_EMPTY_PATH) = 0
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libgcc_s.so.1", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=125488, ...}, AT_EMPTY_PATH) = 0
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=2216304, ...}, AT_EMPTY_PATH) = 0
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libm.so.6", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=940560, ...}, AT_EMPTY_PATH) = 0
openat(AT_FDCWD, "/usr/local/lib", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
*** above was not done in previous case
newfstatat(3, "", {st_mode=S_IFDIR|0755, st_size=4096, ...}, AT_EMPTY_PATH) = 0
openat(AT_FDCWD, "/usr/lib/x86_64-linux-gnu", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
newfstatat(3, "", {st_mode=S_IFDIR|0755, st_size=16384, ...}, AT_EMPTY_PATH) = 0
openat(AT_FDCWD, "/lib", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
newfstatat(3, "", {st_mode=S_IFDIR|0755, st_size=4096, ...}, AT_EMPTY_PATH) = 0
openat(AT_FDCWD, "/usr/lib", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
newfstatat(3, "", {st_mode=S_IFDIR|0755, st_size=4096, ...}, AT_EMPTY_PATH) = 0
openat(AT_FDCWD, "/lib64", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
newfstatat(3, "", {st_mode=S_IFDIR|0755, st_size=4096, ...}, AT_EMPTY_PATH) = 0
openat(AT_FDCWD, "/usr/lib64", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
newfstatat(3, "", {st_mode=S_IFDIR|0755, st_size=4096, ...}, AT_EMPTY_PATH) = 0
openat(AT_FDCWD, "/home/nobody", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
newfstatat(3, "", {st_mode=S_IFDIR|0755, st_size=4096, ...}, AT_EMPTY_PATH) = 0
*** previously VPL gave up after this ***
openat(AT_FDCWD, "/usr/local/lib/libmfx-gen.so.1.2.7", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=7599376, ...}, AT_EMPTY_PATH) = 0
openat(AT_FDCWD, "/usr/local/lib/libdrm.so.2", O_RDONLY|O_CLOEXEC) = 3
...
Between the failing and successful cases, LD_LIBRARY_PATH use is the only difference.
In the strace output, the only differences in the latter case concern the openat() + newfstat() calls for /usr/local/lib and the /opt/intel/mediasdk/ dirs.
Because the dynamic linker takes things from cache, I think it's VPL itself doing those directory opens, and the open missing for /usr/local/lib looks very suspicious.
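A comparison like the one above can be done mechanically rather than by eye. The following is a minimal sketch, assuming both strace outputs have been captured as text; the function names are hypothetical and only openat() lines in the format shown in the traces above are handled.

```python
import re

# Matches strace lines such as:
#   openat(AT_FDCWD, "/usr/local/lib", O_RDONLY|O_DIRECTORY) = 3
OPENAT_RE = re.compile(r'openat\(AT_FDCWD, "([^"]+)", [^)]*\) = (-?\d+)')


def opened_paths(strace_text):
    """Return the paths that openat() opened successfully, in trace order."""
    paths = []
    for line in strace_text.splitlines():
        match = OPENAT_RE.search(line)
        # A negative return value (e.g. -1 ENOENT) means the open failed.
        if match and int(match.group(2)) >= 0 and match.group(1) not in paths:
            paths.append(match.group(1))
    return paths


def only_in_second(first_trace, second_trace):
    """Paths opened successfully in the second trace but not in the first."""
    seen = set(opened_paths(first_trace))
    return [p for p in opened_paths(second_trace) if p not in seen]
```

Feeding it the failing trace as the first argument and the LD_LIBRARY_PATH trace as the second lists the paths that were opened only in the working case.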
I found the place where VPL dispatcher actually checks LD_LIBRARY_PATH by itself: https://github.com/oneapi-src/oneVPL/blob/master/dispatcher/vpl/mfx_dispatcher_vpl_loader.cpp#L537
And adds that to some kind of directory search list. If VPL is doing that by itself instead of simply just:
I think it's broken. That kind of "magical" behaviour is both too fragile, and annoying to debug when it fails.
(And it will fail. Mysteriously.)
The current non-standard / manual method for locating shared libraries appears to be encoded in the spec: https://spec.oneapi.io/versions/latest/elements/oneVPL/source/programming_guide/VPL_prg_session.html
Which actually does not list OneVPL installation path as one of the library search paths.
However, it states that for the legacy MSDK (mfx) backends, it would also check ld.so.cache, and in my case, that does list the relevant libraries. So I think the current implementation is against spec, although the (dispatcher) spec itself is wonky too.
Yes, oneVPL dispatcher searches for runtime libs in the directories listed in the spec - latest version is here. To add locations to the search process, they can be added either to LD_LIBRARY_PATH or ONEVPL_SEARCH_PATH.
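As a rough illustration of the LD_LIBRARY_PATH workaround, here is a minimal sketch of launching a tool with an extra directory prepended to that variable. The helper names are hypothetical; only the standard colon-separated LD_LIBRARY_PATH format is assumed, and ONEVPL_SEARCH_PATH is left out because its exact format is not described in this thread.

```python
import os
import subprocess


def env_with_library_dir(base_env, directory):
    """Return a copy of base_env with directory prepended to LD_LIBRARY_PATH."""
    env = dict(base_env)
    current = env.get("LD_LIBRARY_PATH", "")
    env["LD_LIBRARY_PATH"] = directory + (":" + current if current else "")
    return env


def run_with_library_dir(directory, command=("vpl-inspect",)):
    """Run command (vpl-inspect by default) with the adjusted environment."""
    return subprocess.run(
        list(command),
        env=env_with_library_dir(os.environ, directory),
        capture_output=True,
        text=True,
    )
```

Calling run_with_library_dir("/usr/local/lib") would correspond to the LD_LIBRARY_PATH=/usr/local/lib invocation used earlier in this issue.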
The spec should add a clarification that ld.so.cache is not used when searching for legacy (MSDK) runtimes with oneVPL dispatcher. This was noted in a previous update for oneVPL but did not clarify that this applies to legacy RT also. I'll make a request for that in the next spec update.
The behavior differs because the MSDK dispatcher loads a specific library by basename, so the search order is defined by the default behavior of dlopen. oneVPL instead searches a list of directories for all libraries starting with libvpl* and opens any candidate runtime libs by absolute path, which skips ld.so.cache. Requests for changes and improvements to the oneVPL spec can also be filed in the spec repository here.
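To make the difference concrete, here is a minimal sketch of a directory-scan search of the kind described above. It is not the dispatcher's code: the function names are hypothetical, the directory order is taken from the strace output earlier in this issue, and the default name patterns (libvpl* and libmfx-gen.so*) are an assumption based on that trace. Nothing in it consults ld.so.cache, which is why a library known to the dynamic linker can still be missed.

```python
import glob
import os

# Directory order as observed in the strace output in this issue; an
# assumption about the search list, not a copy of the dispatcher source.
FIXED_DIRS = [
    "/usr/lib/x86_64-linux-gnu",
    "/lib",
    "/usr/lib",
    "/lib64",
    "/usr/lib64",
]
LEGACY_DIRS = ["/opt/intel/mediasdk/lib", "/opt/intel/mediasdk/lib64"]


def build_search_dirs(env, cwd):
    """LD_LIBRARY_PATH entries first, then fixed dirs, cwd, legacy dirs."""
    ld_path = env.get("LD_LIBRARY_PATH", "")
    dirs = [d for d in ld_path.split(":") if d]
    return dirs + FIXED_DIRS + [cwd] + LEGACY_DIRS


def find_candidates(search_dirs, patterns=("libvpl*", "libmfx-gen.so*")):
    """Scan each directory and return full paths of files matching patterns."""
    found = []
    for directory in search_dirs:
        for pattern in patterns:
            # Missing directories simply yield no matches.
            found.extend(sorted(glob.glob(os.path.join(directory, pattern))))
    return found
```

With an empty environment, /usr/local/lib never enters the list, so a runtime installed there is not found. By contrast, loading by basename (for example dlopen("libmfxhw64.so.1", ...) in the MSDK case) leaves the lookup to the dynamic linker, which does use ld.so.cache.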
@jonrecker Thanks, I filed a spec ticket. IMHO the minimum expected fix is OneVPL build option for specifying the default driver load directory.
Closed since this has been transferred to the spec. Thanks for reporting the issue!
There have been no comments on the spec ticket despite it being open for almost 4 months: https://github.com/oneapi-src/oneAPI-spec/issues/418
It has been one year since the spec bug about obviously broken behavior being specified was filed. Still no comments or fix for it.
Setup
Ubuntu 22.04 container build
All media stack components built with /usr/local/lib as their installation destination
Dynamic linker sees the relevant libraries, and lists them before the system ones:
Use-case
sample_multi_transcode -i::h265 /media/test_yuv420p.h265 -o::h264 /dev/null
vpl-inspect
Actual outcome
Expected outcome
Because all relevant libraries are visible to the dynamic linker, and are in the same install target directory as libvpl, the backend should be found and loaded, as happens when LD_LIBRARY_PATH is pointed to that directory: