Closed damocles-dev closed 2 years ago
Upgrade gmmlib to 22.x
I updated the Gentoo package to gmmlib 22.0.3 and I still get GPU hangs:
2022-03-14T17:23:16.273076+01:00 nuc kernel: i915 0000:00:02.0: [drm] Resetting vcs1 for preemption time out
2022-03-14T17:23:16.274055+01:00 nuc kernel: i915 0000:00:02.0: [drm] mpv[146579] context reset due to GPU hang
2022-03-14T17:23:16.276381+01:00 nuc kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 9:4:c87dff99, in mpv [146579]
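For anyone collecting these reports from a longer kernel log, here is a minimal Python sketch that extracts the error code, process name and PID from `GPU HANG` lines. The regular expression is only inferred from the messages quoted in this thread; it is not an official i915 log format definition, and `find_gpu_hangs` is a hypothetical helper name.

```python
import re

# Pattern inferred from the "GPU HANG" lines quoted in this issue (an assumption,
# not a documented i915 format).
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>[0-9a-f]+:[0-9a-f]+:[0-9a-f]+), "
    r"in (?P<comm>\S+) \[(?P<pid>\d+)\]"
)


def find_gpu_hangs(log_text):
    """Return (ecode, process name, pid) for every GPU HANG line in log_text."""
    return [
        (m.group("ecode"), m.group("comm"), int(m.group("pid")))
        for m in HANG_RE.finditer(log_text)
    ]


sample = (
    "2022-03-14T17:23:16.276381+01:00 nuc kernel: i915 0000:00:02.0: "
    "[drm] GPU HANG: ecode 9:4:c87dff99, in mpv [146579]\n"
)
print(find_gpu_hangs(sample))
```

The same function works on `dmesg` output, since it only looks at the part of the line after the timestamp.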
New debug trace:
Random hangs happen only with HEVC decoding.
If I play the same video several times, sometimes it works and sometimes it doesn't:
[ffmpeg/video] hevc: nal_unit_type: 0(TRAIL_N), nuh_layer_id: 0, temporal_id: 0
[vo/gpu] pass 'color conversion + output to screen': last 0us avg 0us peak 0us
[vo/gpu] pass 'drawing osd': last 0us avg 0us peak 0us
[osc] rendering
[ffmpeg/video] hevc: Output frame with POC 2.
[ffmpeg/video] hevc: Param buffer (type 0, 604 bytes) is 0.
[ffmpeg/video] hevc: Slice 0 param buffer (264 bytes) is 0x1.
[ffmpeg/video] hevc: Slice 0 data buffer (110 bytes) is 0x2.
[ffmpeg/video] hevc: Decode to surface 0x11.
[ffmpeg/video] hevc: Failed to end picture decode issue: 23 (internal decoding error).
[ffmpeg/video] hevc: hardware accelerator failed to decode picture
Error while decoding frame (hardware decoding)!
[ffmpeg/video] hevc: nal_unit_type: 0(TRAIL_N), nuh_layer_id: 0, temporal_id: 0
[ffmpeg/video] hevc: Output frame with POC 3.
[ffmpeg/video] hevc: Param buffer (type 0, 604 bytes) is 0x2.
[ffmpeg/video] hevc: Slice 0 param buffer (264 bytes) is 0x1.
[ffmpeg/video] hevc: Slice 0 data buffer (110 bytes) is 0.
[ffmpeg/video] hevc: Decode to surface 0x10.
[ffmpeg/video] hevc: Failed to end picture decode issue: 23 (internal decoding error).
[ffmpeg/video] hevc: hardware accelerator failed to decode picture
Error while decoding frame (hardware decoding)!
[ffmpeg/video] hevc: nal_unit_type: 0(TRAIL_N), nuh_layer_id: 0, temporal_id: 0
[ffmpeg/video] hevc: Output frame with POC 4.
[ffmpeg/video] hevc: Param buffer (type 0, 604 bytes) is 0.
[ffmpeg/video] hevc: Slice 0 param buffer (264 bytes) is 0x1.
[ffmpeg/video] hevc: Slice 0 data buffer (110 bytes) is 0x2.
[ffmpeg/video] hevc: Decode to surface 0x11.
[ffmpeg/video] hevc: Failed to end picture decode issue: 23 (internal decoding error).
[ffmpeg/video] hevc: hardware accelerator failed to decode picture
Error while decoding frame (hardware decoding)!
[vo/gpu] Updating global uniform 'texture0'
[vo/gpu] Updating global uniform 'texture1'
[vo/gpu] pass 'color conversion + output to screen': last 0us avg 0us peak 0us
[vo/gpu] pass 'drawing osd': last 0us avg 0us peak 0us
Falling back to software decoding.
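Because the hang is intermittent, it can help to save the verbose mpv log of each run and summarize it automatically. The sketch below counts failed hardware decodes and checks for the software fallback; it matches only the message strings that appear in the trace above, and `summarize_hwdec_log` is a hypothetical helper, not part of mpv or FFmpeg.

```python
def summarize_hwdec_log(log_text):
    """Count failed hardware decodes and detect the software fallback."""
    failed = 0
    fell_back = False
    for line in log_text.splitlines():
        # Message emitted by the hevc decoder in the trace above.
        if "hardware accelerator failed to decode picture" in line:
            failed += 1
        # Message printed when playback switches to software decoding.
        elif line.strip() == "Falling back to software decoding.":
            fell_back = True
    return {"failed_pictures": failed, "software_fallback": fell_back}


sample = """\
[ffmpeg/video] hevc: Failed to end picture decode issue: 23 (internal decoding error).
[ffmpeg/video] hevc: hardware accelerator failed to decode picture
Error while decoding frame (hardware decoding)!
Falling back to software decoding.
"""
print(summarize_hwdec_log(sample))
```

Running this over the logs of several playbacks of the same file gives a quick count of how often the failure reproduces.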
vainfo output:
libva info: VA-API version 1.13.0
libva info: User environment variable requested driver 'iHD'
libva info: Trying to open /usr/lib64/va/drivers/iHD_drv_video.so
libva info: Found init function __vaDriverInit_1_13
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.13 (libva 2.13.0)
vainfo: Driver version: Intel iHD driver for Intel(R) Gen Graphics - 22.1.0 ()
vainfo: Supported profile and entrypoints
VAProfileNone : VAEntrypointVideoProc
VAProfileNone : VAEntrypointStats
VAProfileMPEG2Simple : VAEntrypointVLD
VAProfileMPEG2Simple : VAEntrypointEncSlice
VAProfileMPEG2Main : VAEntrypointVLD
VAProfileMPEG2Main : VAEntrypointEncSlice
VAProfileH264Main : VAEntrypointVLD
VAProfileH264Main : VAEntrypointEncSlice
VAProfileH264Main : VAEntrypointFEI
VAProfileH264Main : VAEntrypointEncSliceLP
VAProfileH264High : VAEntrypointVLD
VAProfileH264High : VAEntrypointEncSlice
VAProfileH264High : VAEntrypointFEI
VAProfileH264High : VAEntrypointEncSliceLP
VAProfileVC1Simple : VAEntrypointVLD
VAProfileVC1Main : VAEntrypointVLD
VAProfileVC1Advanced : VAEntrypointVLD
VAProfileJPEGBaseline : VAEntrypointVLD
VAProfileJPEGBaseline : VAEntrypointEncPicture
VAProfileH264ConstrainedBaseline: VAEntrypointVLD
VAProfileH264ConstrainedBaseline: VAEntrypointEncSlice
VAProfileH264ConstrainedBaseline: VAEntrypointFEI
VAProfileH264ConstrainedBaseline: VAEntrypointEncSliceLP
VAProfileVP8Version0_3 : VAEntrypointVLD
VAProfileVP8Version0_3 : VAEntrypointEncSlice
VAProfileHEVCMain : VAEntrypointVLD
VAProfileHEVCMain : VAEntrypointEncSlice
VAProfileHEVCMain : VAEntrypointFEI
VAProfileHEVCMain10 : VAEntrypointVLD
VAProfileHEVCMain10 : VAEntrypointEncSlice
VAProfileVP9Profile0 : VAEntrypointVLD
VAProfileVP9Profile2 : VAEntrypointVLD
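To confirm that the driver still advertises HEVC decode on a given build, the profile list can be checked programmatically. This is a small sketch that parses the `VAProfile... : VAEntrypoint...` lines as printed above; the function names are hypothetical, and it relies on `VAEntrypointVLD` being the VA-API decode entrypoint.

```python
def parse_vainfo_entrypoints(vainfo_text):
    """Map each VAProfile name to the set of entrypoints listed for it."""
    table = {}
    for line in vainfo_text.splitlines():
        line = line.strip()
        # Skip the "libva info:" and "vainfo:" header lines.
        if not line.startswith("VAProfile") or ":" not in line:
            continue
        profile, entrypoint = (part.strip() for part in line.split(":", 1))
        table.setdefault(profile, set()).add(entrypoint)
    return table


def can_decode(table, profile):
    """True if the profile lists VAEntrypointVLD (the decode entrypoint)."""
    return "VAEntrypointVLD" in table.get(profile, set())


sample = """\
vainfo: Supported profile and entrypoints
VAProfileHEVCMain : VAEntrypointVLD
VAProfileHEVCMain : VAEntrypointEncSlice
VAProfileHEVCMain10 : VAEntrypointVLD
VAProfileH264ConstrainedBaseline: VAEntrypointVLD
"""
table = parse_vainfo_entrypoints(sample)
print(can_decode(table, "VAProfileHEVCMain10"))
```

Note that a listed entrypoint only shows that the driver claims support; as this issue demonstrates, decoding can still hang at runtime.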
I tried media-driver 22.4.0 and it seems that the hang does not occur.
$ vainfo
libva info: VA-API version 1.14.0
libva info: User environment variable requested driver 'iHD'
libva info: Trying to open /usr/lib64/va/drivers/iHD_drv_video.so
libva info: Found init function __vaDriverInit_1_14
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.14 (libva 2.14.0)
vainfo: Driver version: Intel iHD driver for Intel(R) Gen Graphics - 22.4.0 (4171a3e4f)
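When comparing machines, it is easy to mix up which driver build is actually loaded. The sketch below pulls the version triple out of the `Driver version:` line so it can be compared numerically. The regular expression is inferred from the two `vainfo` outputs in this thread, and the 22.4.0 threshold reflects only what is reported here (hangs on 22.1.0, none observed on a 22.4.0 git build); it is not a statement about which release contains the fix.

```python
import re


def ihd_driver_version(vainfo_text):
    """Return the iHD driver version as a tuple of ints, or None if not found."""
    m = re.search(r"Driver version: .* - (\d+)\.(\d+)\.(\d+)", vainfo_text)
    return tuple(int(x) for x in m.groups()) if m else None


old = "vainfo: Driver version: Intel iHD driver for Intel(R) Gen Graphics - 22.1.0 ()"
new = "vainfo: Driver version: Intel iHD driver for Intel(R) Gen Graphics - 22.4.0 (4171a3e4f)"

# Assumption taken from this thread only: 22.4.0 is the first build seen working.
FIRST_BUILD_SEEN_WORKING = (22, 4, 0)

for text in (old, new):
    version = ihd_driver_version(text)
    print(version, version >= FIRST_BUILD_SEEN_WORKING)
```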
22.4.0? I only get 22.3.0
@Jexu, it is CFL (PCI ID 0x3EA5); please help check it.
The Gentoo package manager can download the git HEAD code and compile it.
It looks like that build reports 22.4.0:
-- media -- PLATFORM = linux
-- media -- ARCH = 64
-- media -- CMAKE_CURRENT_LIST_DIR = /var/tmp/portage/x11-libs/libva-intel-media-driver-9999/work/libva-intel-media-driver-9999/media_driver
-- media -- INCLUDED_LIBS =
-- media -- LIB_NAME = iHD_drv_video
-- media -- OUTPUT_NAME =
-- media -- BUILD_TYPE/UFO_BUILD_TYPE/CMAKE_BUILD_TYPE = release/release/Release
-- media -- LIBVA_INSTALL_PATH =
-- media -- MEDIA_VERSION = 22.4.0
-- media -- X11 Found
# emerge -v libva-intel-media-driver
These are the packages that would be merged, in order:
Calculating dependencies... done!
[ebuild R *] x11-libs/libva-intel-media-driver-9999::gentoo USE="X redistributable -test" 0 KiB
Total: 1 package (1 reinstall), Size of downloads: 0 KiB
>>> Verifying ebuild manifests
>>> Emerging (1 of 1) x11-libs/libva-intel-media-driver-9999::gentoo
>>> Unpacking source...
* Repository id: intel_media-driver.git
* To override fetched repository properties, use:
* EGIT_OVERRIDE_REPO_INTEL_MEDIA_DRIVER
* EGIT_OVERRIDE_BRANCH_INTEL_MEDIA_DRIVER
* EGIT_OVERRIDE_COMMIT_INTEL_MEDIA_DRIVER
* EGIT_OVERRIDE_COMMIT_DATE_INTEL_MEDIA_DRIVER
*
* Fetching https://github.com/intel/media-driver ...
git fetch https://github.com/intel/media-driver +HEAD:refs/git-r3/HEAD
git symbolic-ref refs/git-r3/x11-libs/libva-intel-media-driver/0/__main__ refs/git-r3/HEAD
* Checking out https://github.com/intel/media-driver to /var/tmp/portage/x11-libs/libva-intel-media-driver-9999/work/libva-intel-media-driver-9999 ...
git checkout --quiet refs/git-r3/HEAD
GIT update -->
repository: https://github.com/intel/media-driver
at the commit: 4171a3e4fc8399fc3a372ae846ef6e9e25b2ac8d
>>> Source unpacked in /var/tmp/portage/x11-libs/libva-intel-media-driver-9999/work
>>> Preparing source in /var/tmp/portage/x11-libs/libva-intel-media-driver-9999/work/libva-intel-media-driver-9999 ...
* Source directory (CMAKE_USE_DIR): "/var/tmp/portage/x11-libs/libva-intel-media-driver-9999/work/libva-intel-media-driver-9999"
* Build directory (BUILD_DIR): "/var/tmp/portage/x11-libs/libva-intel-media-driver-9999/work/libva-intel-media-driver-9999_build"
* Applying libva-intel-media-driver-20.2.0_x11_optional.patch ... [ ok ]
* Applying libva-intel-media-driver-21.4.2-Remove-unwanted-CFLAGS.patch ...
patching file media_driver/cmake/linux/media_compile_flags_linux.cmake
Hunk #1 succeeded at 51 with fuzz 1 (offset -1 lines).
Hunk #2 succeeded at 193 (offset 8 lines).
[ ok ]
* Applying libva-intel-media-driver-20.4.5_testing_in_src_test.patch ... [ ok ]
* Hardcoded definition(s) removed in media_driver/linux/ult/CMakeLists.txt:
* set(CMAKE_BUILD_TYPE "Debug")
* set(CMAKE_BUILD_TYPE "ReleaseInternal")
* set(CMAKE_BUILD_TYPE "Release")
* Hardcoded definition(s) removed in media_driver/CMakeLists.txt:
* set(CMAKE_BUILD_TYPE "Release")
* set(CMAKE_BUILD_TYPE "ReleaseInternal")
* set(CMAKE_BUILD_TYPE "Debug")
* Hardcoded definition(s) removed in media_softlet/CMakeLists.txt:
* set(CMAKE_BUILD_TYPE "Release")
* set(CMAKE_BUILD_TYPE "ReleaseInternal")
* set(CMAKE_BUILD_TYPE "Debug")
* Hardcoded definition(s) removed in CMakeLists.txt:
* set(CMAKE_INSTALL_PREFIX "/usr/local" CACHE PATH "..." FORCE)
>>> Source prepared.
>>> Configuring source in /var/tmp/portage/x11-libs/libva-intel-media-driver-9999/work/libva-intel-media-driver-9999 ...
* Source directory (CMAKE_USE_DIR): "/var/tmp/portage/x11-libs/libva-intel-media-driver-9999/work/libva-intel-media-driver-9999"
* Build directory (BUILD_DIR): "/var/tmp/portage/x11-libs/libva-intel-media-driver-9999/work/libva-intel-media-driver-9999_build"
cmake -C /var/tmp/portage/x11-libs/libva-intel-media-driver-9999/work/libva-intel-media-driver-9999_build/gentoo_common_config.cmake -G Ninja -DCMAKE_INSTALL_PREFIX=/usr -DMEDIA_BUILD_FATAL_WARNINGS=OFF -DMEDIA_RUN_TEST_SUITE=no -DBUILD_TYPE=Release -DPLATFORM=linux -DUSE_X11=yes -DENABLE_NONFREE_KERNELS=yes -DLATEST_CPP_NEEDED=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_TOOLCHAIN_FILE=/var/tmp/portage/x11-libs/libva-intel-media-driver-9999/work/libva-intel-media-driver-9999_build/gentoo_toolchain.cmake /var/tmp/portage/x11-libs/libva-intel-media-driver-9999/work/libva-intel-media-driver-9999
loading initial cache file /var/tmp/portage/x11-libs/libva-intel-media-driver-9999/work/libva-intel-media-driver-9999_build/gentoo_common_config.cmake
-- The C compiler identification is GNU 11.2.1
-- The CXX compiler identification is GNU 11.2.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/x86_64-pc-linux-gnu-gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/x86_64-pc-linux-gnu-g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- CMAKE_INSTALL_PREFIX = /usr
CMake Deprecation Warning at cmrtlib/CMakeLists.txt:22 (cmake_minimum_required):
Compatibility with CMake < 2.8.12 will be removed from a future version of CMake.
Update the VERSION argument value or use a ... suffix to tell CMake that the project does not need compatibility with older versions.
CMake Deprecation Warning at cmrtlib/linux/CMakeLists.txt:21 (cmake_minimum_required):
Compatibility with CMake < 2.8.12 will be removed from a future version of CMake.
Update the VERSION argument value or use a ... suffix to tell CMake that the project does not need compatibility with older versions.
-- Found PkgConfig: /usr/bin/x86_64-pc-linux-gnu-pkg-config (found version "1.8.0")
-- Checking for module 'libva>=1.8.0'
-- Found libva, version 1.14.0
-- Found X11: /usr/include
-- Looking for XOpenDisplay in /usr/lib64/libX11.so;/usr/lib64/libXext.so
-- Looking for XOpenDisplay in /usr/lib64/libX11.so;/usr/lib64/libXext.so - found
-- Looking for gethostbyname
-- Looking for gethostbyname - found
-- Looking for connect
-- Looking for connect - found
-- Looking for remove
-- Looking for remove - found
-- Looking for shmat
-- Looking for shmat - found
-- Looking for IceConnectionNumber in ICE
-- Looking for IceConnectionNumber in ICE - found
-- Checking for module 'igdgmm>=12.0.0'
-- Found igdgmm, version 12.0.0
-- media -- PLATFORM = linux
-- media -- ARCH = 64
-- media -- CMAKE_CURRENT_LIST_DIR = /var/tmp/portage/x11-libs/libva-intel-media-driver-9999/work/libva-intel-media-driver-9999/media_driver
-- media -- INCLUDED_LIBS =
-- media -- LIB_NAME = iHD_drv_video
-- media -- OUTPUT_NAME =
-- media -- BUILD_TYPE/UFO_BUILD_TYPE/CMAKE_BUILD_TYPE = release/release/Release
-- media -- LIBVA_INSTALL_PATH =
-- media -- MEDIA_VERSION = 22.4.0
-- media -- X11 Found
-- Checking for module 'libva-x11'
-- Found libva-x11, version 1.14.0
-- LIBVA_DRIVERS_PATH = /usr/lib64/va/drivers
-- <<< Gentoo configuration >>>
Build type Release
Install path /usr
Compiler flags:
C -O2 -pipe -march=native
C++ -O2 -pipe -march=native
Linker flags:
Executable -Wl,-O1 -Wl,--as-needed
Module -Wl,-O1 -Wl,--as-needed
Shared -Wl,-O1 -Wl,--as-needed
-- Configuring done
-- Generating done
-- Build files have been written to: /var/tmp/portage/x11-libs/libva-intel-media-driver-9999/work/libva-intel-media-driver-9999_build
>>> Source configured.
>>> Compiling source in /var/tmp/portage/x11-libs/libva-intel-media-driver-9999/work/libva-intel-media-driver-9999 ...
* Source directory (CMAKE_USE_DIR): "/var/tmp/portage/x11-libs/libva-intel-media-driver-9999/work/libva-intel-media-driver-9999"
* Build directory (BUILD_DIR): "/var/tmp/portage/x11-libs/libva-intel-media-driver-9999/work/libva-intel-media-driver-9999_build"
ninja -v -j9 -l0 [...]
I can test a specific commit/tag if needed.
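For reference, the emerge log above lists `EGIT_OVERRIDE_COMMIT_INTEL_MEDIA_DRIVER` as the variable for pinning the live ebuild to a commit. A rough Python sketch of how such a test build could be scripted follows. It only builds the command and environment and does not run anything; the helper name `emerge_at_commit` is hypothetical, and the exact package atom and the use of `--oneshot` are assumptions about a typical setup.

```python
import os


def emerge_at_commit(commit, base_env=None):
    """Build the argv and environment for emerging the live ebuild at one commit."""
    env = dict(os.environ if base_env is None else base_env)
    # Variable name as printed in the emerge log above.
    env["EGIT_OVERRIDE_COMMIT_INTEL_MEDIA_DRIVER"] = commit
    # Assumed invocation: rebuild only the live (9999) ebuild, without adding
    # it to the world set.
    argv = ["emerge", "--oneshot", "=x11-libs/libva-intel-media-driver-9999"]
    return argv, env


argv, env = emerge_at_commit("4171a3e4fc8399fc3a372ae846ef6e9e25b2ac8d", base_env={})
print(" ".join(argv))
print(env)
```

Passing the returned `argv` and `env` to `subprocess.run` as root would start the build; that step is left out here on purpose.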
It is the latest commit (4171a3e4fc8399fc3a372ae846ef6e9e25b2ac8d). Please try it again. Thanks a lot.
These are the results:
Now I can play videos again, thank you very much for fixing it.
Good to know. @damocles-dev, please close this issue, and feel free to re-open it if you run into other problems.
HEVC decoding is working now.
System information
Issue behavior
[64304.295273] i915 0000:00:02.0: [drm] Resetting vcs1 for preemption time out
[64304.295294] i915 0000:00:02.0: [drm] mpv[89574] context reset due to GPU hang
[64304.298461] i915 0000:00:02.0: [drm] GPU HANG: ecode 9:4:c87dff99, in mpv [89574]
[64321.318934] i915 0000:00:02.0: [drm] Resetting vcs1 for preemption time out
[64321.318971] i915 0000:00:02.0: [drm] mpv[89632] context reset due to GPU hang
[64321.322055] i915 0000:00:02.0: [drm] GPU HANG: ecode 9:4:c87dff99, in mpv [89632]
[64361.254130] i915 0000:00:02.0: [drm] Resetting vcs1 for preemption time out
[64361.254166] i915 0000:00:02.0: [drm] mpv[89737] context reset due to GPU hang
[64361.257372] i915 0000:00:02.0: [drm] GPU HANG: ecode 9:4:c87dff99, in mpv [89737]
Describe the current behavior
Video playback stops randomly, and the process has to be killed before it can be run again.
libva-intel-media-driver version 21.x works fine, but version 22.x produces random GPU hangs.
Describe the expected behavior
Video playback without GPU hangs
Full debug trace:
/sys/class/drm/card0/error
```
GPU HANG: ecode 9:4:c87dff99, in mpv [56786]
Kernel: 5.16.8-gentoo-x86_64 x86_64
Driver: 20201103
Time: 1647214334 s 255816 us
Boottime: 58296 s 540259 us
Uptime: 58292 s 563901 us
Capture: 4352963712 jiffies; 26834625 ms ago
Active process (on ring vcs1): mpv [56786]
Reset count: 0
Suspend count: 0
Platform: COFFEELAKE
Subplatform: 0x1
PCI ID: 0x3ea5
PCI Revision: 0x01
PCI Subsystem: 8086:2074
IOMMU enabled?: 1
DMC loaded: yes
DMC fw version: 1.4
RPM wakelock: yes
PM suspended: no
GT awake: yes
EIR: 0x00000000
IER: 0x08080000
GTIER[0]: 0x09090909
GTIER[1]: 0x09090909
GTIER[2]: 0x00000000
GTIER[3]: 0x00000909
PGTBL_ER: 0x00000000
FORCEWAKE: 0x00010001
DERRMR: 0x2077efef
fence[0] = 00000000
fence[1] = 00000000
fence[2] = 00000000
fence[3] = 286803b02080001
fence[4] = 00000000
fence[5] = 9c3007008c4003
fence[6] = 386803b03080001
fence[7] = 00000000
fence[8] = 00000000
fence[9] = 306803b02880001
fence[10] = 11e803b00a00001
fence[11] = 00000000
fence[12] = 00000000
fence[13] = 00000000
fence[14] = 00000000
fence[15] = 00000000
fence[16] = 00000000
fence[17] = 00000000
fence[18] = 00000000
fence[19] = 00000000
fence[20] = 00000000
fence[21] = 00000000
fence[22] = 00000000
fence[23] = 00000000
fence[24] = 00000000
fence[25] = 00000000
fence[26] = 00000000
fence[27] = 00000000
fence[28] = 00000000
fence[29] = 00000000
fence[30] = 00000000
fence[31] = 00000000
ERROR: 0x00000000
DONE_REG: 0xffffffff
FAULT_TLB_DATA: 0x00000019 0xfcfff1fd
GTT_CACHE_EN: 0xf0007fff
vcs1 command stream:
CCID: 0x00000000
START: 0x00096000
HEAD: 0x00000038 [0x00000000]
TAIL: 0x000000f0 [0x00000040, 0x00000070]
CTL: 0x00003001
MODE: 0x00000000
HWS: 0xfffe0000
ACTHD: 0x00000000 000c114c
IPEIR: 0x00000000
IPEHR: 0x73820066
ESR: 0x00000000
INSTDONE: 0xbbffffff
batch: [0x00000000_000c1000, 0x00000000_000c9000]
BBADDR: 0x00000000_000c114d
BB_STATE: 0x00000020
INSTPS: 0x00009020
INSTPM: 0x00000000
FADDR: 0x00000000 000c1300
RC PSMI: 0x00000010
FAULT_REG: 0x00000000
GFX_MODE: 0x00008000
PDP0: 0x0000000109694000
PDP1: 0x0000000000000000
PDP2: 0x0000000000000000
PDP3: 0x0000000000000000
hung: 1
engine reset count: 0
ELSP[0]: pid 56786, seqno 15c9:00000004, prio 0, head 00000078, tail 000000f0
ELSP[1]: pid 0, seqno 7:000224bd, prio 0, head 000004c8, tail 00000510
Active context: mpv[56786] prio 0, guilty 1 active 0, runtime total 0ns, avg 0ns
vcs1 --- HW Status = 0x00000000 fffe0000
:cL%-H+92kU^qg*]!PdtP(7RUrf5j=`RY^=FXg]^$*5d37!!W,qDKl-]0PWlU!!!#,
vcs1 --- batch = 0x00000000 000c1000
:dd