Open DaveHu-TVU opened 1 year ago
Kernel version 5.15 is too old for 12th Gen. Install the latest linux-firmware, update the kernel to 6.1, and try again.
https://github.com/intel/media-driver#known-issues-and-limitations
I updated the kernel to 6.1.0-1015-oem and linux-firmware to 20220329 and got the same result.
I also used another CPU (12th Gen Intel(R) Core(TM) i9-12900H) on Ubuntu 22.04 with kernel 6.1.0-1015-oem and the latest linux-firmware.
cmd: /opt/intel/media/share/vpl/samples/_bin/sample_decode h265 -i v3_1080i5994.h265 -o /dev/null -timeout 10000
Decoding only the H265 file encoded by the Amba H2 platform (v3_1080i5994.h265), with no encode running at the same time, fails with MFX_ERR_DEVICE_FAILED(-17). Please see the log mfxlib_Pid2426_Tid140450659272512.log
Decoding started
Frame number: 2560, fps: 126.187, fread_fps: 0.000, fwrite_fps: 0.0000
[ERROR], sts=MFX_ERR_DEVICE_FAILED(-17), RunDecoding, DecodeFrameAsync returned error status at /opt/src/vpl-dispatcher_src/tools/legacy/sample_decode/src/pipeline_decode.cpp:1980
Frame number: 2561, fps: 126.234, fread_fps: 0.000, fwrite_fps: 0.000
[ERROR], sts=MFX_ERR_DEVICE_FAILED(-17), RunDecoding, Unexpected error!! at /opt/src/vpl-dispatcher_src/tools/legacy/sample_decode/src/pipeline_decode.cpp:2100
...
Also, I can decode H265 files normally when they were encoded with Intel MSDK (v2_500k_1080i5994.h265). Please compare these two files to see what differs for decoding.
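When comparing runs, it can help to pull the status codes out of the sample tool's console output rather than reading it by eye. The following is a minimal sketch, assuming the `[ERROR], sts=NAME(code), function, message at file:line` layout seen in the lines quoted in this thread; the pattern and the helper names `parse_errors` and `count_by_status` are illustrative and not part of any Intel tool.

```python
import re
from collections import Counter

# Pattern inferred from the [ERROR] lines quoted in this thread (assumption);
# other versions of the sample tools may format their messages differently.
ERROR_RE = re.compile(
    r"\[ERROR\], sts=(?P<status>MFX_[A-Z0-9_]+)\((?P<code>-?\d+)\), "
    r"(?P<func>\w+), (?P<msg>.*?) at (?P<file>\S+):(?P<line>\d+)"
)


def parse_errors(text):
    """Return one dict per [ERROR] entry found in the console output."""
    return [m.groupdict() for m in ERROR_RE.finditer(text)]


def count_by_status(text):
    """Count how often each MFX status name appears in [ERROR] entries."""
    return Counter(entry["status"] for entry in parse_errors(text))
```

Feeding the captured output of a decode run and of a concurrent encode run through `count_by_status` gives a quick way to tell whether a given run hit MFX_ERR_DEVICE_FAILED(-17), MFX_ERR_GPU_HANG(-21), or both.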
No issue with ffmpeg qsv decoder (built with onevpl). I think it should be a sample_decode issue.
ffmpeg -hwaccel qsv -hwaccel_output_format qsv -c:v hevc_qsv -i v3_1080i5994.h265 -f null -
ffmpeg version 6.0-Jellyfin Copyright (c) 2000-2023 the FFmpeg developers
built with gcc 13.1.1 (GCC) 20230429
configuration: --prefix=/usr/lib/jellyfin-ffmpeg --target-os=linux --extra-version=Jellyfin --disable-doc --disable-ffplay --disable-ptx-compression --disable-shared --disable-libxcb --disable-sdl2 --disable-xlib --enable-gpl --enable-version3 --enable-static --enable-gmp --enable-gnutls --enable-chromaprint --enable-libfontconfig --enable-libass --enable-libbluray --enable-libdrm --enable-libfreetype --enable-libfribidi --enable-libmp3lame --enable-libopus --enable-libopenmpt --enable-libtheora --enable-libvorbis --enable-libdav1d --enable-libwebp --enable-libvpx --enable-libx264 --enable-libx265 --enable-libzvbi --enable-libzimg --enable-libshaderc --enable-libplacebo --enable-vulkan --enable-opencl --enable-vaapi --enable-amf --enable-libvpl --enable-ffnvcodec --enable-cuda --enable-cuda-llvm --enable-cuvid --enable-nvdec --enable-nvenc
libavutil 58. 2.100 / 58. 2.100
libavcodec 60. 3.100 / 60. 3.100
libavformat 60. 3.100 / 60. 3.100
libavdevice 60. 1.100 / 60. 1.100
libavfilter 9. 3.100 / 9. 3.100
libswscale 7. 1.100 / 7. 1.100
libswresample 4. 10.100 / 4. 10.100
libpostproc 57. 1.100 / 57. 1.100
[hevc @ 0x557611b8ec80] PPS id out of range: 0
Last message repeated 1 times
[hevc @ 0x557611b8ec80] Error parsing NAL unit #3.
[hevc @ 0x557611b8da00] Stream #0: not enough frames to estimate rate; consider increasing probesize
Input #0, hevc, from 'v3_1080i5994.h265':
Duration: N/A, bitrate: N/A
Stream #0:0: Video: hevc (Main), yuv420p(tv, progressive), 1920x540 [SAR 1:1 DAR 32:9], 59.94 fps, 59.94 tbr, 1200k tbn
libva info: VA-API version 1.19.0
libva info: Trying to open /usr/lib/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_1_19
libva info: va_openDriver() returns 0
libva info: VA-API version 1.19.0
libva info: Trying to open /usr/lib/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_1_19
libva info: va_openDriver() returns 0
Stream mapping:
Stream #0:0 -> #0:0 (hevc (hevc_qsv) -> wrapped_avframe (native))
Press [q] to stop, [?] for help
[hevc_qsv @ 0x55761217a900] More data is required to decode header
Output #0, null, to 'pipe:':
Metadata:
encoder : Lavf60.3.100
Stream #0:0: Video: wrapped_avframe, qsv(tv, top coded first (swapped)), 1920x540 [SAR 1:1 DAR 32:9], q=2-31, 200 kb/s, 59.94 fps, 59.94 tbn
Metadata:
encoder : Lavc60.3.100 wrapped_avframe
frame= 1199 fps=0.0 q=-0.0 Lsize=N/A time=00:00:19.98 bitrate=N/A speed=40.8x 0x
video:562kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
Using ffmpeg's hevc_metadata bitstream filter can fix your v3 clip.
ffmpeg -i v3_1080i5994.h265 -bsf:v hevc_metadata -c:v copy -y v3_fixed.h265
/usr/bin/vpl-sample_decode h265 -i v3_fixed.h265 -o /dev/null -timeout 10000
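The two commands above can be chained so that a clip is rewritten and then decoded in one step. This is only a sketch: the argument lists are copied from the commands in this comment, while the helper names and the default decoder path are assumptions for illustration and depend on the local install.

```python
import subprocess


def build_fix_command(src, dst):
    """ffmpeg stream copy through the hevc_metadata bitstream filter, as in the thread."""
    return ["ffmpeg", "-i", src, "-bsf:v", "hevc_metadata", "-c:v", "copy", "-y", dst]


def build_decode_command(clip, decoder="/usr/bin/vpl-sample_decode", timeout="10000"):
    """sample_decode invocation from the thread; the decoder path is install-specific."""
    return [decoder, "h265", "-i", clip, "-o", "/dev/null", "-timeout", timeout]


def fix_and_decode(src, dst):
    """Rewrite the clip, then decode the rewritten file; returns the decoder's exit code."""
    subprocess.run(build_fix_command(src, dst), check=True)  # stop if ffmpeg fails
    return subprocess.run(build_decode_command(dst)).returncode
```

For example, `fix_and_decode("v3_1080i5994.h265", "v3_fixed.h265")` would run both steps, assuming ffmpeg and the sample decoder are installed at those locations.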
Hi @nyanmisaka, thanks for your help. I did some tests and have more information about this issue. With vpl version 2022q2 the issue does not occur; with vpl version 2022Q3 it does. I then replaced intel-driver with 22.4.4 while using vpl 2023Q1 and found that it works too. So I think this is an issue in intel-driver versions after 22.4.4. Can you provide a patch to fix it? Thanks.
Hi @nyanmisaka, I saw that your libva log shows "libva info: Found init function __vaDriverInit_1_19". What version of vpl are you using? I think different libva and media-driver versions may give different results. Thanks
Libva version is not related to this issue. I'm testing the latest tag intel-onevpl-23.3.0. Also I can't test media-driver 22.4.4 since it's too old to support my Arc discrete GPU.
I'm not from intel and probably can't help you fix this. Since the regression seems to be caused by media-driver, you can open a ticket over there.
Hi Dave. What's the test scenario of your above effort? Decode v3_1080i5994.h265 + Encode cnn.yuv? Or just decode test on your i9-12900H platform? The error returned is MFX_ERR_GPU_HANG(-21) or MFX_ERR_DEVICE_FAILED(-17)?
Q1: Decode v3_1080i5994.h265 + Encode cnn.yuv
Q2: Both 12900H and 12700
Q3: MFX_ERR_GPU_HANG(-21)
Hi @chenhao5-Intel, I have also reproduced the decoding failure MFX_ERR_DEVICE_FAILED(-17) using 2022q2, but it seems to be harder to reproduce there, and I haven't found a stable way to reproduce it yet. I'm working on it. Let's focus on the 2023Q1 issue first. Thanks
You mean you can reproduce the encode hang issue: "[ERROR], sts=MFX_ERR_GPU_HANG(-21), SynchronizeFirstTask, SyncOperation fail or timeout at /opt/src/vpl-dispatcher_src/tools/legacy/Sample_encode/src/pipeline_encode.cpp:178" on both 12900H and 12700?
Hi @DaveHu-TVU @nyanmisaka I have successfully reproduced this issue on both i7-12700 and i9-12900H + Ubuntu 22.04 env.
There are two issue scenarios: (On both i7-12700 and i9-12900H)
Driver log shows no related errors reported and VPL log shows cm_mem_copy.cpp[Line: 3115]CopyVideoToSys: returns MFX_ERR_GPU_HANG. Analysis WIP.
For encode:
Processing started
Frame number: 1600
[ERROR], sts=MFX_ERR_GPU_HANG(-21), SynchronizeFirstTask, SyncOperation fail or timeout at /opt/src/sources/oneVPL-disp/tools/legacy/sample_encode/src/pipeline_encode.cpp:178
[ERROR], sts=MFX_ERR_GPU_HANG(-21), GetFreeTask, m_TaskPool.SynchronizeFirstTask failed at /opt/src/sources/oneVPL-disp/tools/legacy/sample_encode/src/pipeline_encode.cpp:2239
[ERROR], sts=MFX_ERR_GPU_HANG(-21), Run, m_pmfxENC->EncodeFrameAsync failed at /opt/src/sources/oneVPL-disp/tools/legacy/sample_encode/src/pipeline_encode.cpp:2487
[ERROR], sts=MFX_ERR_GPU_HANG(-21), main, pPipeline->Run failed at /opt/src/sources/oneVPL-disp/tools/legacy/sample_encode/src/sampleencode.cpp:1970
Frame number: 1680
Encoding fps: 324
Analyzed log and found that LibVA will report: [LIBVA]:CRITICAL - StatusReport:261: Something unexpected happened in HW, return error to application
As for MFX_ERR_DEVICE_FAILED(-17), it may be a duplicate of the GPU_HANG issue. So as a next step, let us focus on the v3_1080i5994.h265 decoding scenario first, as it may affect the two other issues.
If you have any question, please let me know. Thanks.
BRs, Hao
Hi @chenhao5-Intel, we are using VPL 2023Q1, so the version I compiled is oneVPL GPU Runtime 2023Q1 Release - 23.1.5 (libmfx-gen.1.2.8). I have reproduced the issue on the 12900H with different video formats (1080p5994, 1080i5994, 720p5994) and put the console log in the attachment. I've also cut the same clip with different encodings and attached the files to the GitHub comments. The one starting with msdk was generated with Media SDK encoding, and the one from amba was generated with Amba H2 encoding. [Uploading msdk_1080p5994.zip…]()
Hi @DaveHu-TVU and all,
We have root-caused this issue. We have updated the code and will open-source it soon.
To check this on your side, please test on the i9-12900H: run "export INTEL_MEDIA_RESET_WATCHDOG=0" first and then run the sample app commands. There should be no issues.
For Linux i7-12700, please refer to this known issue: https://community.intel.com/t5/Media-Intel-oneAPI-Video/GPU-hangs-when-decoding-2-HEVC-UHD-streams-444-10-bits-Y410/td-p/1431771
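If the sample apps are launched from a script rather than an interactive shell, the INTEL_MEDIA_RESET_WATCHDOG=0 check suggested above can be applied per process instead of through `export`. A minimal sketch, assuming only the variable name and value given in this comment; the helper names are hypothetical.

```python
import os
import subprocess


def env_with_watchdog_setting(base=None):
    """Copy the environment and set INTEL_MEDIA_RESET_WATCHDOG=0 as suggested in the thread."""
    env = dict(os.environ if base is None else base)
    env["INTEL_MEDIA_RESET_WATCHDOG"] = "0"
    return env


def run_with_watchdog_setting(argv):
    """Run one command with the variable set; returns the command's exit code."""
    return subprocess.run(argv, env=env_with_watchdog_setting()).returncode
```

Setting the variable only in the child's environment keeps the rest of the session unchanged, which makes it easier to compare runs with and without the setting.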
OK, Thanks for your help, @chenhao5-Intel
Which component impacted?
Decode, Encode
Is it regression? Good in old configuration?
Yes, it's good in old version
What happened?
CPU: 12th Gen Intel(R) Core(TM) i7-12700
kernel: Linux tvu-desktop 5.15.0-69-generic #76~20.04.1-Ubuntu SMP Mon Mar 20 15:54:19 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
vpl: 2023Q1(https://github.com/oneapi-src/oneVPL-intel-gpu/releases/tag/intel-onevpl-23.1.5)
Reproduction steps:
console1: /opt/intel/media/share/vpl/samples/_bin/sample_decode h265 -i v3_1080i5994.h265 -o /dev/null -timeout 10000
console2: /opt/intel/media/share/vpl/samples/_bin/sample_encode h264 -i cnn.yuv -o /dev/null -w 1920 -h 1080 -timeout 10000 -nv12
[ERROR], sts=MFX_ERR_GPU_HANG(-21), SynchronizeFirstTask, SyncOperation fail or timeout at /opt/src/vpl-dispatcher_src/tools/legacy/Sample_encode/src/pipeline_encode.cpp:178
[ERROR], sts=MFX_ERR_GPU_HANG(-21), GetFreeTask, m_TaskPool.SynchronizeFirstTask failed at /opt/src/vpl-dispatcher_src/tools/legacy/Sample_encode/src/pipeline_encode.cpp:2239
[ERROR], sts=MFX_ERR_GPU_HANG(-21), Run, m_pmfxENC->EncodeFrameAsync failed at /opt/src/vpl-dispatcher_src/tools/legacy/Sample_encode/src/pipeline_encode.cpp:2487
[ERROR], sts=MFX_ERR_GPU_HANG(-21), main, pPipeline->Run failed at /opt/src/vpl-dispatcher_src/tools/legacy/Sample_encode/src/Sample_encode.cpp:1970
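The two-console reproduction above can also be driven from a single script, which makes repeated runs easier. The sketch below starts both commands at the same time and collects each exit code and its combined output. The command lines are copied from the reproduction steps; the function names are hypothetical, and the binaries and input files are assumed to exist at the paths shown.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

SAMPLES = "/opt/intel/media/share/vpl/samples/_bin"  # path taken from the steps above
DECODE_CMD = [f"{SAMPLES}/sample_decode", "h265", "-i", "v3_1080i5994.h265",
              "-o", "/dev/null", "-timeout", "10000"]
ENCODE_CMD = [f"{SAMPLES}/sample_encode", "h264", "-i", "cnn.yuv", "-o", "/dev/null",
              "-w", "1920", "-h", "1080", "-timeout", "10000", "-nv12"]


def run_one(argv):
    """Run one command to completion; return (exit code, stdout and stderr merged)."""
    proc = subprocess.run(argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True)
    return proc.returncode, proc.stdout


def run_concurrently(commands):
    """Run all commands in parallel, one worker thread each; results keep input order."""
    with ThreadPoolExecutor(max_workers=max(1, len(commands))) as pool:
        return list(pool.map(run_one, commands))
```

Calling `run_concurrently([DECODE_CMD, ENCODE_CMD])` mirrors console1 and console2. Each command's output is drained in its own thread, so neither process stalls on a full pipe while the other is still running.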
What's the usage scenario when you are seeing the problem?
Immersive Media
What impacted?
After testing, we found that decoding H264/H265 encoded by Intel MSDK or VPL while encoding at the same time works, but decoding H265 encoded by our other platform (Amba H2) while encoding at the same time easily triggers MFX_ERR_GPU_HANG. v3_1080i5994.zip
Debug Information
Do you want to contribute a patch to fix the issue?
Yes, I'm glad to submit a patch to fix it