CESNET / UltraGrid

UltraGrid low-latency audio and video network transmission system
http://www.ultragrid.cz
Other
499 stars 53 forks source link

Error during QSV decoding.: GPU Hang (-21) #301

Closed alatteri closed 1 year ago

alatteri commented 1 year ago

Hello,

When using hevc_qsv there are times where the decoder errors with GPU Hang, for a bit, and then will often fix itself and start playing again. This has been happening the past few days, so it is not related to yesterdays RC mode stuff.

[2023-03-08 12:18:51] [Audio decoder] Volume: [0] -12.04/-0.43, [1] -11.83/-0.26, [2] -14.01/-0.43, [3] -21.09/-6.36, [4] -20.10/-5.05, [5] -19.82/-4.74, [6] -15.15/-1.36, [7] -14.83/-1.45 dBFS RMS/peak
[2023-03-08 12:18:53] [VULKAN_SDL2] 0 frames in 5.04736 seconds = 0 FPS
[2023-03-08 12:18:54] SSRC 0x39f15333: 1792/1792 packets received (100.0000%), 0 lost, max loss 0
[2023-03-08 12:18:56] [Audio decoder] Received 750086/750086 B, decoded 239616 samples in 5.01 sec.
[2023-03-08 12:18:56] [Audio decoder] Volume: [0] -24.07/-10.78, [1] -24.14/-10.93, [2] -34.51/-21.06, [3] -25.77/-13.51, [4] -33.04/-20.17, [5] -34.05/-20.58, [6] -23.12/-9.52, [7] -23.08/-10.52 dBFS RMS/peak
[2023-03-08 12:18:58] [VULKAN_SDL2] 0 frames in 5.04804 seconds = 0 FPS
[2023-03-08 12:18:59] SSRC 0x39f15333: 1920/1920 packets received (100.0000%), 0 lost, max loss 0
[2023-03-08 12:19:01] [Audio decoder] Received 366745/366745 B, decoded 242688 samples in 5.03 sec.
[2023-03-08 12:19:01] [Audio decoder] Volume: [0] -32.26/-13.25, [1] -32.83/-13.79, [2] -43.35/-23.78, [3] -47.12/-29.63, [4] -35.57/-17.18, [5] -36.25/-18.02, [6] -28.76/-10.59, [7] -29.23/-11.73 dBFS RMS/peak
[2023-03-08 12:19:03] [VULKAN_SDL2] 0 frames in 5.04837 seconds = 0 FPS
[2023-03-08 12:19:04] SSRC 0x39f15333: 1920/1920 packets received (100.0000%), 0 lost, max loss 0
[2023-03-08 12:19:06] [Audio decoder] Received 7608/7608 B, decoded 239616 samples in 5.01 sec.
[2023-03-08 12:19:06] [Audio decoder] Volume: [0] -inf/-inf, [1] -inf/-inf, [2] -inf/-inf, [3] -inf/-inf, [4] -inf/-inf, [5] -inf/-inf, [6] -inf/-inf, [7] -inf/-inf dBFS RMS/peak
[2023-03-08 12:19:08] Audio dec stats (cumulative): 50877 played / 50880 total audio frames
[2023-03-08 12:19:08] [VULKAN_SDL2] 0 frames in 5.04716 seconds = 0 FPS
[2023-03-08 12:19:09] SSRC 0x39f15333: 1792/1792 packets received (100.0000%), 0 lost, max loss 0
[2023-03-08 12:19:11] [Audio decoder] Received 293575/293575 B, decoded 241664 samples in 5.05 sec.
[2023-03-08 12:19:11] [Audio decoder] Volume: [0] -32.43/-15.24, [1] -33.94/-17.67, [2] -30.53/-7.87, [3] -43.87/-19.93, [4] -36.71/-19.93, [5] -39.74/-19.93, [6] -28.82/-11.04, [7] -30.24/-11.85 dBFS RMS/peak
[2023-03-08 12:19:13] [VULKAN_SDL2] 0 frames in 5.04774 seconds = 0 FPS
[2023-03-08 12:19:14] SSRC 0x39f15333: 1920/1920 packets received (100.0000%), 0 lost, max loss 0
[2023-03-08 12:19:15] Your computer may be too SLOW to play this !!!
[2023-03-08 12:19:15] Video dec stats (cumulative): 25500 total / 25417 disp / 83 drop / 12 corr / 0 missing.
[2023-03-08 12:19:15] [lavc hevc_qsv @ 0x7fc6fc153540] Error during QSV decoding.: GPU Hang (-21)
[2023-03-08 12:19:15] [lavd]  Error while decoding frame (rc == -5): Input/output error.
[2023-03-08 12:19:15] [lavc hevc_qsv @ 0x7fc6fc153540] Error during QSV decoding.: GPU Hang (-21)
[2023-03-08 12:19:15] [lavd]  Error while decoding frame (rc == -5): Input/output error.
[2023-03-08 12:19:15] [lavc hevc_qsv @ 0x7fc6fc153540] Error during QSV decoding.: GPU Hang (-21)
[2023-03-08 12:19:15] [lavd]  Error while decoding frame (rc == -5): Input/output error.
[2023-03-08 12:19:15] [lavc hevc_qsv @ 0x7fc6fc153540] Error during QSV decoding.: GPU Hang (-21)
[2023-03-08 12:19:15] [lavd]  Error while decoding frame (rc == -5): Input/output error.
[2023-03-08 12:19:15] [lavc hevc_qsv @ 0x7fc6fc153540] Error during QSV decoding.: GPU Hang (-21)
[2023-03-08 12:19:15] [lavc_common] Unexpected return value -5
[2023-03-08 12:19:15] [lavc hevc_qsv @ 0x7fc6fc153540] Error during QSV decoding.: GPU Hang (-21)
[2023-03-08 12:19:15] [lavc_common] Unexpected return value -5
[2023-03-08 12:19:15] Video dec stats (cumulative): 25504 total / 25417 disp / 87 drop / 12 corr / 0 missing.
[2023-03-08 12:19:15] Pbuf: total 56251076/56291584 packets received (99.92804%).
[2023-03-08 12:19:15] [VULKAN_SDL2] Supported codecs are: RGBA RGB UYVY YUYV Y416 R10k RG48 
[2023-03-08 12:19:15] SSRC 0x14ad1d6e: 128/128 packets received (100.0000%), 0 lost, max loss 0
[2023-03-08 12:19:15] SSRC 0x14ad1d6e: 8832/8832 packets received (100.0000%), 0 lost, max loss 0
[2023-03-08 12:19:15] SSRC 0x14ad1d6e: 1664/1664 packets received (100.0000%), 0 lost, max loss 0
[2023-03-08 12:19:15] SSRC 0x14ad1d6e: 1536/1536 packets received (100.0000%), 0 lost, max loss 0
[2023-03-08 12:19:15] SSRC 0x14ad1d6e: 735/16256 packets received (4.5214%), 15521 lost, max loss 64
[2023-03-08 12:19:15] [video dec.] New incoming video format detected: 3840x2160 @24.00p, codec H.265
[2023-03-08 12:19:15] [lavc hevc_qsv @ 0x7fc6fc138640] Invalid pkt_timebase, passing timestamps as-is.
[2023-03-08 12:19:15] [lavd] Using decoder: hevc_qsv
[2023-03-08 12:19:15] [lavd] Waiting for first HEVC VPS NALU...
[2023-03-08 12:19:15] [lavd] Waiting for first HEVC VPS NALU...
[2023-03-08 12:19:15] [lavd] Waiting for first HEVC VPS NALU...
[2023-03-08 12:19:15] [lavd] Waiting for first HEVC VPS NALU...
[2023-03-08 12:19:15] [lavd] Waiting for first HEVC VPS NALU...
[2023-03-08 12:19:15] [lavd] Waiting for first HEVC VPS NALU...
[2023-03-08 12:19:15] [lavd] Waiting for first HEVC VPS NALU...
[2023-03-08 12:19:15] [lavd] Waiting for first HEVC VPS NALU...
[2023-03-08 12:19:15] [lavd] Waiting for first HEVC VPS NALU...
[2023-03-08 12:19:15] [lavd] Waiting for first HEVC VPS NALU...
[2023-03-08 12:19:15] [lavd] Waiting for first HEVC VPS NALU...
[2023-03-08 12:19:15] [lavd] Waiting for first HEVC VPS NALU...
[2023-03-08 12:19:15] [lavd] Waiting for first HEVC VPS NALU...
[2023-03-08 12:19:15] [lavd] Waiting for first HEVC VPS NALU...
[2023-03-08 12:19:15] [lavd] Waiting for first HEVC VPS NALU...
[2023-03-08 12:19:15] libva info: VA-API version 1.17.0
[2023-03-08 12:19:15] libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
[2023-03-08 12:19:15] libva info: Found init function __vaDriverInit_1_17
[2023-03-08 12:19:15] libva info: va_openDriver() returns 0
[2023-03-08 12:19:15] libva info: VA-API version 1.17.0
[2023-03-08 12:19:15] libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
[2023-03-08 12:19:15] libva info: Found init function __vaDriverInit_1_17
[2023-03-08 12:19:15] libva info: va_openDriver() returns 0
[2023-03-08 12:19:15] [lavd]  Error while decoding frame (rc == -11): Resource temporarily unavailable.
[2023-03-08 12:19:15] [lavd]  Error while decoding frame (rc == -11): Resource temporarily unavailable.
[2023-03-08 12:19:15] [lavd]  Error while decoding frame (rc == -11): Resource temporarily unavailable.
[2023-03-08 12:19:15] [video dec.] Detected compression properties: YCbCr 4:4:4 10 bit
[2023-03-08 12:19:15] [video dec.] Detected compression properties: YCbCr 4:4:4 10 bit
[2023-03-08 12:19:15] [video dec.] Detected compression properties: YCbCr 4:4:4 10 bit
[2023-03-08 12:19:15] [video dec.] Detected compression properties: YCbCr 4:4:4 10 bit
[2023-03-08 12:19:15] [video dec.] Detected compression properties: YCbCr 4:4:4 10 bit
[2023-03-08 12:19:15] [lavc hevc_qsv @ 0x7fc6fc138640] A decode call did not consume any data: expect more data at input (-10)
[2023-03-08 12:19:15] [lavc hevc_qsv @ 0x7fc6fc138640] A decode call did not consume any data: expect more data at input (-10)
[2023-03-08 12:19:15] [lavc hevc_qsv @ 0x7fc6fc138640] A decode call did not consume any data: expect more data at input (-10)
[2023-03-08 12:19:15] [display] Successfully reconfigured display to 3840x2160 @24.00p, codec Y416
[2023-03-08 12:19:15] [lavc hevc_qsv @ 0x7fc6fc157c00] Invalid pkt_timebase, passing timestamps as-is.
[2023-03-08 12:19:15] [lavd] Using decoder: hevc_qsv
[2023-03-08 12:19:15] [lavc hevc_qsv @ 0x7fc6fc157c00] video_get_buffer: image parameters invalid
[2023-03-08 12:19:15] [lavc hevc_qsv @ 0x7fc6fc157c00] get_buffer() failed
[2023-03-08 12:19:15] [lavc_common] Unexpected return value -22
[2023-03-08 12:19:15] [lavc hevc_qsv @ 0x7fc6fc157c00] video_get_buffer: image parameters invalid
[2023-03-08 12:19:15] [lavc hevc_qsv @ 0x7fc6fc157c00] get_buffer() failed
[2023-03-08 12:19:15] [lavc_common] Unexpected return value -22
[2023-03-08 12:19:15] [display] Successfully reconfigured display to 3840x2160 @24.00p, codec Y416
[2023-03-08 12:19:15] [lavc hevc_qsv @ 0x7fc6fc16ef80] Invalid pkt_timebase, passing timestamps as-is.
[2023-03-08 12:19:15] [lavd] Using decoder: hevc_qsv
[2023-03-08 12:19:15] [lavc hevc_qsv @ 0x7fc6fc16ef80] video_get_buffer: image parameters invalid
[2023-03-08 12:19:15] [lavc hevc_qsv @ 0x7fc6fc16ef80] get_buffer() failed
[2023-03-08 12:19:15] [lavc_common] Unexpected return value -22
[2023-03-08 12:19:15] [lavc hevc_qsv @ 0x7fc6fc16ef80] video_get_buffer: image parameters invalid
[2023-03-08 12:19:15] [lavc hevc_qsv @ 0x7fc6fc16ef80] get_buffer() failed
[2023-03-08 12:19:15] [lavc_common] Unexpected return value -22
[2023-03-08 12:19:15] [display] Successfully reconfigured display to 3840x2160 @24.00p, codec Y416
[2023-03-08 12:19:15] [lavc hevc_qsv @ 0x7fc6fc178800] Invalid pkt_timebase, passing timestamps as-is.
[2023-03-08 12:19:15] [lavd] Using decoder: hevc_qsv
[2023-03-08 12:19:15] [lavc hevc_qsv @ 0x7fc6fc178800] video_get_buffer: image parameters invalid
[2023-03-08 12:19:15] [lavc hevc_qsv @ 0x7fc6fc178800] get_buffer() failed
[2023-03-08 12:19:15] [lavc_common] Unexpected return value -22
[2023-03-08 12:19:15] [lavc hevc_qsv @ 0x7fc6fc178800] video_get_buffer: image parameters invalid
[2023-03-08 12:19:15] [lavc hevc_qsv @ 0x7fc6fc178800] get_buffer() failed
[2023-03-08 12:19:15] [lavc_common] Unexpected return value -22
[2023-03-08 12:19:15] [display] Successfully reconfigured display to 3840x2160 @24.00p, codec Y416
[2023-03-08 12:19:15] [lavc hevc_qsv @ 0x7fc6fc002e80] Invalid pkt_timebase, passing timestamps as-is.
[2023-03-08 12:19:15] [lavd] Using decoder: hevc_qsv
[2023-03-08 12:19:15] [lavc hevc_qsv @ 0x7fc6fc002e80] video_get_buffer: image parameters invalid
[2023-03-08 12:19:15] [lavc hevc_qsv @ 0x7fc6fc002e80] get_buffer() failed
[2023-03-08 12:19:15] [lavc_common] Unexpected return value -22
[2023-03-08 12:19:15] [lavc hevc_qsv @ 0x7fc6fc002e80] video_get_buffer: image parameters invalid
[2023-03-08 12:19:15] [lavc hevc_qsv @ 0x7fc6fc002e80] get_buffer() failed
[2023-03-08 12:19:15] [lavc_common] Unexpected return value -22
[2023-03-08 12:19:15] [display] Successfully reconfigured display to 3840x2160 @24.00p, codec Y416
[2023-03-08 12:19:15] [lavc hevc_qsv @ 0x7fc6fc0042c0] Invalid pkt_timebase, passing timestamps as-is.
[2023-03-08 12:19:15] [lavd] Using decoder: hevc_qsv
[2023-03-08 12:19:15] [lavd] Waiting for first HEVC VPS NALU...
[2023-03-08 12:19:15] [lavd] Waiting for first HEVC VPS NALU...
[2023-03-08 12:19:15] [lavd] Falling back to software decoding!
[2023-03-08 12:19:15] [lavd] Selected pixel format: nv12
[2023-03-08 12:19:15] libva info: VA-API version 1.17.0
[2023-03-08 12:19:15] libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
[2023-03-08 12:19:15] libva info: Found init function __vaDriverInit_1_17
[2023-03-08 12:19:15] libva info: va_openDriver() returns 0
[2023-03-08 12:19:15] libva info: VA-API version 1.17.0
[2023-03-08 12:19:15] libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
[2023-03-08 12:19:15] libva info: Found init function __vaDriverInit_1_17
[2023-03-08 12:19:15] libva info: va_openDriver() returns 0
[2023-03-08 12:19:15] [lavd] Hw. acceleration requested but incoming video has not 4:2:0 subsampling, which is usually not supported by hw. accelerators
[2023-03-08 12:19:15] [lavd] Falling back to software decoding!
[2023-03-08 12:19:15] [lavd] Selected pixel format: xv30le
[2023-03-08 12:19:15] [lavd]  Error while decoding frame (rc == -11): Resource temporarily unavailable.
[2023-03-08 12:19:15] [lavd]  Error while decoding frame (rc == -11): Resource temporarily unavailable.
[2023-03-08 12:19:15] [lavd]  Error while decoding frame (rc == -11): Resource temporarily unavailable.
[2023-03-08 12:19:16] [Audio decoder] Received 734170/734170 B, decoded 242688 samples in 5.04 sec.
[2023-03-08 12:19:16] [Audio decoder] Volume: [0] -31.61/-16.37, [1] -33.22/-18.16, [2] -22.94/-7.03, [3] -60.25/-44.17, [4] -36.37/-22.10, [5] -40.36/-27.03, [6] -24.18/-10.05, [7] -24.80/-10.12 dBFS RMS/peak
[2023-03-08 12:19:18] [VULKAN_SDL2] 68 frames in 5.02827 seconds = 13.5235 FPS
[2023-03-08 12:19:20] SSRC 0x39f15333: 1920/1920 packets received (100.0000%), 0 lost, max loss 0
[2023-03-08 12:19:20] SSRC 0x14ad1d6e: 9750/9984 packets received (97.6562%), 234 lost, max loss 64
[2023-03-08 12:19:21] [Audio decoder] Received 737712/737712 B, decoded 239616 samples in 5.00 sec.
[2023-03-08 12:19:21] [Audio decoder] Volume: [0] -28.92/-13.55, [1] -29.91/-13.30, [2] -27.18/-8.77, [3] -89.33/-68.81, [4] -31.40/-16.03, [5] -32.54/-15.86, [6] -24.08/-6.83, [7] -24.75/-7.49 dBFS RMS/peak
[2023-03-08 12:19:23] [VULKAN_SDL2] 121 frames in 5.04075 seconds = 24.0044 FPS
[2023-03-08 12:19:25] SSRC 0x39f15333: 1792/1792 packets received (100.0000%), 0 lost, max loss 0
[2023-03-08 12:19:25] SSRC 0x14ad1d6e: 11776/11776 packets received (100.0000%), 0 lost, max loss 0
[2023-03-08 12:19:26] [Audio decoder] Received 740084/740084 B, decoded 241664 samples in 5.04 sec.
[2023-03-08 12:19:26] [Audio decoder] Volume: [0] -27.51/-11.82, [1] -28.22/-11.53, [2] -26.93/-9.19, [3] -88.70/-53.28, [4] -30.01/-14.41, [5] -30.87/-13.97, [6] -23.45/-9.59, [7] -24.03/-8.78 dBFS RMS/peak
[2023-03-08 12:19:28] [VULKAN_SDL2] 120 frames in 5.00187 seconds = 23.991 FPS
[2023-03-08 12:19:30] SSRC 0x39f15333: 1920/1920 packets received (100.0000%), 0 lost, max loss 0
[2023-03-08 12:19:30] SSRC 0x14ad1d6e: 5760/5760 packets received (100.0000%), 0 lost, max loss 0
[2023-03-08 12:19:31] [Audio decoder] Received 784460/784460 B, decoded 240640 samples in 5.00 sec.
[2023-03-08 12:19:31] [Audio decoder] Volume: [0] -19.49/-0.43, [1] -19.90/-0.59, [2] -22.85/-2.31, [3] -32.39/-16.77, [4] -28.49/-13.01, [5] -29.86/-13.19, [6] -20.72/-2.05, [7] -21.15/-2.17 dBFS RMS/peak
[2023-03-08 12:19:33] [VULKAN_SDL2] 121 frames in 5.03956 seconds = 24.01 FPS
[2023-03-08 12:19:35] SSRC 0x39f15333: 1920/1920 packets received (100.0000%), 0 lost, max loss 0
[2023-03-08 12:19:35] SSRC 0x14ad1d6e: 5504/5504 packets received (100.0000%), 0 lost, max loss 0
MartinPulec commented 1 year ago

Is this on the nuc with the kernel 6.0? Or on another one running at least Linux kernel 5.17? If not, it is an unsupported configuration by Intel. Vice versa we should try to reproduce that (ideally with FFmpeg directly) and reopen this issue.

alatteri commented 1 year ago

Yep... NUC12 with kernel 6.1.X. It is intermittent.

MartinPulec commented 1 year ago

I've tried to reproduce the issue with the NUC with following command:

uv -t testcard:pattern=text:codec=R10k:size=3840x2160:fps=24 -c libavcodec:encoder=hevc_qsv -d dummy \
   --param force-lavd-decoder=hevc_qsv

with kernel 6.0.19 and HuC enabled and it doesn't seem to be observable in this scenario after few hours. Are there any specific conditions that trigger the behavior, eg. the Vulkan display?

alatteri commented 1 year ago

I feel maybe this is related to non-graceful recovery from packet loss. I was seeing a lot of SRT packet loss when the only RC was CBR. I think with CBR and even a modest bitrate of 20-30M, the GOPs would spike much higher, and then SRT would have issues. I don't believe I've seen GPU Hang since switching to qvbr.