Closed jvrobert closed 2 years ago
(Thankfully we're well past the dark days of OpenCL 1.0)
You mean because of surface sharing vendor extensions?
Why "falling back on QSV"?
QSV seems to perform ~7% worse than the VAAPI codepath for HEVC decoding, for example, but seems to be unaffected by the GPU hang that is causing the initial crash I encountered here.
You'd lose some performance, but it would allow for some hardware acceleration via the dedicated quicksync hardware blocks while still being reliable.
(Thankfully we're well past the dark days of OpenCL 1.0)
You mean because of surface sharing vendor extensions?
More so a comment on how long it took for any of the major hardware vendors to get serious about media support on linux, specifically open source support back in the day
get around the lack of HDR support in the AMDGPU vaapi driver
They are doing so little to get better AMD support into ffmpeg...we just have minimal support for these..
In the sense of AMD putting in work or FFMPEG putting in work?
it's a little off topic, but ffmpeg supports AMF as much as possible (see https://ffmpeg.org/general.html#toc-AMD-AMF_002fVCE). The reason for much better support for intel chips has been the amount of work put in by intel engineers on the mailing lists, same as the work they put into the kernel for example
QSV seems to perform ~7% worse than the VAAPI codepath for HEVC decoding, for example, but seems to be unaffected by the GPU hang that is causing the initial crash I encountered here.
Oh, I totally forgot to ask about trying QSV, I thought I had..
In the sense of AMD putting in work or FFMPEG putting in work?
it's a little off topic, but ffmpeg supports AMF as much as possible (see ffmpeg.org/general.html#toc-AMD-AMF_002fVCE). The reason for much better support for intel chips has been the amount of work put in by intel engineers on the mailing lists, same as the work they put into the kernel for example
That's what I meant - there's not much effort taken when comparing, no filtering and no decoders. But anyway, that's really OT..
Back to Intel: we've been supporting it for many years on Linux and Windows, and I thought I knew everything about it, but until a few months ago I had never heard of GuC and HuC. So that whole situation appears a bit like a bad dream.. ;-)
QSV seems to perform ~7% worse than the VAAPI codepath for HEVC decoding, for example, but seems to be unaffected by the GPU hang that is causing the initial crash I encountered here.
Oh, I totally forgot to ask about trying QSV, I thought I had..
User said he had the same symptom with QSV (encoding and decoding). Unfortunately he also said that he's had enough of this and will return the device to the store.
User said he had the same symptom with QSV (encoding and decoding). Unfortunately he also said that he's had enough of this and will return the device to the store.
it seems I've been able to recreate the QSV behaviour on 5.18 rc-1 as well in very specific scenarios.
while running
ffmpeg -hwaccel qsv -c:v hevc_qsv -i hdr_source.mp4 -vf 'vpp_qsv=framerate=60,scale_qsv=w=1920:h=1080:format=nv12' -c:v h264_qsv output/output/sdr_out.mp4 -y
if the command is interrupted the GPU returns to an unrecoverable state
command is interrupted
q or sigterm?
q or sigterm?
I did it by accident, need to reboot the machine now and double check 😅
Edit: it's neither. ~15-20 seconds into transcoding I received
[Parsed_vpp_qsv_0 @ 0x23ecc00] Error running VPP: unknown error (-21)890.5kbits/s speed=2.08x
Error while filtering: Unknown error occurred
Failed to inject frame into filter network: Unknown error occurred
Error while processing the decoded data for stream #0:0
and the application did not exit.
it then needs to be hard exited via 3 system signals.
MFX_ERR_GPU_HANG = -21, /* device operation failure caused by GPU hang */
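For anyone scanning longer logs for the same failure, here is a minimal sketch that pulls the numeric status out of the "Error running VPP" line and names it. The helper names `mfx_codes` and `mfx_name` are made up for illustration, and only the -21 mapping comes from this thread; other codes are deliberately reported as unknown.

```shell
# Extract the numeric status that ffmpeg prints in parentheses after a VPP error.
# Reads log text on stdin and prints one code per matching line.
mfx_codes() {
    sed -n 's/.*Error running VPP:[^(]*(\(-\{0,1\}[0-9][0-9]*\)).*/\1/p'
}

# Map a status code to a name. Only -21 is taken from this thread;
# anything else is reported as unknown rather than guessed.
mfx_name() {
    case "$1" in
        -21) echo "MFX_ERR_GPU_HANG" ;;
        *)   echo "unknown ($1)" ;;
    esac
}

# Sample line copied from the log above (progress output is fused onto it).
line='[Parsed_vpp_qsv_0 @ 0x23ecc00] Error running VPP: unknown error (-21)890.5kbits/s speed=2.08x'
code=$(printf '%s\n' "$line" | mfx_codes)
mfx_name "$code"
```

In practice the ffmpeg stderr would be piped into `mfx_codes` instead of the sample line.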
Have you ever tried what happens when you disable HuC and GuC?
Have you ever tried what happens when you disable HuC and GuC?
Without the firmware loadout the iGPUs would not be available at all. They are loaded in by the kernel as a driver via i915.
further work:
simplifying the command to ffmpeg -hwaccel qsv -c:v hevc_qsv -hwaccel_output_format qsv -i hdr_source.mp4 -vf 'scale_qsv=w=1920:h=1080:format=nv12' -c:v h264_qsv output/output/sdr_out.mp4 -y
Crashes in ~ 3-4 seconds, compared to about 15-20 before.
Without the firmware loadout the iGPUs would not be available at all. They are loaded in by the kernel as a driver via i915.
I mean to set i915.enable_guc=0
It's not a requirement - at least not on TGL and below.
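A rough sketch for checking what the module parameter is currently set to. It assumes the i915 parameter description, where `enable_guc` is a bitmask (-1 = auto, 0 = disabled, bit 0 = GuC submission, bit 1 = HuC load); `describe_enable_guc` is a hypothetical helper name, and the sysfs file is usually readable by root only.

```shell
# Turn an enable_guc value into words.
# Assumed meaning (from the i915 parameter description):
#   -1 = auto, 0 = disabled, bit 0 = GuC submission, bit 1 = HuC load
describe_enable_guc() {
    v=$1
    if [ "$v" -lt 0 ]; then
        echo "auto (driver picks per platform)"
        return
    fi
    if [ "$v" -eq 0 ]; then
        echo "disabled"
        return
    fi
    out=""
    if [ $((v & 1)) -ne 0 ]; then
        out="GuC submission"
    fi
    if [ $((v & 2)) -ne 0 ]; then
        out="${out:+$out + }HuC load"
    fi
    echo "${out:-unrecognised value $v}"
}

# Report the live value when the parameter file is readable (normally needs root).
param=/sys/module/i915/parameters/enable_guc
if [ -r "$param" ]; then
    describe_enable_guc "$(cat "$param")"
fi
```

Comparing this value with the `guc_info` and `huc_info` dumps helps confirm whether the modprobe option actually took effect after a reboot.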
I had realized that I have a notebook here with the exact same graphics as the other user (TGL). And it just works fine.
The one thing that's special about it: I haven't updated the OS or OS components for more than a year. GuC and HuC are off
I had realized that I have a notebook here with the exact same graphics as the other user (TGL). And it just works fine.
Can you double check that it is using the iGPU? Also can you check using the commands above and see how your chip reacts?
RE GuC/HuC
Interesting. From the documentation, HuC and GuC support seems to be needed for some features. I'll try it now.
HuC is mandatory for JSL and EHL, because these two don't support "Es" encoding but only "E" (low power..)
Es (Hardware(PAK) + Shader(media kernel+VME) Encoding) is not supported on JSL & EHL, only E (Hardware Encoding, Low Power Encoding(VDEnc/Huc)) is supported, however E depends on GuC / HuC firmwares.
https://github.com/intel/media-driver#decodingencoding-features
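To see which profiles on a given machine go through the low-power ("E") path, one option is to filter the vainfo listing for the `VAEntrypointEncSliceLP` entrypoint. This is only a sketch; `lp_encode_profiles` is a made-up helper name, and it assumes the usual "Profile : Entrypoint" line layout that vainfo prints.

```shell
# Print the VA-API profiles that expose the low-power encode entrypoint.
# Reads vainfo output (one "Profile : Entrypoint" pair per line) on stdin.
lp_encode_profiles() {
    awk -F: '/VAEntrypointEncSliceLP/ { gsub(/[ \t]/, "", $1); print $1 }'
}

# Small sample in the same layout as vainfo output.
printf '%s\n' \
    'VAProfileH264Main : VAEntrypointVLD' \
    'VAProfileH264Main : VAEntrypointEncSlice' \
    'VAProfileH264Main : VAEntrypointEncSliceLP' \
    'VAProfileVP9Profile0 : VAEntrypointEncSliceLP' |
    lp_encode_profiles

# On a real system: vainfo 2>/dev/null | lp_encode_profiles
```

Profiles that only appear with `VAEntrypointEncSliceLP` (and not `VAEntrypointEncSlice`) are the ones that depend on the low-power path.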
Can you double check that it is using the iGPU? Also can you check using the commands above and see how your chip reacts?
I don't need to check, because it still shows a tonemap_vaapi bug that I reported a year ago. It's fixed, but I never updated the machine, so it's still showing ;-)
Also:
Could you load an ISO on a usb drive perhaps? as a way to do A B testing of some sort?
I don't get that - why?
I don't get that - why?
if it is a question of version, bisecting where the problem began could be a way to accelerate tracking down and closing this issue (where the regression was introduced etc.)
edit:
for example, what is the first operating system/software version that can and cannot run
ffmpeg -hwaccel qsv -c:v hevc_qsv -i hdr_source.mp4 -vf 'hwupload=extra_hw_frames=64,format=qsv' -c:v hevc_qsv -profile:v main10 output/output/sdr_out.mp4 -y
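The bisecting idea can be sketched as a binary search over an ordered list of versions. This assumes every version before the regression works and every version from it onward fails; `first_bad` and `demo_is_good` are hypothetical names, and the demo predicate is a stand-in (a real check would boot each kernel and run the ffmpeg command above).

```shell
# first_bad IS_GOOD_CMD V1 V2 ... Vn
# Assumes the list is ordered oldest to newest, and that every version
# before the regression is good and every version from it onward is bad.
# Prints the first bad version, or nothing if all are good.
first_bad() {
    is_good=$1; shift
    lo=1                 # lowest index that might be the first bad one
    hi=$(( $# + 1 ))     # one past the end means "no bad version found"
    while [ "$lo" -lt "$hi" ]; do
        mid=$(( (lo + hi) / 2 ))
        eval "candidate=\${$mid}"
        if "$is_good" "$candidate"; then
            lo=$(( mid + 1 ))
        else
            hi=$mid
        fi
    done
    if [ "$lo" -le "$#" ]; then
        eval "echo \"\${$lo}\""
    fi
}

# Stand-in predicate for the demo: pretend 5.8 and 5.9 work and later ones hang.
demo_is_good() {
    case "$1" in
        5.8|5.9) return 0 ;;
        *)       return 1 ;;
    esac
}

first_bad demo_is_good 5.8 5.9 5.10 5.11 5.12 5.13
```

With six candidates this needs three test runs instead of six, which matters when each run means a reboot.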
Ah - you mean ISO of OS? I thought you meant video ISO like Bluray..
Ah - you mean ISO of OS? I thought you meant video ISO like Bluray..
LOL, my bad, the joys of mixed terminologies. Yes, I mean operating system/software version/system firmware
I'm not sure I have the enthusiasm to try different OS versions.
But the one I have is: Operating system: Linux version 5.8.0-53-generic (buildd@lcy01-amd64-012) (gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #60~20.04.1-
Driver Info: 'TigerLake-LP GT2 Iris Xe Graphics' Id:39497 (Driver: Intel iHD driver for Intel(R) Gen Graphics - 21.2.0 (4436d2f), Vendor: Intel Corporation)
No problem with QSV either, including OpenCL:
No problem with QSV either, including OpenCL:
Interesting, thanks for this!
Could you send the results of
sudo cat /sys/kernel/debug/dri/0/gt/uc/huc_info
sudo cat /sys/kernel/debug/dri/0/gt/uc/guc_info
(This is me doing a sanity check for myself)
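To make the two dumps easier to compare between machines, a short sketch that reduces each one to its firmware file and load status. The `uc_summary` name is made up, and the field layout ("... firmware: <file>" and "status: <STATE>") is assumed from the dumps posted later in this thread.

```shell
# Summarise one guc_info/huc_info dump read from stdin as "<firmware> <status>".
# Only the unprefixed "status:" line is used, so "GuC status 0x...:" and
# "HuC status: 0x..." register lines are ignored.
uc_summary() {
    awk '
        /firmware:/        { fw = $NF }
        /^[ \t]*status:/   { st = $2 }
        END                { print fw, st }
    '
}

# Sample lines in the layout seen in this thread.
printf '%s\n' \
    'GuC firmware: i915/tgl_guc_69.0.3.bin' \
    'status: RUNNING' \
    'version: wanted 69.0, found 69.0' \
    'GuC status 0x8003f0ec:' |
    uc_summary

# On a real system (needs root and debugfs):
#   sudo cat /sys/kernel/debug/dri/0/gt/uc/guc_info | uc_summary
```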
Would it also be possible to send the corresponding ffmpeg command of the pipeline from the diagram above? Just to make sure that anything I try on my end is identical.
Ideally we both also use the same source files (perhaps use the Sony demo file I linked near the top of the issue?)
sudo cat /sys/kernel/debug/dri/0/gt/uc/huc_info
sudo cat /sys/kernel/debug/dri/0/gt/uc/guc_info
This was the first I had done and why I said that both are disabled
/opt/emby-server/bin/ffmpeg -loglevel +timing -y -print_graphs_file "/var/lib/emby/logs/ffmpeg-transcode-191bc797-9e2d-4a71-acae-91af9efc102f_1graph.txt" -copyts -start_at_zero -qsv_device /dev/dri/renderD128 -f mp4 -c:v:0 hevc_qsv -hwaccel:v:0 qsv -i "/home/jay/Videos/HDR/Sony Swordsmith HDR UHD 4K Demo.mp4" -filter_complex "[0:0]vpp_qsv@f1=width=640:height=360,setparams@f2=color_primaries=bt2020:color_trc=smpte2084:colorspace=bt2020nc,hwmap@f3=mode=+read:derive_device=opencl,tonemap_opencl@f4=tonemap=hable:format=nv12:desat=0,hwmap@f5=mode=+write:derive_device=qsv:reverse=1:extra_hw_frames=16[f5_out0]" -map [f5_out0] -map 0:1 -sn -c:v:0 h264_qsv -b:v:0 1808000 -g:v:0 180 -maxrate:v:0 1808000 -bufsize:v:0 3616000 -sc_threshold:v:0 0 -level:v:0 31 -keyint_min:v:0 180 -profile:v:0 high -c:a:0 copy -metadata:s:a:0 language=eng -disposition:a:0 default -max_delay 5000000 -avoid_negative_ts disabled -f segment -map_metadata -1 -map_chapters -1 -segment_format mpegts -segment_list "/var/lib/emby/transcoding-temp/DA6020.m3u8" -segment_list_type m3u8 -segment_time 3 -segment_start_number 0 -individual_header_trailer 0 -write_header_trailer 0 -segment_write_temp 1 "/var/lib/emby/transcoding-temp/DA6020_%d.ts"
I'm not sure whether it will work, though. The opencl tonemap filter is different from the regular one and I'm not sure whether the opencl-qsv mapping is working in normal ffmpeg for 10bit
sudo cat /sys/kernel/debug/dri/0/gt/uc/huc_info
sudo cat /sys/kernel/debug/dri/0/gt/uc/guc_info
This was the first I had done and why I said that both are disabled
Hadn't realized that that was how you had confirmed it.
Once we have the same commands and same source files, that should make testing much easier
Edit: just saw the command.
I'll see how much of it is portable/if it's portable.
Probably it's better with a VAAPI command because we have less modifications there:
/opt/emby-server/bin/ffmpeg -y -copyts -start_at_zero -f mp4 -c:v:0 hevc -hwaccel:v:0 vaapi -hwaccel_device:v:0 /dev/dri/renderD128 -hwaccel_output_format:v:0 vaapi -i "/home/jay/Videos/HDR/Sony Swordsmith HDR UHD 4K Demo.mp4" -filter_complex "[0:0]scale_vaapi@f1=w=640:h=360,tonemap_vaapi@f2=format=nv12:matrix=bt709:primaries=bt709:transfer=bt709[f2_out0]" -map [f2_out0] -map 0:1 -sn -c:v:0 h264_vaapi -b:v:0 1808000 -g:v:0 180 -maxrate:v:0 1808000 -bufsize:v:0 3616000 -sc_threshold:v:0 0 -keyint_min:v:0 180 -profile:v:0 high -level:v:0 3.1 -c:a:0 copy -metadata:s:a:0 language=eng -disposition:a:0 default -max_delay 5000000 -avoid_negative_ts disabled -f segment -map_metadata -1 -map_chapters -1 -segment_format mpegts -segment_list "/var/lib/emby/transcoding-temp/794DC8.m3u8" -segment_list_type m3u8 -segment_time 3 -segment_start_number 0 -individual_header_trailer 0 -write_header_trailer 0 "/var/lib/emby/transcoding-temp/794DC8_%d.ts"
Which does this:
Fair point, vaapi first, then QSV?
Also, I need to head soon, but to summarize what we think we know so far:
Alderlake: vaapi has issues prior to kernel 5.18rc1
Qsv has issues in 5.16, 5.17 and 5.18rc1
Tiger lake:
Vaapi works fine as of at least 5.8 Ubuntu, but does not work as of 5.10 Debian
Same applies to QSV
Have you tried with GuC/HuC off?
Also, since we have the location of the emby binary, we could summon it directly.
Alternatively would you be able to build ffmpeg from source?
Alternatively would you be able to build ffmpeg from source?
sure, but why?
Also, since we have the location of the emby binary, we could summon it directly.
It's a bit tricky. You need to run the 'ffmpeg-emby' stub next to the binary, because this will setup all according to the context of the package (which contains its own versions of libva, iHD, etc..)
Alternatively would you be able to build ffmpeg from source?
sure, but why?
If I have time I'm going to dig through the mailing lists and see anything related that might not be in the 5.0.x release yet.
Speaking of which, is emby on the 5.x ffmpeg branch? Or still on 4.x?
It's not a matter of which exact command and it's not a matter of which ffmpeg branch. Either it's working or not (when both, decoding and encoding are done in hw).
Have you tried with GuC/HuC off?
AFK currently, replying from a cab 😅
OMG - the very one and only question and you didn't try....
What I can say for sure is that this is something that must have been introduced just very recently. Otherwise we would have had reports about it before.
It's not a matter of which exact command and it's not a matter of which ffmpeg branch.
Either it's working or not (when both, decoding and encoding are done in hw).
I'm trying to control for as many variables and have as much data as possible to narrow down the issue.
Part of my exhaustive questioning is because I noticed that the hardware, even within different versions of using the same hardware blocks, reacted differently in its crash behaviour. See the vpp_scale command above for example. Still using Qsv encode and decode, failing the same way as other Qsv encode and decode blocks on the same file, and the frame rate command it's executing is actually a no-op because the footage is already 60p. Yet it failed almost instantly
Will be back home in a few, picking someone up
Part of my exhaustive questioning is because I noticed that the hardware, even within different versions of using the same hardware blocks, reacted differently in its crash behaviour. See the vpp_scale command above for example. Still using Qsv encode and decode, failing the same way as other Qsv encode and decode blocks on the same file, and the frame rate command it's executing is actually a no-op because the footage is already 60p. Yet it failed almost instantly
Don't open up so many different dimensions at the same time. You will only confuse yourself and lose focus. I had to actually skip reading many of the posts at the beginning because every second message was about something different and I wasn't able to sort it (mentally) in a reasonable way. Better focus on something small and simple, stick to it and try under different conditions.
running ffmpeg -hwaccel qsv -c:v hevc_qsv -hwaccel_output_format qsv -i hdr_source.mp4 -vf 'hwupload=extra_hw_frames=64,format=qsv' -c:v hevc_qsv -b:v 10M -profile:v main10 output/output/sdr_out.mp4 -y
Does seem to work fine. It can also be interrupted without issue.
Logs:
sudo cat /etc/modprobe.d/i915.conf
options i915 enable_guc=0
sudo cat /sys/kernel/debug/dri/0/gt/uc/guc_info
GuC firmware: i915/tgl_guc_69.0.3.bin
status: RUNNING
version: wanted 69.0, found 69.0
uCode: 342912 bytes
RSA: 256 bytes
GuC status 0x8003f0ec:
Bootrom status = 0x76
uKernel status = 0xf0
MIA Core status = 0x3
Scratch registers:
0: 0x0
1: 0x163fdf
2: 0x40000
3: 0x4000
4: 0x40
5: 0x2ec8
6: 0x4680000c
7: 0x0
8: 0x0
9: 0x0
10: 0x0
11: 0x0
12: 0x0
13: 0x0
14: 0x0
15: 0x0
GuC log relay not created
sudo cat /sys/kernel/debug/dri/0/gt/uc/huc_info
HuC firmware: i915/tgl_huc_7.9.3.bin
status: RUNNING
version: wanted 7.9, found 7.9
uCode: 589504 bytes
RSA: 256 bytes
HuC status: 0x00090001
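The three sub-fields printed under "GuC status" can be recomputed from the raw register value, which is a handy sanity check when comparing dumps. The bit positions below (bootrom in bits 1 to 7, uKernel in bits 8 to 15, MIA core in bits 16 to 18) are an assumption based on the i915 register layout; they do reproduce the values in the dump above. `decode_guc_status` is a hypothetical helper name.

```shell
# Split a GuC status register value into its sub-fields.
# Assumed layout: bootrom = bits 1-7, uKernel = bits 8-15, MIA core = bits 16-18.
decode_guc_status() {
    v=$(( $1 ))
    printf 'bootrom=0x%x ukernel=0x%x mia=0x%x\n' \
        $(( (v >> 1) & 0x7f )) \
        $(( (v >> 8) & 0xff )) \
        $(( (v >> 16) & 0x7 ))
}

# Value taken from the guc_info dump above.
decode_guc_status 0x8003f0ec
# prints: bootrom=0x76 ukernel=0xf0 mia=0x3
```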
Why does it say "RUNNING"?
Why does it say "RUNNING"?
Frankly I'm not certain on this one, but from this: https://01.org/linuxgraphics/downloads/firmware?langredirect=1 it looks like for ADL-P and above, as of 5.14, it's automatically enabled when available.
I'm on ADL-S, which does not have it on automatically.
We'd have to have one of the Intel Engineers comment.
Here's the relevant support matrix from the PDF:
Why does it say "RUNNING"?
Frankly I'm not certain on this one.
Even when enabled by default, it should get disabled when specifying 0
Did you reboot after changing the setting?
Why does it say "RUNNING"?
Frankly I'm not certain on this one.
Even when enabled by default, it should get disabled when specifying 0
Did you reboot after changing the setting?
yes I did.
By doing so, the issue also seems to have gone away.
Purely a theory:
Perhaps under normal circumstances the firmware is running, but only loaded when its specialty features are needed?
But when force loaded, the firmware causes conflicts with the primary driver?
HuC is required for low-power encoding and bitrate control, but low-power encoding is not what you normally get with all those ffmpeg commands. HuC is only mandatory with JSL and EHL, because these only support low-power encoding
Perhaps under normal circumstances the firmware is running, but only loaded when its specialty features are needed? But when force loaded, the firmware causes conflicts with the primary driver?
I can't imagine that. Either it gets loaded or not.
You can use dmesg | grep HuC
to get more information and try to find out the difference.
I can't imagine that. Either it gets loaded or not.
[ +0.002855] i915 0000:00:02.0: [drm] GuC firmware i915/tgl_guc_69.0.3.bin version 69.0
[ +0.000002] i915 0000:00:02.0: [drm] HuC firmware i915/tgl_huc_7.9.3.bin version 7.9
[ +0.013152] i915 0000:00:02.0: [drm] HuC authenticated
[ +0.000000] i915 0000:00:02.0: [drm] GuC submission disabled
[ +0.000001] i915 0000:00:02.0: [drm] GuC SLPC disabled
[ +0.000705] i915 0000:00:02.0: [drm] Protected Xe Path (PXP) protected content support initialized
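To compare boots with and without the kernel option, the relevant dmesg lines can be boiled down to two facts: whether GuC submission is on, and whether HuC was authenticated. A small sketch, assuming the message wording shown in the output above; `uc_boot_state` is a made-up name.

```shell
# Reduce i915 boot messages (on stdin) to the GuC submission state and
# whether HuC authentication was reported.
uc_boot_state() {
    awk '
        /GuC submission /    { guc = $NF }
        /HuC authenticated/  { huc = "authenticated" }
        END {
            printf "guc_submission=%s huc=%s\n",
                (guc ? guc : "unknown"), (huc ? huc : "not-seen")
        }
    '
}

# Sample lines copied from the dmesg output above.
printf '%s\n' \
    '[ +0.013152] i915 0000:00:02.0: [drm] HuC authenticated' \
    '[ +0.000000] i915 0000:00:02.0: [drm] GuC submission disabled' \
    '[ +0.000001] i915 0000:00:02.0: [drm] GuC SLPC disabled' |
    uc_boot_state

# On a real system: sudo dmesg | uc_boot_state
```

Running it once per configuration gives a one-line result that is easy to paste side by side.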
And when you revert the kernel option change?
And when you revert the kernel option change?
as in
sudo cat /etc/modprobe.d/i915.conf
options i915 enable_guc=0
?
that was prior to reboot.
I mean back to the state before where you were seeing the gpu errors.
System information
model name : 12th Gen Intel(R) Core(TM) i7-12700K
00:02.0 VGA compatible controller [0300]: Intel Corporation AlderLake-S GT1 [8086:4680] (rev 0c)
no display, render only in ffmpeg
Issue behavior
Describe the current behavior
When using the latest compiled media driver and ffmpeg 5 (also happens on 4.x) with latest drm-tip kernel/linuxfirmware bins (also happens on Ubuntu 20.04 HW kernel), ffmpeg (running under Frigate NVR) will support hw acceleration using either qsv or vaapi decode for somewhere between 10-30 minutes (usually, sometimes longer). After that, it crashes the GPU with this error:
[ 4009.472554] i915 0000:00:02.0: [drm] Resetting vcs1 for preemption time out
[ 4009.474067] i915 0000:00:02.0: [drm] GPU HANG: ecode 12:4:28fffffd, in ffmpeg [27844]
[ 4020.835642] i915 0000:00:02.0: [drm] GPU HANG: ecode 12:4:28fffffd, in ffmpeg [27844]
[ 4020.836679] i915 0000:00:02.0: [drm] Resetting vcs1 for stopped heartbeat on vcs1
[ 4020.837224] i915 0000:00:02.0: [drm] Resetting chip for stopped heartbeat on vcs1
[ 4020.939613] [drm:uc_sanitize [i915]] ERROR Failed to reset GuC, ret = -110
[ 4021.028683] i915 0000:00:02.0: [drm] ERROR Failed to reset chip
[ 4021.028762] i915 0000:00:02.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by intel_gt_reset+0x25b/0x2d0 [i915]
[ 4021.131605] [drm:uc_sanitize [i915]] ERROR Failed to reset GuC, ret = -110
[ 4021.133494] i915 0000:00:02.0: [drm] ffmpeg[27844] context reset due to GPU hang
[ 4023.672616] ffmpeg[27894]: segfault at 0 ip 0000000000000000 sp 00007fff30a1add8 error 14 in ffmpeg[556214dda000+b000]
ffmpeg settings: -hwaccel vaapi -hwaccel_device /dev/dri/renderD128 -hwaccel_output_format yuv420p
Describe the expected behavior
Not crash.
Debug information
Note re: vainfo, I also tried a new container with ffmpeg and compiled latest version of vainfo, media driver, gmm, everything - same issue.
root@6d859362545b:/opt/frigate# vainfo
error: XDG_RUNTIME_DIR not set in the environment.
error: can't connect to X server!
libva info: VA-API version 1.12.0
libva info: User environment variable requested driver 'iHD'
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_1_12
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.12 (libva 2.12.0)
vainfo: Driver version: Intel iHD driver for Intel(R) Gen Graphics - 21.3.3 (6fdf88c)
vainfo: Supported profile and entrypoints
VAProfileNone : VAEntrypointVideoProc
VAProfileNone : VAEntrypointStats
VAProfileMPEG2Simple : VAEntrypointVLD
VAProfileMPEG2Simple : VAEntrypointEncSlice
VAProfileMPEG2Main : VAEntrypointVLD
VAProfileMPEG2Main : VAEntrypointEncSlice
VAProfileH264Main : VAEntrypointVLD
VAProfileH264Main : VAEntrypointEncSlice
VAProfileH264Main : VAEntrypointFEI
VAProfileH264Main : VAEntrypointEncSliceLP
VAProfileH264High : VAEntrypointVLD
VAProfileH264High : VAEntrypointEncSlice
VAProfileH264High : VAEntrypointFEI
VAProfileH264High : VAEntrypointEncSliceLP
VAProfileVC1Simple : VAEntrypointVLD
VAProfileVC1Main : VAEntrypointVLD
VAProfileVC1Advanced : VAEntrypointVLD
VAProfileJPEGBaseline : VAEntrypointVLD
VAProfileJPEGBaseline : VAEntrypointEncPicture
VAProfileH264ConstrainedBaseline: VAEntrypointVLD
VAProfileH264ConstrainedBaseline: VAEntrypointEncSlice
VAProfileH264ConstrainedBaseline: VAEntrypointFEI
VAProfileH264ConstrainedBaseline: VAEntrypointEncSliceLP
VAProfileHEVCMain : VAEntrypointVLD
VAProfileHEVCMain : VAEntrypointEncSlice
VAProfileHEVCMain : VAEntrypointFEI
VAProfileHEVCMain : VAEntrypointEncSliceLP
VAProfileHEVCMain10 : VAEntrypointVLD
VAProfileHEVCMain10 : VAEntrypointEncSlice
VAProfileHEVCMain10 : VAEntrypointEncSliceLP
VAProfileVP9Profile0 : VAEntrypointVLD
VAProfileVP9Profile0 : VAEntrypointEncSliceLP
VAProfileVP9Profile1 : VAEntrypointVLD
VAProfileVP9Profile1 : VAEntrypointEncSliceLP
VAProfileVP9Profile2 : VAEntrypointVLD
VAProfileVP9Profile2 : VAEntrypointEncSliceLP
VAProfileVP9Profile3 : VAEntrypointVLD
VAProfileVP9Profile3 : VAEntrypointEncSliceLP
VAProfileHEVCMain12 : VAEntrypointVLD
VAProfileHEVCMain12 : VAEntrypointEncSlice
VAProfileHEVCMain422_10 : VAEntrypointVLD
VAProfileHEVCMain422_10 : VAEntrypointEncSlice
VAProfileHEVCMain422_12 : VAEntrypointVLD
VAProfileHEVCMain422_12 : VAEntrypointEncSlice
VAProfileHEVCMain444 : VAEntrypointVLD
VAProfileHEVCMain444 : VAEntrypointEncSliceLP
VAProfileHEVCMain444_10 : VAEntrypointVLD
VAProfileHEVCMain444_10 : VAEntrypointEncSliceLP
VAProfileHEVCMain444_12 : VAEntrypointVLD
VAProfileHEVCSccMain : VAEntrypointVLD
VAProfileHEVCSccMain : VAEntrypointEncSliceLP
VAProfileHEVCSccMain10 : VAEntrypointVLD
VAProfileHEVCSccMain10 : VAEntrypointEncSliceLP
VAProfileHEVCSccMain444 : VAEntrypointVLD
VAProfileHEVCSccMain444 : VAEntrypointEncSliceLP
VAProfileAV1Profile0 : VAEntrypointVLD
VAProfileHEVCSccMain444_10 : VAEntrypointVLD
VAProfileHEVCSccMain444_10 : VAEntrypointEncSliceLP
export LIBVA_TRACE=/tmp/libva_trace.log
first, then execute the case.
Only useful logs from libva:
/tmp/libva_trace.log.184412.thd-0x0000098e:[54444.273421][ctx 0x10000000]==========va_TraceEndPicture
/tmp/libva_trace.log.184412.thd-0x0000098e:[54444.273422][ctx 0x10000000] context = 0x10000000
/tmp/libva_trace.log.184412.thd-0x0000098e:[54444.273422][ctx 0x10000000] render_targets = 0x0000001c
/tmp/libva_trace.log.184412.thd-0x0000098e:[54444.273504][ctx none]=========vaEndPicture ret = VA_STATUS_ERROR_DECODING_ERROR, internal decoding error
/tmp/libva_trace.log.184412.thd-0x0000098f:[53500.245549][ctx 0x10000000]==========va_TraceBeginPicture
/tmp/libva_trace.log.184412.thd-0x0000098f:[53500.245549][ctx 0x10000000] context = 0x10000000
/tmp/libva_trace.log.184412.thd-0x0000098f:[53500.245549][ctx 0x10000000] render_targets = 0x00000019
/tmp/libva_trace.log.184412.thd-0x0000098f:[53500.245549][ctx 0x10000000] frame_count = #7
/tmp/libva_trace.log.184412.thd-0x0000098f:[53500.245558][ctx 0x10000000]==========va_TraceRenderPicture
/tmp/libva_trace.log.184412.thd-0x0000098f:[53500.245558][ctx 0x10000000] context = 0x10000000
/tmp/libva_trace.log.184412.thd-0x0000098f:[53500.245558][ctx 0x10000000] num_buffers = 2
/tmp/libva_trace.log.184412.thd-0x0000098f:[53500.245559][ctx 0x10000000] --------------
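Since the trace files get large quickly, a filter that keeps only the calls returning an error status can save some scrolling. This is a sketch that assumes the "=====vaCall ret = VA_STATUS_ERROR_..." layout seen in the excerpt above; `va_trace_errors` is a hypothetical helper name.

```shell
# Print "<va call> <error status>" for every failed call in a libva trace.
# Reads trace text on stdin; lines without an error status are dropped.
va_trace_errors() {
    sed -n 's/.*=\(va[A-Za-z]*\) ret = \(VA_STATUS_ERROR_[A-Z_]*\).*/\1 \2/p'
}

# Sample lines copied from the trace above.
printf '%s\n' \
    '/tmp/libva_trace.log.184412.thd-0x0000098e:[54444.273422][ctx 0x10000000] render_targets = 0x0000001c' \
    '/tmp/libva_trace.log.184412.thd-0x0000098e:[54444.273504][ctx none]=========vaEndPicture ret = VA_STATUS_ERROR_DECODING_ERROR, internal decoding error' |
    va_trace_errors

# On a real system, with a count per distinct failure:
#   cat /tmp/libva_trace.log.* | va_trace_errors | sort | uniq -c
```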
Could you attach dmesg log if it's GPU hang by
dmesg >dmesg.log 2>&1
?
[155523.319847] i915 0000:00:02.0: [drm:i915_gem_context_create_ioctl [i915]] HW context 16 created
[155534.199385] i915 0000:00:02.0: [drm] GPU HANG: ecode 12:4:28fffffd, in ffmpeg [102504]
[155534.200411] i915 0000:00:02.0: [drm] Resetting vcs0 for stopped heartbeat on vcs0
[155534.200945] i915 0000:00:02.0: [drm] Resetting chip for stopped heartbeat on vcs0
[155534.302952] [drm:uc_sanitize [i915]] ERROR Failed to reset GuC, ret = -110
[155534.394325] i915 0000:00:02.0: [drm] ERROR Failed to reset chip
[155534.394347] i915 0000:00:02.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by intel_gt_reset+0x258/0x2d0 [i915]
[155534.497281] [drm:uc_sanitize [i915]] ERROR Failed to reset GuC, ret = -110
[155534.499244] i915 0000:00:02.0: [drm] ffmpeg[102504] context reset due to GPU hang
[155534.520720] intel_gt_invalidate_tlbs: 36 callbacks suppressed
[155534.520734] i915 0000:00:02.0: [drm] ERROR rcs0 TLB invalidation did not complete in 4ms!
[155534.525130] i915 0000:00:02.0: [drm] ERROR bcs0 TLB invalidation did not complete in 4ms!
[155534.531383] i915 0000:00:02.0: [drm] ERROR rcs0 TLB invalidation did not complete in 4ms!
[155534.536543] i915 0000:00:02.0: [drm] ERROR bcs0 TLB invalidation did not complete in 4ms!
[155534.540749] i915 0000:00:02.0: [drm] ERROR rcs0 TLB invalidation did not complete in 4ms!
[155534.546000] i915 0000:00:02.0: [drm] ERROR bcs0 TLB invalidation did not complete in 4ms!
[155534.551252] i915 0000:00:02.0: [drm] ERROR rcs0 TLB invalidation did not complete in 4ms!
[155534.556511] i915 0000:00:02.0: [drm] ERROR bcs0 TLB invalidation did not complete in 4ms
Do you want to contribute a patch to fix the issue? (yes/no):
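For long-running setups where the hang only shows up after 10 to 30 minutes, a small counter over the saved dmesg log makes it easy to tell a recoverable hang from one where the reset itself failed. A sketch under the assumption that the message wording matches the logs in this issue; `gpu_hang_report` is a made-up name.

```shell
# Count GPU hang and failed-reset messages in a dmesg log read from stdin.
gpu_hang_report() {
    awk '
        /GPU HANG: ecode/       { hangs++ }
        /Failed to reset GuC/   { guc_fail++ }
        /Failed to reset chip/  { chip_fail++ }
        END {
            printf "hangs=%d guc_reset_failures=%d chip_reset_failures=%d\n",
                hangs, guc_fail, chip_fail
        }
    '
}

# Sample lines copied from the log above.
printf '%s\n' \
    '[155534.199385] i915 0000:00:02.0: [drm] GPU HANG: ecode 12:4:28fffffd, in ffmpeg [102504]' \
    '[155534.302952] [drm:uc_sanitize [i915]] ERROR Failed to reset GuC, ret = -110' \
    '[155534.394325] i915 0000:00:02.0: [drm] ERROR Failed to reset chip' \
    '[155534.497281] [drm:uc_sanitize [i915]] ERROR Failed to reset GuC, ret = -110' |
    gpu_hang_report

# On a saved log: gpu_hang_report < dmesg.log
```

A non-zero `chip_reset_failures` count corresponds to the unrecoverable state described in this issue, where only a reboot brings the GPU back.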