intel / media-driver

Intel Graphics Media Driver to support hardware decode, encode and video processing.
https://github.com/intel/media-driver/wiki

[Bug]: SIGSEGV mos_gpucontext_specific_next #1840

Closed: sizeofvoid closed this issue 2 months ago

sizeofvoid commented 3 months ago

Which component impacted?

Video Processing

Is it regression? Good in old configuration?

No, this issue has existed for a long time

What happened?

ffmpeg -hwaccel vaapi -vaapi_device /dev/dri/renderD128 -i AVC-H.264-TheaterSquare_3840x2160.mp4 -vf 'format=nv12,hwupload' -c:v h264_vaapi -f mpegts - |ffplay -

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000003a7b9a4ba40 in memcpy (dst0=0x3a86501beb0, src0=<optimized out>, length=328) at /usr/src/lib/libc/string/memcpy.c:103
103             TLOOP(*(word *)dst = *(word *)src; src += wsize; dst += wsize);
[Current thread is 1 (process 430087)]
(gdb) bt
#0  0x000003a7b9a4ba40 in memcpy (dst0=0x3a86501beb0, src0=<optimized out>, length=328) at /usr/src/lib/libc/string/memcpy.c:103
#1  0x000003a8322188b4 in GpuContextSpecificNext::GetOcaRTLogResource (this=0x3a86501b2b0, globalInst=0x0) at /usr/ports/pobj/intel-media-driver-24.2.5/media-driver-intel-media-24.2.5/media_softlet/linux/common/os/mos_gpucontext_specific_next.cpp:1979
#2  0x000003a832221a78 in MosInterface::GetRtLogResourceInfo (osInterface=0x3a7dd638000, osResource=@0x79920aa5ba58: 0x0, size=@0x79920aa5ba50: 0)
    at /usr/ports/pobj/intel-media-driver-24.2.5/media-driver-intel-media-24.2.5/media_softlet/linux/common/os/mos_interface.cpp:4085
#3  0x000003a83295edc4 in HalOcaInterfaceNext::AddRTLogReource (cmdBuffer=..., mosContext=0x3a83d3cd000, osInterface=...)
    at /usr/ports/pobj/intel-media-driver-24.2.5/media-driver-intel-media-24.2.5/media_softlet/linux/common/shared/hal_oca_interface_next.cpp:542
#4  0x000003a8324fdf6b in CodechalEncodeAvcEnc::BrcInitResetKernel (this=0x3a7ba17e000) at /usr/ports/pobj/intel-media-driver-24.2.5/media-driver-intel-media-24.2.5/media_driver/agnostic/common/codec/hal/codechal_encode_avc.cpp:3111
#5  0x000003a83262448c in CodechalEncodeAvcEncG12::ExecuteKernelFunctions (this=0x3a7ba17e000) at /usr/ports/pobj/intel-media-driver-24.2.5/media-driver-intel-media-24.2.5/media_driver/agnostic/gen12/codec/hal/codechal_encode_avc_g12.cpp:1805
#6  0x000003a8324e0a27 in CodechalEncoderState::ExecuteEnc (this=0x3a7ba17e000, encodeParams=0x3a7f3957bc8) at /usr/ports/pobj/intel-media-driver-24.2.5/media-driver-intel-media-24.2.5/media_driver/agnostic/common/codec/hal/codechal_encoder_base.cpp:5061
#7  0x000003a8324e00c9 in CodechalEncoderState::Execute (this=0x3a7ba17e000, params=0x3a7f3957bc8) at /usr/ports/pobj/intel-media-driver-24.2.5/media-driver-intel-media-24.2.5/media_driver/agnostic/common/codec/hal/codechal_encoder_base.cpp:685
#8  0x000003a83240142a in DdiEncodeAvc::EncodeInCodecHal (this=<optimized out>, numSlices=1) at /usr/ports/pobj/intel-media-driver-24.2.5/media-driver-intel-media-24.2.5/media_driver/linux/common/codec/ddi/media_ddi_encode_avc.cpp:1151
#9  0x000003a8323e9131 in DdiEncodeBase::EndPicture (this=0x3a86501beb0, ctx=<optimized out>, context=<optimized out>)
    at /usr/ports/pobj/intel-media-driver-24.2.5/media-driver-intel-media-24.2.5/media_driver/linux/common/codec/ddi/media_ddi_encode_base.cpp:77
#10 0x000003a8323ee3d0 in DdiEncode_EndPicture (ctx=0x3a854a1f200, context=536870912) at /usr/ports/pobj/intel-media-driver-24.2.5/media-driver-intel-media-24.2.5/media_driver/linux/common/codec/ddi/media_libva_encoder.cpp:641
#11 0x000003a8323d0bc5 in DdiMedia_EndPicture (ctx=0x3a854a1f200, context=536870912) at /usr/ports/pobj/intel-media-driver-24.2.5/media-driver-intel-media-24.2.5/media_driver/linux/common/ddi/media_libva.cpp:4041
#12 0x000003a797e1f07e in vaEndPicture () from /usr/X11R6/lib/libva.so.2.22
#13 0x000003a833756782 in ff_vaapi_encode_receive_packet () from /usr/local/lib/libavcodec.so.25.0
#14 0x000003a833247732 in encode_receive_packet_internal () from /usr/local/lib/libavcodec.so.25.0
#15 0x000003a833247627 in avcodec_send_frame () from /usr/local/lib/libavcodec.so.25.0
#16 0x000003a58939d32a in do_video_out ()
#17 0x000003a58939c0ec in reap_filters ()
#18 0x000003a58939594a in transcode ()
#19 0x000003a5893918c2 in main ()

Simple workaround (adds a null check on globalInst):

Index: media_softlet/linux/common/os/mos_gpucontext_specific_next.cpp
--- media_softlet/linux/common/os/mos_gpucontext_specific_next.cpp.orig
+++ media_softlet/linux/common/os/mos_gpucontext_specific_next.cpp
@@ -1974,7 +1974,7 @@ PMOS_RESOURCE GpuContextSpecificNext::GetOcaRTLogResou
     // than 2 video processors, the value may be overwritten and wrong allocation Index in array may be used.
     // To avoid this, use duplicate MOS_RESOURCE instance in GPU Context to ensure differnt iAllocationIndex
     // array of OcaRTLogResources being used for different GPU Context.
-    if (!m_ocaRtLogResInited)
+    if (!m_ocaRtLogResInited && globalInst)
     {
         m_ocaRtLogResource = *globalInst;
         m_ocaRtLogResInited = true;
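
To illustrate why the check helps, here is a minimal, self-contained sketch of the same pattern. It is not driver code: `Resource`, `GpuContext`, `GetLogResource` and `GuardDemo` are hypothetical stand-ins for `MOS_RESOURCE` and `GpuContextSpecificNext::GetOcaRTLogResource`. The struct assignment from `*globalInst` is compiled to a memory copy, so a null `globalInst` (as in frame #1 of the backtrace) faults inside `memcpy`. Returning `nullptr` when no global instance exists is a choice made for this example only; the patch above just skips the copy.

```cpp
#include <cassert>

// Hypothetical stand-in for MOS_RESOURCE (the real struct is larger;
// the backtrace shows a 328-byte copy).
struct Resource {
    int  allocationIndex[4];
    bool valid;
};

// Hypothetical stand-in for GpuContextSpecificNext.
class GpuContext {
public:
    // Returns this context's private copy of the global resource,
    // or nullptr if no copy exists and no global instance was supplied.
    Resource *GetLogResource(const Resource *globalInst) {
        if (!m_inited) {
            if (globalInst == nullptr) {
                // Without this guard, the assignment below would read
                // from address 0 and crash inside the struct copy.
                return nullptr;
            }
            m_resource = *globalInst;  // struct copy (memcpy under the hood)
            m_inited   = true;
        }
        return &m_resource;
    }

private:
    Resource m_resource{};
    bool     m_inited = false;
};

// Exercises the guard: null input first, then a valid global instance.
bool GuardDemo() {
    GpuContext ctx;
    if (ctx.GetLogResource(nullptr) != nullptr) return false;  // no crash, no copy

    Resource global{{1, 2, 3, 4}, true};
    Resource *copy = ctx.GetLogResource(&global);
    if (copy == nullptr || copy == &global) return false;      // must be a duplicate
    if (copy->allocationIndex[2] != 3) return false;

    // Once initialized, a later null argument still returns the cached copy.
    return ctx.GetLogResource(nullptr) == copy;
}
```

Calling `GuardDemo()` returns `true` when the guard behaves as described. Callers of such a function would still need to handle the `nullptr` result.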

What's the usage scenario when you are seeing the problem?

Playback

What impacted?

No response

Debug Information

dmesg.log vainfo.log

ls -l /dev/dri 
crw-------  1 rsadowski  rsadowski  87,   0 Aug  1 21:07 card0
crw-------  1 root       wheel      87,   1 Aug  1 21:07 card1
crw-------  1 root       wheel      87,   2 Aug  1 21:07 card2
crw-------  1 root       wheel      87,   3 Aug  1 21:07 card3
crw-------  1 rsadowski  rsadowski  87, 128 Aug  1 21:07 renderD128
crw-------  1 root       wheel      87, 129 Aug  1 21:07 renderD129
crw-------  1 root       wheel      87, 130 Aug  1 21:07 renderD130
crw-------  1 root       wheel      87, 131 Aug  1 21:07 renderD131

Do you want to contribute a patch to fix the issue?

Yes, I'm glad to submit a patch to fix it

intel-mediadev commented 3 months ago

Auto Created VSMGWL-75563 for further analysis.

VincentCheungKokomo commented 3 months ago

@sizeofvoid Hi, I'm trying to reproduce this on my GTA environment. Could you provide the mp4 file (AVC-H.264-TheaterSquare_3840x2160.mp4)?

sizeofvoid commented 3 months ago

I tested it with the videos from https://www.elecard.com/videos: https://www.elecard.com/storage/video/TheaterSquare_3840x2160.mp4

VincentCheungKokomo commented 3 months ago

Hi @sizeofvoid, I can't reproduce the segmentation fault; it seems to run well on my GTA environment. May I know which driver version, ffmpeg version, and hardware platform you were using when you hit the segmentation fault?

[Attached image: TheaterSquare]

VincentCheungKokomo commented 3 months ago

PR merged to comp/media : https://github.com/intel-innersource/drivers.gpu.unified/pull/177394

Effieyu0 commented 2 months ago

The code is merged in commit 4461a2703eae517f678ed8d8270ff14f356b9b19. Could you please help verify it?