Closed jackie74 closed 3 months ago
gmmlib info
Name         : intel-gmmlib
Version      : 22.3.7
Release      : i682.el8_8
Architecture : x86_64
Size         : 658 k
Source       : intel-gmmlib-22.3.7-i682.el8_8.src.rpm
Repository   : @System
From repo    : intel-graphics-8.8-unified
Summary      : Intel Graphics Memory Management Library
URL          : https://github.com/intel/gmmlib
License      : MIT and BSD
Description  : The Intel Graphics Memory Management Library provides device specific
             : and buffer management for the Intel Graphics Compute Runtime for OpenCL
             : and the Intel Media Driver for VAAPI.
vainfo log
libva info: VA-API version 1.20.0
libva info: Trying to open /usr/lib64/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_1_19
libva info: va_openDriver() returns 0
Trying display: wayland
Trying display: x11
Trying display: drm
vainfo: VA-API version: 1.20 (libva 2.20.0)
vainfo: Driver version: Intel iHD driver for Intel(R) Gen Graphics - 23.2.4 ()
vainfo: Supported profile and entrypoints
VAProfileNone : VAEntrypointVideoProc
VAProfileNone : VAEntrypointStats
VAProfileMPEG2Simple : VAEntrypointVLD
VAProfileMPEG2Main : VAEntrypointVLD
VAProfileH264Main : VAEntrypointVLD
VAProfileH264Main : VAEntrypointEncSliceLP
VAProfileH264High : VAEntrypointVLD
VAProfileH264High : VAEntrypointEncSliceLP
VAProfileJPEGBaseline : VAEntrypointVLD
VAProfileJPEGBaseline : VAEntrypointEncPicture
VAProfileH264ConstrainedBaseline: VAEntrypointVLD
VAProfileH264ConstrainedBaseline: VAEntrypointEncSliceLP
VAProfileHEVCMain : VAEntrypointVLD
VAProfileHEVCMain : VAEntrypointEncSliceLP
VAProfileHEVCMain10 : VAEntrypointVLD
VAProfileHEVCMain10 : VAEntrypointEncSliceLP
VAProfileVP9Profile0 : VAEntrypointVLD
VAProfileVP9Profile0 : VAEntrypointEncSliceLP
VAProfileVP9Profile1 : VAEntrypointVLD
VAProfileVP9Profile1 : VAEntrypointEncSliceLP
VAProfileVP9Profile2 : VAEntrypointVLD
VAProfileVP9Profile2 : VAEntrypointEncSliceLP
VAProfileVP9Profile3 : VAEntrypointVLD
VAProfileVP9Profile3 : VAEntrypointEncSliceLP
VAProfileHEVCMain12 : VAEntrypointVLD
VAProfileHEVCMain422_10 : VAEntrypointVLD
VAProfileHEVCMain422_10 : VAEntrypointEncSliceLP
VAProfileHEVCMain422_12 : VAEntrypointVLD
VAProfileHEVCMain444 : VAEntrypointVLD
VAProfileHEVCMain444 : VAEntrypointEncSliceLP
VAProfileHEVCMain444_10 : VAEntrypointVLD
VAProfileHEVCMain444_10 : VAEntrypointEncSliceLP
VAProfileHEVCMain444_12 : VAEntrypointVLD
VAProfileHEVCSccMain : VAEntrypointVLD
VAProfileHEVCSccMain : VAEntrypointEncSliceLP
VAProfileHEVCSccMain10 : VAEntrypointVLD
VAProfileHEVCSccMain10 : VAEntrypointEncSliceLP
VAProfileHEVCSccMain444 : VAEntrypointVLD
VAProfileHEVCSccMain444 : VAEntrypointEncSliceLP
VAProfileAV1Profile0 : VAEntrypointVLD
VAProfileAV1Profile0 : VAEntrypointEncSliceLP
VAProfileHEVCSccMain444_10 : VAEntrypointVLD
VAProfileHEVCSccMain444_10 : VAEntrypointEncSliceLP
libva trace: libva_trace.tar.gz
Could you confirm that all other environment settings are the same, including the media-driver, libva, and kernel versions?
A live test was also run with msdkh264dec and with vah264dec:
- gst-launch-1.0 srtsrc uri="srt://127.0.0.1:26001?mode=listener" ! tsdemux ! h264parse ! msdkh264dec ! fakesink
- gst-launch-1.0 srtsrc uri="srt://127.0.0.1:26001?mode=listener" ! tsdemux ! h264parse ! vah264dec ! fakesink
The memory increase appears only when msdkh264dec is used.
Desktop PC vs Flex: what is the PC config, iGPU and dGPU? Could you try with the same kernel, media-driver, libva, and gmmlib versions to isolate which parameter causes the difference?
Do you see the memory increase from the first minute, or only a couple of minutes later? We have memleak cases running in CI, but no issue has been reported so far. I am not sure whether this increase only happens after a long run.
Hi Sherry. Yes, NAVER is using the same configuration on both systems except for the OS and kernel version.
Desktop
OS : Ubuntu 22.04.4 LTS
kernel : 6.5.0-28-generic
Flex
OS : Red Hat Enterprise Linux release 8.8 (Ootpa)
kernel : 4.18.0-477.27.1.el8_8.x86_64
Hi Sherry. NAVER observed a memory increase of about 1 GiB after a long run of about 90 minutes. Please compare memory usage before and after a run of that length.
NAVER also reports that memory usage returns to normal once the encoding program exits. Thank you.
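One way to make the before/after comparison concrete is to sample the resident set size of the pipeline process at a fixed interval and fit a slope to it. The following is a minimal sketch, assuming a Linux host where `/proc/<pid>/status` exposes `VmRSS`; the function names, the 60-second interval, and the 90-minute duration are illustrative choices, not something taken from NAVER's setup.

```python
import re
import time


def parse_rss_kib(status_text):
    """Extract the VmRSS value (in KiB) from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("VmRSS not found (process exited or is a kernel thread?)")
    return int(match.group(1))


def read_rss_kib(pid):
    """Read the current resident set size of a process in KiB."""
    with open(f"/proc/{pid}/status") as status:
        return parse_rss_kib(status.read())


def growth_kib_per_min(samples):
    """Least-squares slope of (elapsed_seconds, rss_kib) samples, in KiB/minute."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_r = sum(r for _, r in samples) / n
    num = sum((t - mean_t) * (r - mean_r) for t, r in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    return 60.0 * num / den


def monitor(pid, interval_s=60, duration_s=90 * 60):
    """Sample RSS every interval_s seconds for duration_s seconds (assumed values)."""
    samples = []
    start = time.monotonic()
    while True:
        elapsed = time.monotonic() - start
        if elapsed > duration_s:
            break
        samples.append((elapsed, read_rss_kib(pid)))
        time.sleep(interval_s)
    return samples


# Example use (hypothetical PID of the gst-launch-1.0 process):
#   samples = monitor(12345)
#   print(f"{growth_kib_per_min(samples):.1f} KiB/min")
```

A slope that stays clearly positive over the whole run, rather than flattening after start-up, is the pattern described in this report. RSS only covers the process's own resident memory, so it may not capture every allocation made on its behalf by the driver stack.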
Ubuntu vs Redhat: some versions are still not clear. Could you update them as below? @jackie74
Ubuntu: Good
Redhat: Bad
@xhaihao do you have gst-msdk memleak cases running in the release cycle? From the current report, gst-vaapi looks good with no memleak issue, but gst-msdk shows a memleak after 90 minutes of execution.
@Sherry-Lin No, we don't.
@jackie74 Could you try a new version of vpl-gpu-rt (https://github.com/intel/vpl-gpu-rt) ? Someone had a similar issue when using FFmpeg QSV for video transcoding, the memory leak disappeared after upgrading vpl-gpu-rt.
Hi Haihao. NAVER is using 23.2.4, which was released last August, so I asked NAVER to move to the latest version (24.1.5). I also upgraded to 24.1.5 and am monitoring it on my system. Thank you.
Hi All. NAVER upgraded vpl-gpu-rt to 24.1.5 and monitored memory usage for 2 hours; memory usage did not increase. So they will upgrade vpl-gpu-rt. The latest version of vpl-gpu-rt is 24.2.5. Which version do you recommend NAVER upgrade to? https://github.com/intel/vpl-gpu-rt/tags Thank you.
24.1.5 and 24.2.5 are our Q1 and Q2 release builds. Q2 mainly introduced some AV1/VVC related enhancements. If NAVER does not use these codecs, it is fine to stay on 24.1.5.
Hi Sherry. Thanks for the explanation; I will close this issue. Thanks everyone.
Which component impacted?
Decode, Encode
Is it regression? Good in old configuration?
None
What happened?
NAVER is using a total of 400 Flex GPUs for its live streaming service CHZZK (https://chzzk.naver.com/) and has found a memory-related issue.
While transcoding with msdkh264dec and msdkh264enc, a tool such as top showed memory usage increasing steadily. After about 90 minutes roughly 1 GiB had accumulated, and the number kept increasing. The input stream was prepared as below:
ffmpeg -stream_loop -1 -re -i ~/data/obama/obama2.flv -c copy -f mpegts srt://127.0.0.1:26001
A live test was also run with msdkh264dec and with vah264dec; the memory increase appears only with msdkh264dec.
NAVER also tested on a desktop PC with the same configuration.
The issue reproduces with every video sample, regardless of content, and with both the h264 and h265 codecs.
With multi-transcoding, the issue shows up more quickly, as below:
gst-launch-1.0 srtsrc uri="srt://127.0.0.1:26001?mode=listener" ! tsdemux ! h264parse ! msdkh264dec ! tee name=tdec \
tdec. ! msdkvpp ! 'video/x-raw,width=1920,height=1080' ! msdkh264enc bitrate=8000 ! h264parse ! mpegtsmux ! fakesink \
tdec. ! msdkvpp ! 'video/x-raw,width=1280,height=720' ! msdkh264enc bitrate=2000 ! h264parse ! mpegtsmux ! fakesink \
tdec. ! msdkvpp ! 'video/x-raw,width=854,height=480' ! msdkh264enc bitrate=600 ! h264parse ! mpegtsmux ! fakesink \
tdec. ! msdkvpp ! 'video/x-raw,width=480,height=272' ! msdkh264enc bitrate=300 ! h264parse ! mpegtsmux ! fakesink \
tdec. ! msdkvpp ! 'video/x-raw,width=256,height=144' ! msdkh264enc bitrate=100 ! h264parse ! mpegtsmux ! fakesink
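For repeated reproduction runs it can help to generate the multi-rendition command from a table of output sizes and bitrates instead of editing the long line by hand. The sketch below rebuilds the same element chain as the command above; the helper name `build_pipeline` and the `RENDITIONS` table layout are illustrative, and the element names and properties are copied from the report rather than checked against a particular GStreamer build.

```python
import shlex

# (width, height, bitrate in kbit/s) for each output, taken from the report.
RENDITIONS = [
    (1920, 1080, 8000),
    (1280, 720, 2000),
    (854, 480, 600),
    (480, 272, 300),
    (256, 144, 100),
]


def build_pipeline(uri, renditions):
    """Return the gst-launch-1.0 argv for one decode fanned out to N encodes."""
    argv = [
        "gst-launch-1.0", "srtsrc", f"uri={uri}", "!", "tsdemux", "!",
        "h264parse", "!", "msdkh264dec", "!", "tee", "name=tdec",
    ]
    for width, height, bitrate in renditions:
        argv += [
            "tdec.", "!", "msdkvpp", "!",
            f"video/x-raw,width={width},height={height}", "!",
            "msdkh264enc", f"bitrate={bitrate}", "!",
            "h264parse", "!", "mpegtsmux", "!", "fakesink",
        ]
    return argv


argv = build_pipeline("srt://127.0.0.1:26001?mode=listener", RENDITIONS)
# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(argv))
```

The list form can be passed directly to `subprocess.Popen(argv)`, which also yields the PID needed for memory monitoring. Trimming `RENDITIONS` to a single entry gives a quick way to check whether the growth rate scales with the number of encode branches.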
Below are the system information and GStreamer initialization logs for the desktop PC and for the server with Flex.
desktop pc
os / kernel
OS : Ubuntu 22.04.4 LTS
kernel : 6.5.0-28-generic
gstreamer initialize log
msdk msdk.c:294:msdk_init_msdk_session: Use the Intel(R) Media SDK to create MFX session
msdk msdk.c:380:msdk_open_session: MFX implementation: 0x0402 (HARDWARE)
msdk msdk.c:381:msdk_open_session: MFX version: 1.255
msdkcontext gstmsdkcontext.c:143:get_device_path: Opened the drm device node /dev/dri/renderD128
vadisplay gstvadisplay_drm.c:156:gst_va_display_drm_create_va_display: DRM render node with kernel driver i915
vadisplay gstvadisplay.c:268:_va_info: VA info: VA-API version 1.20.0
vadisplay gstvadisplay.c:268:_va_info: VA info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
vadisplay gstvadisplay.c:268:_va_info: VA info: Found init function __vaDriverInit_1_20
vadisplay gstvadisplay.c:268:_va_info: VA info: va_openDriver() returns 0
vadisplay gstvadisplay.c:320:gst_va_display_initialize: VA-API version 1.20
vadisplay gstvadisplay.c:102:_gst_va_display_filter_driver: VA-API driver vendor: Intel iHD driver for Intel(R) Gen Graphics - 23.4.3 ()
msdkcontext gstmsdkcontext.c:343:gst_msdk_context_open: Detected MFX platform with device code 43
flex
os / kernel
OS : Red Hat Enterprise Linux release 8.8 (Ootpa)
kernel : 4.18.0-477.27.1.el8_8.x86_64
gstreamer initialize log
msdk msdk.c:294:msdk_init_msdk_session: Use the Intel(R) Media SDK to create MFX session
msdk msdk.c:380:msdk_open_session: MFX implementation: 0x0402 (HARDWARE)
msdk msdk.c:381:msdk_open_session: MFX version: 1.255
msdkcontext gstmsdkcontext.c:100:get_device_path: Opened the specified drm device /dev/dri/renderD131
vadisplay gstvadisplay_drm.c:156:gst_va_display_drm_create_va_display: DRM render node with kernel driver i915
vadisplay gstvadisplay.c:268:_va_info: VA info: VA-API version 1.20.0
vadisplay gstvadisplay.c:268:_va_info: VA info: Trying to open /usr/lib64/dri/iHD_drv_video.so
vadisplay gstvadisplay.c:268:_va_info: VA info: Found init function __vaDriverInit_1_19
vadisplay gstvadisplay.c:268:_va_info: VA info: va_openDriver() returns 0
vadisplay gstvadisplay.c:320:gst_va_display_initialize: VA-API version 1.20
vadisplay gstvadisplay.c:102:_gst_va_display_filter_driver: VA-API driver vendor: Intel iHD driver for Intel(R) Gen Graphics - 23.2.4 ()
msdkcontext gstmsdkcontext.c:343:gst_msdk_context_open: Detected MFX platform with device code 46
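The two initialization logs are nearly identical, so the fields that differ are easy to miss by eye. A small sketch for pulling out the comparable fields and listing the differences follows; the field names in `FIELDS` and both helper functions are made up for this example, and the patterns only cover message formats that appear in the logs quoted in this report.

```python
import re

# Colour-escape residue, either a real ESC byte or the caret form "^[".
ANSI_RESIDUE = re.compile(r"(?:\x1b|\^\[)\[[0-9;]*m")

# Hypothetical field names mapped to patterns seen in the logs above.
FIELDS = {
    "va_api": r"VA info: VA-API version (\S+)",
    "init_function": r"Found init function (\S+)",
    "driver": r"VA-API driver vendor: (.+)$",
    "device_code": r"device code (\d+)",
    "render_node": r"(/dev/dri/renderD\d+)",
}


def extract(log_text):
    """Return the fields found in one GStreamer initialization log."""
    clean = ANSI_RESIDUE.sub("", log_text)
    found = {}
    for name, pattern in FIELDS.items():
        match = re.search(pattern, clean, re.MULTILINE)
        if match:
            found[name] = match.group(1).strip()
    return found


def differences(a, b):
    """Map each field whose value differs to its (a, b) pair of values."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Usage, assuming each log was saved to a file (file names are hypothetical):
#   desktop = extract(open("desktop_init.log").read())
#   flex = extract(open("flex_init.log").read())
#   for field, (d, f) in differences(desktop, flex).items():
#       print(f"{field}: desktop={d} flex={f}")
```

Applied to the logs above, the differing fields would be the init function (`__vaDriverInit_1_20` vs `__vaDriverInit_1_19`), the driver version (23.4.3 vs 23.2.4), the device code (43 vs 46), and the render node.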
What's the usage scenario when you are seeing the problem?
Transcode for media delivery
What impacted?
NAVER is using a total of 400 Flex GPUs for its live streaming service CHZZK (https://chzzk.naver.com/). They found that memory usage keeps increasing, especially when a user streams video for a very long time, about 1 to 2 days. As a result, system memory sometimes fills up completely, which causes system and service failures.
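To put the reported rate in context for 1 to 2 day streams, here is a rough back-of-the-envelope sketch. It assumes the growth stays linear at the observed 1 GiB per 90 minutes for a single transcoding process, which the report does not confirm; the 64 GiB figure in the usage comment is a hypothetical amount of free memory, not NAVER's actual configuration.

```python
# Observed in the report: about 1 GiB accumulated after about 90 minutes.
LEAK_GIB = 1.0
LEAK_WINDOW_MIN = 90.0


def leaked_gib(hours):
    """Memory accumulated after `hours`, assuming linear growth (an assumption)."""
    return LEAK_GIB * (hours * 60.0) / LEAK_WINDOW_MIN


def hours_until_full(free_gib):
    """Hours until `free_gib` of free memory is consumed at the assumed rate."""
    return free_gib * LEAK_WINDOW_MIN / 60.0 / LEAK_GIB


for hours in (24, 48):
    print(f"{hours} h stream: about {leaked_gib(hours):.0f} GiB accumulated")

# Hypothetical host with 64 GiB free for this one process:
#   hours_until_full(64)
```

Under that assumption a single stream running for one to two days would accumulate on the order of 16 to 32 GiB, which is consistent with the reported exhaustion of system memory on long-running streams.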
Debug Information
No response
Do you want to contribute a patch to fix the issue?
None