open-webrtc-toolkit / owt-server

General server (streaming/conference/transcoding/analytics) for OWT. (A.k.a. MediaServer)
https://01.org/open-webrtc-toolkit
Apache License 2.0

Issue with GPU Accelerated Server #556

Open Alex-YddeR opened 4 years ago

Alex-YddeR commented 4 years ago

We have around 25 users logged in, using the mixed stream at standard resolution, with the output resolution set to 1280x720 at 1024 Kbps. Environment: Intel MCU 4.2.1, Ubuntu 18.04, GPU accelerated.

In intel_gpu_top I can see the render busy status at 75 percent; it then suddenly drops to 5 percent. During that time all videos freeze, and they come back after some time.

What could be the issue? We have enabled the 25-input limit in views.

Thanks

Alex-YddeR commented 4 years ago

Does SFU also take the GPU process?

Thanks

Alex-YddeR commented 4 years ago

Error in video agent, logs below:

2020-06-27 21:33:32,062 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x25962e0)Composite failed, ret -15
2020-06-27 21:33:32,102 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x25962e0)Input -2(5): mfx vpp error, ret -16
2020-06-27 21:33:32,102 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x25962e0)Composite failed, ret -16
2020-06-27 21:33:32,142 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x25962e0)Input -4(5): mfx vpp error, ret -15
2020-06-27 21:33:32,142 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x25962e0)Composite failed, ret -15
2020-06-27 21:33:32,182 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x25962e0)Input -2(5): mfx vpp error, ret -16
2020-06-27 21:33:32,182 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x25962e0)Composite failed, ret -16

Thanks
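
The log above alternates between two return codes. A minimal sketch for tallying them from a video-agent log follows. It assumes the values are Media SDK `mfxStatus` codes as defined in `mfxdefs.h` (where -15 should be `MFX_ERR_INVALID_VIDEO_PARAM` and -16 `MFX_ERR_UNDEFINED_BEHAVIOR`; please verify against the header shipped with your Media SDK). The function name `tallyVppErrors` is hypothetical and not part of OWT.

```javascript
// Assumed mapping of mfxStatus values to names (from Media SDK's mfxdefs.h;
// check the header of your installed version before relying on it).
const MFX_STATUS_NAMES = {
  '-15': 'MFX_ERR_INVALID_VIDEO_PARAM',
  '-16': 'MFX_ERR_UNDEFINED_BEHAVIOR',
};

// Count "Composite failed, ret N" entries per return code.
// Works on both line-separated and flattened log text.
function tallyVppErrors(logText) {
  const counts = {};
  for (const match of logText.matchAll(/Composite failed, ret (-?\d+)/g)) {
    const code = match[1];
    const name = MFX_STATUS_NAMES[code] || 'unknown';
    const key = `${code} (${name})`;
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}

// Two entries copied from the log above.
const sample =
  '2020-06-27 21:33:32,062 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x25962e0)Composite failed, ret -15 ' +
  '2020-06-27 21:33:32,102 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x25962e0)Composite failed, ret -16';
console.log(tallyVppErrors(sample));
```

A tally like this only shows which codes dominate; it does not by itself explain why the VPP call fails.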

daijh commented 4 years ago

> Does SFU also take the GPU process?
>
> Thanks

SFU does not take GPU process.

daijh commented 4 years ago

Please help to provide more information:

  1. Does the error happen always, or only occasionally?
  2. What is the hardware? Please share the output of "lscpu".
  3. What is the publish codec, VP8 or H264?

Alex-YddeR commented 4 years ago

  1. It always happens. Also, when we disable the local video, the blue avatar image zooms in completely, and if an SFU user disables his local video, video stops/pauses for all the users who are on the mixed video.

  2. lscpu output:

     Architecture: x86_64
     CPU op-mode(s): 32-bit, 64-bit
     Byte Order: Little Endian
     CPU(s): 8
     On-line CPU(s) list: 0-7
     Thread(s) per core: 2
     Core(s) per socket: 4
     Socket(s): 1
     NUMA node(s): 1
     Vendor ID: GenuineIntel
     CPU family: 6
     Model: 94
     Model name: Intel(R) Xeon(R) CPU E3-1585L v5 @ 3.00GHz
     Stepping: 3
     CPU MHz: 899.897
     CPU max MHz: 3700.0000
     CPU min MHz: 800.0000
     BogoMIPS: 6000.00
     Virtualization: VT-x
     L1d cache: 32K
     L1i cache: 32K
     L2 cache: 256K
     L3 cache: 8192K
     NUMA node0 CPU(s): 0-7

  3. VP8

Thanks

daijh commented 4 years ago

Also what is the MediaSDK revision? We will try to reproduce.

Alex-YddeR commented 4 years ago

Media SDK Version: 18.14.0
OS: Ubuntu 18.04
VA Info:

libva info: VA-API version 1.4.0
libva info: va_getDriverName() returns 0
libva info: User requested driver 'iHD'
libva info: Trying to open /opt/intel/mediasdk/lib64/iHD_drv_video.so
libva info: Found init function __vaDriverInit_1_4
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.4 (libva 2.1.0)
vainfo: Driver version: Intel iHD driver - 18.4.git_e61f5c_2019-01-29
vainfo: Supported profile and entrypoints
VAProfileNone : VAEntrypointVideoProc
VAProfileNone : VAEntrypointStats
VAProfileMPEG2Simple : VAEntrypointVLD
VAProfileMPEG2Simple : VAEntrypointEncSlice
VAProfileMPEG2Main : VAEntrypointVLD
VAProfileMPEG2Main : VAEntrypointEncSlice
VAProfileH264Main : VAEntrypointVLD
VAProfileH264Main : VAEntrypointEncSlice
VAProfileH264Main : VAEntrypointFEI
VAProfileH264Main : VAEntrypointEncSliceLP
VAProfileH264High : VAEntrypointVLD
VAProfileH264High : VAEntrypointEncSlice
VAProfileH264High : VAEntrypointFEI
VAProfileH264High : VAEntrypointEncSliceLP
VAProfileVC1Simple : VAEntrypointVLD
VAProfileVC1Main : VAEntrypointVLD
VAProfileVC1Advanced : VAEntrypointVLD
VAProfileJPEGBaseline : VAEntrypointVLD
VAProfileJPEGBaseline : VAEntrypointEncPicture
VAProfileH264ConstrainedBaseline: VAEntrypointVLD
VAProfileH264ConstrainedBaseline: VAEntrypointEncSlice
VAProfileH264ConstrainedBaseline: VAEntrypointFEI
VAProfileH264ConstrainedBaseline: VAEntrypointEncSliceLP
VAProfileVP8Version0_3 : VAEntrypointVLD
VAProfileHEVCMain : VAEntrypointVLD
VAProfileHEVCMain : VAEntrypointEncSlice
VAProfileHEVCMain : VAEntrypointFEI

daijh commented 4 years ago

The hardware MCU does not implement acceleration for VP8/VP9; the recommended codec for the HW MCU is H264. Is there any special reason you prefer VP8? H264 will provide more power savings.

Alex-YddeR commented 4 years ago

OK, we have used only the VP8 codec so far. We will check this with H264 and update.

Thanks

Alex-YddeR commented 4 years ago

> SFU does not take GPU process.

Hi,

I can still see SFU taking the GPU process. Can you please let me know how to avoid the GPU process in SFU mode?

Thanks

Alex-YddeR commented 4 years ago

The problem still exists even with H264. When someone mutes the video, it is muted for everyone, with the logs below:

2020-07-08 15:34:08,009 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Input -1(6): mfx vpp error, ret -15
2020-07-08 15:34:08,009 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Composite failed, ret -15
2020-07-08 15:34:08,049 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Input -1(6): mfx vpp error, ret -15
2020-07-08 15:34:08,049 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Composite failed, ret -15
2020-07-08 15:34:08,089 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Input -0(6): mfx vpp error, ret -15
2020-07-08 15:34:08,089 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Composite failed, ret -15
2020-07-08 15:34:08,129 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Input -0(6): mfx vpp error, ret -15
2020-07-08 15:34:08,129 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Composite failed, ret -15
2020-07-08 15:34:08,169 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Input -0(6): mfx vpp error, ret -15
2020-07-08 15:34:08,169 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Composite failed, ret -15
2020-07-08 15:34:08,210 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Input -0(6): mfx vpp error, ret -15
2020-07-08 15:34:08,210 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Composite failed, ret -15
2020-07-08 15:34:08,250 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Input -0(6): mfx vpp error, ret -15
2020-07-08 15:34:08,250 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Composite failed, ret -15
2020-07-08 15:34:08,290 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Input -0(6): mfx vpp error, ret -15
2020-07-08 15:34:08,290 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Composite failed, ret -15
2020-07-08 15:34:08,330 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Input -0(6): mfx vpp error, ret -15
2020-07-08 15:34:08,330 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Composite failed, ret -15
2020-07-08 15:34:08,370 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Input -0(6): mfx vpp error, ret -15
2020-07-08 15:34:08,370 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Composite failed, ret -15
2020-07-08 15:34:08,410 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Input -0(6): mfx vpp error, ret -15
2020-07-08 15:34:08,410 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Composite failed, ret -15
2020-07-08 15:34:08,450 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Input -0(6): mfx vpp error, ret -15
2020-07-08 15:34:08,450 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Composite failed, ret -15
2020-07-08 15:34:08,490 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Input -0(6): mfx vpp error, ret -15
2020-07-08 15:34:08,490 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Composite failed, ret -15
2020-07-08 15:34:08,530 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Input -0(6): mfx vpp error, ret -15
2020-07-08 15:34:08,530 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Composite failed, ret -15
2020-07-08 15:34:08,570 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Input -0(6): mfx vpp error, ret -15
2020-07-08 15:34:08,570 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Composite failed, ret -15
2020-07-08 15:34:08,610 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Input -0(6): mfx vpp error, ret -15
2020-07-08 15:34:08,610 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Composite failed, ret -15
2020-07-08 15:34:08,650 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Input -0(6): mfx vpp error, ret -15
2020-07-08 15:34:08,650 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Composite failed, ret -15
2020-07-08 15:34:08,690 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Input -0(6): mfx vpp error, ret -15
2020-07-08 15:34:08,690 - ERROR: mcu.media.MsdkVideoCompositor.MsdkVpp - (0x41d65e0)Composite failed, ret -15

Alex-YddeR commented 4 years ago

Disabling Hardware acceleration on same MCU works fine.

cisco66666 commented 4 years ago

I started a 4.2.1 MCU on Ubuntu with hardware acceleration. I connected 25 users, each publishing an hd720p stream that is mixed into the mixed stream, subscribed to an hd720p mixed stream, and am watching the 25 users' streams in the mix. The test is still under observation; I expect to observe for 4 hours to see whether the MCU reports errors.

taste1981 commented 4 years ago

SFU does not take GPU process. For the server to work fully in SFU mode, you need to turn off mixing in the MCU console and, when subscribing to streams, not specify any codec/resolution/fps.
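
A rough sketch of the distinction described above: a subscription that pins no format versus one that does. It assumes the OWT JavaScript client's `subscribe(stream, options)` accepts a `video` constraints object with `codecs`, `resolution`, and `frameRate` fields; check the SDK reference for your version. The helper `requestsTranscoding` is hypothetical and only inspects a plain options object.

```javascript
// Fields that, when set on the video constraints, ask the server for a specific
// format (assumed field names; not taken from this thread).
const FORMAT_FIELDS = ['codecs', 'resolution', 'frameRate'];

// Returns true when the subscribe options pin a codec, resolution or frame rate.
function requestsTranscoding(options) {
  const video = options && options.video;
  if (!video || video === true) return false;
  return FORMAT_FIELDS.some((field) => video[field] !== undefined);
}

// Plain forwarding request: no format constraints at all.
const sfuOptions = {};

// Request that pins a resolution, so the server may have to process the video.
const pinnedOptions = {
  audio: true,
  video: { resolution: { width: 1280, height: 720 } },
};

console.log(requestsTranscoding(sfuOptions));    // false
console.log(requestsTranscoding(pinnedOptions)); // true
```

A check like this could run before each subscribe call to catch options that would pull the server out of pure forwarding.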

Alex-YddeR commented 4 years ago

Does the MCU support the accelerator card below for graphics acceleration? https://www.nimbix.net/alveo

cisco66666 commented 4 years ago

I started 4.2.1 and 4.3.1 MCUs on Ubuntu 18.04 with hardware acceleration. I connected 25 users, each publishing an hd720p stream that is mixed into the mixed stream, subscribed to an hd720p mixed stream, and observed the 25 users' streams in the mix for 4 hours to see whether the MCU reported errors.

CPU: Intel(R) Core(TM) i7-6770HQ CPU @ 2.60GHz
Kernel: 5.3.0-51-generic

Both the 4.2.1 and the 4.3.1 MCU behaved as expected.

Alex-YddeR commented 4 years ago

Try disabling and enabling video multiple times: the avatar gets zoomed in completely, and later all videos get stuck even when a single user disables his video.

Thanks

daijh commented 4 years ago

> Does the MCU support the accelerator card below for graphics acceleration? https://www.nimbix.net/alveo

It is not supported directly. The MCU is open to extension: a developer can add support for additional accelerator cards by implementing a new decoder/encoder/mixing module. Contributions that add support for additional accelerator cards are welcome.

daijh commented 4 years ago

The latest stable release is v4.3.1, and the GPU acceleration feature is fully tested with the configuration below.

Table 2-1. Server requirements

Application name    OS version
OWT server          CentOS* 7.6, Ubuntu 18.04 LTS

The GPU-acceleration can only be enabled on kernel 4.14 or later (4.19 or later is recommended).

If you are working on the following platforms with the integrated graphics, please install Intel® Media SDK. The current release is fully tested on MediaSDK 2018 Q4(https://github.com/Intel-Media-SDK/MediaSDK/releases/tag/intel-mediasdk-18.4.0).

  • Intel® Xeon® E3-1200 v4 Family with C226 chipset
  • Intel® Xeon® E3-1200 and E3-1500 v5 Family with C236 chipset
  • 5th Generation Intel® Core™
  • 6th Generation Intel® Core™
  • 7th Generation Intel® Core™

https://github.com/open-webrtc-toolkit/owt-server/blob/v4.3.1/doc/servermd/Server.md
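
The kernel thresholds quoted above (4.14 minimum, 4.19 recommended) can be checked mechanically. A minimal sketch follows; the function names are hypothetical, and the sample release strings are the two kernels mentioned in this thread.

```javascript
// Parse "major.minor" from a kernel release string such as "5.4.0-49-generic".
function parseKernelRelease(release) {
  const match = /^(\d+)\.(\d+)/.exec(release);
  if (!match) throw new Error(`Unrecognized kernel release: ${release}`);
  return { major: Number(match[1]), minor: Number(match[2]) };
}

// Compare against the thresholds quoted above: 4.14 minimum, 4.19 recommended.
function classifyKernel(release) {
  const { major, minor } = parseKernelRelease(release);
  const atLeast = (maj, min) => major > maj || (major === maj && minor >= min);
  if (!atLeast(4, 14)) return 'unsupported';
  return atLeast(4, 19) ? 'recommended' : 'supported';
}

console.log(classifyKernel('4.15.0-118-generic')); // supported
console.log(classifyKernel('5.4.0-49-generic'));   // recommended
```

On the server itself, the release string could come from `uname -r` or, in Node.js, from `require('os').release()`.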

We suggest syncing your environment to the latest recommended configuration described above.

Mixing with 25 participants was not covered by our previous test cases. In recent testing with 25 participants, no error occurred within 4 hours. We will try to reproduce this issue with longer testing and provide a fix once it is reproduced on v4.3.1.

Alex-YddeR commented 4 years ago

Ok, I will check this.

Did you check disabling and enabling video multiple times? The avatar gets zoomed in completely, and later all videos freeze even when a single user disables his video.

Thanks

Alex-YddeR commented 4 years ago

Error log for video-agent with hardware acceleration enabled, when muting and unmuting the publication video.

Scenario:
OS: Ubuntu 18.04 LTS/CentOS 7.8
Version: 4.3.x/Master
hardwareAccelerated = true

Join the room with multiple participants, then mute and unmute the publication video multiple times. The avatar gets zoomed in, and later, when any participant mutes the video, it is muted everywhere.

Got the errors below in the video agent:

2020-09-28 23:11:32,297[0x7f50c19d0700] ERROR mpositor.MsdkVpp - (0x393ea80)Input -1(2): mfx vpp error, ret -15
2020-09-28 23:11:32,297[0x7f50c19d0700] ERROR mpositor.MsdkVpp - (0x393ea80)Composite failed, ret -15
2020-09-28 23:11:32,362[0x7f50c19d0700] ERROR mpositor.MsdkVpp - (0x393ea80)Input -1(2): mfx vpp error, ret -16
2020-09-28 23:11:32,362[0x7f50c19d0700] ERROR mpositor.MsdkVpp - (0x393ea80)Composite failed, ret -16
2020-09-28 23:11:32,426[0x7f50c19d0700] ERROR mpositor.MsdkVpp - (0x393ea80)Input -1(2): mfx vpp error, ret -15
2020-09-28 23:11:32,426[0x7f50c19d0700] ERROR mpositor.MsdkVpp - (0x393ea80)Composite failed, ret -15
2020-09-28 23:11:32,491[0x7f50c19d0700] ERROR mpositor.MsdkVpp - (0x393ea80)Input -1(2): mfx vpp error, ret -16
2020-09-28 23:11:32,491[0x7f50c19d0700] ERROR mpositor.MsdkVpp - (0x393ea80)Composite failed, ret -16
2020-09-28 23:11:32,555[0x7f50c19d0700] ERROR mpositor.MsdkVpp - (0x393ea80)Input -1(2): mfx vpp error, ret -15
2020-09-28 23:11:32,555[0x7f50c19d0700] ERROR mpositor.MsdkVpp - (0x393ea80)Composite failed, ret -15
2020-09-28 23:11:32,619[0x7f50c19d0700] ERROR mpositor.MsdkVpp - (0x393ea80)Input -1(2): mfx vpp error, ret -16
2020-09-28 23:11:32,620[0x7f50c19d0700] ERROR mpositor.MsdkVpp - (0x393ea80)Composite failed, ret -16
2020-09-28 23:11:32,684[0x7f50c19d0700] ERROR mpositor.MsdkVpp - (0x393ea80)Input -1(2): mfx vpp error, ret -15
2020-09-28 23:11:32,684[0x7f50c19d0700] ERROR mpositor.MsdkVpp - (0x393ea80)Composite failed, ret -15
2020-09-28 23:11:32,748[0x7f50c19d0700] ERROR mpositor.MsdkVpp - (0x393ea80)Input -1(2): mfx vpp error, ret -16
2020-09-28 23:11:32,748[0x7f50c19d0700] ERROR mpositor.MsdkVpp - (0x393ea80)Composite failed, ret -16
2020-09-28 23:11:32,813[0x7f50c19d0700] ERROR mpositor.MsdkVpp - (0x393ea80)Input -1(2): mfx vpp error, ret -15
2020-09-28 23:11:32,813[0x7f50c19d0700] ERROR mpositor.MsdkVpp - (0x393ea80)Composite failed, ret -15
2020-09-28 23:11:32,877[0x7f50c19d0700] ERROR mpositor.MsdkVpp - (0x393ea80)Input -1(2): mfx vpp error, ret -16
2020-09-28 23:11:32,877[0x7f50c19d0700] ERROR mpositor.MsdkVpp - (0x393ea80)Composite failed, ret -16
2020-09-28 23:11:32,942[0x7f50c19d0700] ERROR mpositor.MsdkVpp - (0x393ea80)Input -1(2): mfx vpp error, ret -15
2020-09-28 23:11:32,942[0x7f50c19d0700] ERROR mpositor.MsdkVpp - (0x393ea80)Composite failed, ret -15
2020-09-28 23:11:33,006[0x7f50c19d0700] ERROR mpositor.MsdkVpp - (0x393ea80)Input -1(2): mfx vpp error, ret -16
2020-09-28 23:11:33,006[0x7f50c19d0700] ERROR mpositor.MsdkVpp - (0x393ea80)Composite failed, ret -16
2020-09-28 23:11:33,070[0x7f50c19d0700] ERROR mpositor.MsdkVpp - (0x393ea80)Input -1(2): mfx vpp error, ret -15
2020-09-28 23:11:33,071[0x7f50c19d0700] ERROR mpositor.MsdkVpp - (0x393ea80)Composite failed, ret -15
2020-09-28 23:11:33,135[0x7f50c19d0700] ERROR mpositor.MsdkVpp - (0x393ea80)Input -1(2): mfx vpp error, ret -16
2020-09-28 23:11:33,135[0x7f50c19d0700] ERROR mpositor.MsdkVpp - (0x393ea80)Composite failed, ret -16
2020-09-28 23:11:33,199[0x7f50c19d0700] ERROR mpositor.MsdkVpp - (0x393ea80)Input -1(2): mfx vpp error, ret -15
2020-09-28 23:11:33,199[0x7f50c19d0700] ERROR mpositor.MsdkVpp - (0x393ea80)Composite failed, ret -15

daijh commented 4 years ago

> Ok, I will check this.
>
> Did you check disabling and enabling video multiple times? The avatar gets zoomed in completely, and later all videos freeze even when a single user disables his video.
>
> Thanks

Yes, we verified it and saw no problem for 4 hours. We will extend the testing duration.

daijh commented 4 years ago

Please expect slow updates, because some team members are on vacation.

Alex-YddeR commented 4 years ago

Hi Team,

Any progress on this? @lzhai I believe you are able to reproduce the issue. Any update on the fix?

Thanks

daijh commented 4 years ago

@Alex-YddeR Yes, we reproduced this issue with 25 users. It seems that a video resolution change causes a hardware video frame not to be released properly. A fix is still under investigation.

daijh commented 4 years ago

@Alex-YddeR Please update MediaSDK from 18.4.0 to 20.3.0 (backward compatible) to mitigate this GPU issue. The stress test is still in progress, but so far the GPU issue has not occurred in 2 hours.

https://github.com/Intel-Media-SDK/MediaSDK/releases/tag/intel-mediasdk-20.3.0

  1. Delete /opt/intel/mediasdk
  2. Install intel-mediasdk-20.3.0
  3. Reboot
  4. Start v4.3.1 tag or 4.3.x branch mcu

libva info: VA-API version 1.9.0
libva info: User environment variable requested driver 'iHD'
libva info: Trying to open /opt/intel/mediasdk/lib64/iHD_drv_video.so
libva info: Found init function __vaDriverInit_1_9
libva info: va_openDriver() returns 0
2020-10-13 10:27:01,525 - INFO: owt.MsdkBase - VA-API Version: 1.4
2020-10-13 10:27:01,525 - WARN: owt.MsdkBase - VA-API Runtime Version: 1.9
2020-10-13 10:27:01,526 - INFO: owt.MsdkBase - MFX Version: 1028, major(1), minor(28)
2020-10-13 10:27:01,526 - WARN: owt.MsdkBase - MFX Runtime Version: 1034, major(1), minor(34)
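
The two MFX lines in the log above report different versions (1.28 and 1.34). A small sketch for pulling both out of an MsdkBase log and comparing them follows. Reading the plain "MFX Version" as the version the MCU was built against is an assumption, and both function names are hypothetical.

```javascript
// Extract the "MFX Version" and "MFX Runtime Version" pairs from MsdkBase log text.
// Labelling the first one "build" is an assumption about what the log means.
function parseMfxVersions(logText) {
  const pattern = /MFX (Runtime )?Version: \d+, major\((\d+)\), minor\((\d+)\)/g;
  const versions = {};
  for (const match of logText.matchAll(pattern)) {
    const key = match[1] ? 'runtime' : 'build';
    versions[key] = { major: Number(match[2]), minor: Number(match[3]) };
  }
  return versions;
}

// True when both versions are present and the runtime is at least as new.
function runtimeCoversBuild({ build, runtime }) {
  if (!build || !runtime) return false;
  return runtime.major > build.major ||
    (runtime.major === build.major && runtime.minor >= build.minor);
}

// The two MFX entries copied from the log above.
const msdkLog =
  '2020-10-13 10:27:01,526 - INFO: owt.MsdkBase - MFX Version: 1028, major(1), minor(28) ' +
  '2020-10-13 10:27:01,526 - WARN: owt.MsdkBase - MFX Runtime Version: 1034, major(1), minor(34)';
const mfx = parseMfxVersions(msdkLog);
console.log(mfx, runtimeCoversBuild(mfx));
```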

Alex-YddeR commented 4 years ago

@daijh Thanks for the update. We will upgrade and verify the issue.

Thanks

daijh commented 4 years ago

With the MediaSDK 20.3.0 upgrade, it passes our 24-hour test.

https://github.com/Intel-Media-SDK/MediaSDK/releases/tag/intel-mediasdk-20.3.0

Alex-YddeR commented 4 years ago

Hi daijh,

When we mute the publication video and stop the local video tracks in the stream, the server zooms out the disabled user's avatar and mutes all videos on the hardware-accelerated MCU; with the normal MCU this works well.

The use case here is that we want to stop sending video tracks to the MCU during video mute, to save server and client resources and bandwidth.

Please check by muting the publication and also stopping the local video tracks with multiple users logged in; you will then be able to reproduce the issue.

Thanks

Alex-YddeR commented 4 years ago

Hi @daijh,

Have you had time to test the above scenario with the GPU-accelerated MCU?

Thanks

daijh commented 4 years ago

@Alex-YddeR Does the MediaSDK 20.3.0 upgrade fix this issue in your test? If the issue still exists, we will try these new test steps.

daijh commented 4 years ago

The MediaSDK 20.3.0 upgrade is strongly recommended.

Alex-YddeR commented 4 years ago

@daijh

I have upgraded the MSDK; the info is below. The issue still exists. There is no issue when we just mute the publication. Please follow the steps I mentioned earlier.

libva info: VA-API version 1.9.0
libva info: User environment variable requested driver 'iHD'
libva info: Trying to open /opt/intel/mediasdk/lib64/iHD_drv_video.so
libva info: Found init function __vaDriverInit_1_9
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.9 (libva 2.1.0)
vainfo: Driver version: Intel iHD driver for Intel(R) Gen Graphics - 20.3.0 (86ec0b6f)
vainfo: Supported profile and entrypoints

We use the code below to mute the video tracks along with the publication mute:

[image]

Thanks

daijh commented 4 years ago

@Alex-YddeR Thanks for the confirmation. We will verify with your new test steps.

daijh commented 3 years ago

@Alex-YddeR With your latest steps, the GPU issue can be reproduced on MediaSDK 18.4.0 with the code below. After upgrading to MediaSDK 20.3.0, the issue no longer occurs (> 4 hours).

// mute and disable: mute the publication first, then disable the local track
publication.mute('video').then(() => {
  const track = mediaStream.getVideoTracks()[0];
  track.enabled = false;
});

// unmute and enable: re-enable the local track, then unmute the publication
const track = mediaStream.getVideoTracks()[0];
track.enabled = true;
publication.unmute('video');

The test environment:

Server
  • Intel(R) Xeon(R) CPU E3-1585 v5 @ 3.50GHz
  • Ubuntu 18.04.5 LTS
  • Linux version 5.4.0-49-generic
  • owt-server v4.3.1 release
  • MediaSDK 20.3.0

Client
  • Chrome Version 85.0.4183.102

Please share your whole logs folder after this issue happens.

# clean old logs
rm -rfv Release-v4.3.1/logs/*

# run the test
# share the Release-v4.3.1/logs after issue happens

Alex-YddeR commented 3 years ago

@daijh Thanks for the verification. I will test this and report any issues. Would there be any issue if we first unmute the publication and then enable the track? Maybe that could be the cause? In any case, I was using Linux 4.15.0-118-generic; I will update this too and check.

Thanks
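
The thread does not establish whether the order of the track and publication calls matters. A sketch that makes the order explicit and easy to swap follows. It assumes `publication.mute(kind)` and `publication.unmute(kind)` return promises, as the reproduction code suggests; `muteVideo`, `unmuteVideo`, and `makeStubs` are hypothetical names, and the stubs stand in for the real SDK objects.

```javascript
// Mute on the server first, then disable the local track
// (the order used in the reproduction code earlier in the thread).
async function muteVideo(publication, mediaStream) {
  await publication.mute('video');
  const [track] = mediaStream.getVideoTracks();
  if (track) track.enabled = false;
}

// Re-enable the local track first, then unmute on the server.
async function unmuteVideo(publication, mediaStream) {
  const [track] = mediaStream.getVideoTracks();
  if (track) track.enabled = true;
  await publication.unmute('video');
}

// Stand-ins for the real objects, so the call order can be inspected.
function makeStubs() {
  const calls = [];
  const track = {
    set enabled(value) { calls.push(`track.enabled=${value}`); },
  };
  const mediaStream = { getVideoTracks: () => [track] };
  const publication = {
    mute: async (kind) => { calls.push(`mute:${kind}`); },
    unmute: async (kind) => { calls.push(`unmute:${kind}`); },
  };
  return { calls, mediaStream, publication };
}

const stubs = makeStubs();
muteVideo(stubs.publication, stubs.mediaStream)
  .then(() => unmuteVideo(stubs.publication, stubs.mediaStream))
  .then(() => console.log(stubs.calls.join(' -> ')));
```

Swapping the two statements inside `unmuteVideo` gives the alternative order asked about above, so both variants can be run against the same room.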

Alex-YddeR commented 3 years ago

Problem still exists with MediaSDK 20.3.0

Thanks