intel / media-delivery

This collection of samples demonstrates best practices to achieve optimal video quality and performance on Intel GPUs for content delivery networks. Check out our demo, recommended command lines and quality and performance measuring tools.

ffmpeg-hls-client demo hitting 502 bad gateway on multiple streams #157

Closed. louisper-intel closed this issue 1 year ago.

louisper-intel commented 1 year ago

ffmpeg-hls-client hits 502 Bad Gateway when running more than one stream in the demo. This is the top-left client pane; sometimes 2 of 4 streams run, sometimes 1 of 4:

Every 1.0s: bash -c watch_pids 3712d5e110d9: Tue Mar 28 05:32:22 2023

ffmpeg streaming clients monitor

Output and logs path: /opt/data/artifacts/ffmpeg-hls-client
Total clients: 4
Running clients: 2
  vod/avc/WAR_TRAILER_HiQ_10_withAudio-1: size=21M, frames=1023, fps=24
  vod/avc/WAR_TRAILER_HiQ_10_withAudio-2: size=21M, frames=1023, fps=24
Completed clients: 2
  vod/avc/WAR_TRAILER_HiQ_10_withAudio-3: size=, frames=, fps=, status=1
  vod/avc/WAR_TRAILER_HiQ_10_withAudio-4: size=, frames=, fps=, status=1

CTRL^C to exit monitor and enter shell

The logs in /opt/data/artifacts/ffmpeg-hls-client show the following for the failed clients:

[http @ 0x5648f6f7e7c0] HTTP error 502 Bad Gateway
http://localhost:8080/vod/avc/WAR_TRAILER_HiQ_10_withAudio-3/index.m3u8: Server returned 5XX Server Error reply

I am running the multi-stream script via Docker as per the instructions.

dvrogozh commented 1 year ago

Could you please provide your docker run command line? Specifically, did you limit CPU or memory resources in any way? It would be easiest if you just posted the exact command line you used.

If you did not limit resources, I would check the following. The demo works on demand: the first client request for a stream's manifest makes nginx trigger transcoding and hold the request until the manifest shows up, so the client only gets a response once the stream actually exists.

So there are 2 wait operations involved. One is configured in nginx and is 12 seconds: https://github.com/intel/media-delivery/blob/ab68f633c71f0c1de6b3e93368e1de361cd5caa0/samples/cdn/nginx.conf#L66. The other is in the HTTP server's trigger script and is 10 seconds: https://github.com/intel/media-delivery/blob/ab68f633c71f0c1de6b3e93368e1de361cd5caa0/samples/cdn/nginx-trigger-streaming.sh#L38. A simplified sketch of that second wait is below.
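To illustrate the interaction, here is a minimal, hypothetical sketch of such a wait loop. The path, variable names, and values are example assumptions, not the actual code from nginx-trigger-streaming.sh:

#!/bin/bash
# Hypothetical sketch: wait up to TIMEOUT seconds for the HLS manifest that
# the freshly triggered transcode is supposed to produce. The path and the
# timeout below are example values, not the ones used by the real script.
MANIFEST=${1:-/opt/data/hls/vod/avc/stream/index.m3u8}
TIMEOUT=10

elapsed=0
while [ ! -f "$MANIFEST" ] && [ "$elapsed" -lt "$TIMEOUT" ]; do
  sleep 1
  elapsed=$((elapsed + 1))
done

# If the manifest never appeared, this side fails; nginx then (or once its
# own 12-second wait expires) answers the ffmpeg client with 502 Bad Gateway.
[ -f "$MANIFEST" ] || exit 1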

Since you note the behavior is random, my first guess is that something goes wrong around these 2 timeouts. Maybe your system is busy and some operations take longer, or maybe we have run into some other effect. Either way, try spreading these 2 timeouts further apart: for example, increase the nginx timeout to, say, 30 seconds and the HTTP one to 15 seconds (a rough outline of the change is sketched below). Is the issue you observe less frequent then? Mind that this can increase the latency before streams first become available, so give it time.
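As a rough outline of how one might apply that change: the file paths come from the links above, but the rebuild step is only a placeholder, since it depends on how your intel-media-delivery image was built.

# Rough outline; edit the exact lines the links above point to.
git clone https://github.com/intel/media-delivery.git
cd media-delivery

# Raise the nginx-side wait from 12 to 30 seconds:
${EDITOR:-vi} samples/cdn/nginx.conf

# Raise the HTTP trigger script wait from 10 to 15 seconds:
${EDITOR:-vi} samples/cdn/nginx-trigger-streaming.sh

# Then rebuild the demo image per the repo's build instructions and rerun
# the demo to see whether the 502s become less frequent.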

louisper-intel commented 1 year ago

Hi,

Thanks for the help; pushing the nginx and HTTP timeouts apart to 30 and 15 seconds did the trick for me. The command I used in all runs is below; I set no explicit resource limits beyond whatever Docker applies on its own:

#!/bin/bash

DEVICE=${DEVICE:-/dev/dri/renderD128}
DEVICE_GRP=$(stat --format %g $DEVICE)
docker run --rm -it \
  -e DEVICE=$DEVICE --device $DEVICE --group-add $DEVICE_GRP \
  --cap-add SYS_ADMIN \
  -p 8080:8080 \
  intel-media-delivery demo -4 \
    http://localhost:8080/vod/avc/WAR_TRAILER_HiQ_10_withAudio-1/index.m3u8 \
    http://localhost:8080/vod/avc/WAR_TRAILER_HiQ_10_withAudio-2/index.m3u8 \
    http://localhost:8080/vod/avc/WAR_TRAILER_HiQ_10_withAudio-3/index.m3u8 \
    http://localhost:8080/vod/avc/WAR_TRAILER_HiQ_10_withAudio-4/index.m3u8