ossrs / oryx

Oryx(SRS Stack) is an all-in-one, out-of-the-box, and open-source video solution for creating online video services, including live streaming and WebRTC, on the cloud or through self-hosting.
https://ossrs.io/oryx
MIT License

Transcript: Queue is stuck when FFmpeg overlay failed #200

Open limjoe opened 2 months ago

limjoe commented 2 months ago

Describe the bug: When Camera is used to forward the AI HLS overlay stream and an error occurs, AI transcription appears to be blocked: no more transcription results are produced, and the Live Segments Queue keeps growing.

It seems that the original live stream was interrupted for a moment and then pushed again.

I then suspected an interruption in the original video stream, but I watched the video through WHEP during this period, and it played normally with no apparent interruption.

Version v5.14.22, with this PR #193

To Reproduce Steps to reproduce the behavior:

  1. Go to transcript
  2. Click on start Transcription
  3. Go to Camera
  4. Use the URL of the latest AI Transcription live stream with subtitles as the source, and forward the AI stream to a third-party SRT server (Restreamer can be used as the SRT server).
  5. Click Start Live, run for a while.
  6. See error

Expected behavior: AI transcription keeps running normally.

Screenshots

image image

Additional context: The Oryx logs are attached below. It seems that the source stream was unpublished, but the origin stream recovered after a while. node1.xxxx.com is a placeholder that replaces the real hostname.

Oryx.txt

Oryx_2.txt is from another test and contains more logs. Oryx_2.txt

limjoe commented 2 months ago

I rolled back my Oryx container to v5.14.19, without the changes from my PR, and the same issue occurs.

limjoe commented 2 months ago

The order of the TS files in the queue may look like this.

image

winlinvip commented 1 month ago

This is caused by the overlay command failing while driving the fix queue:

[warn] 2024/07/08 12:00:45.492859 [51][1006] transcript: task uuid=6e61fd0f-194a-49f7-803e-f09d9662028c, live=segments=0, asr=segments=1, fix=segments=3, pat=So, on that note, that minor glitch has been resolved. We have fantastic pitches so far today and a really excellent job from all of the pitchers., overlay=segments=9, config is all=true, key=56B, organization=, base=https://api.openai.com/v1, lang=en, overlay=true, forceStyle=Alignment=2,MarginV=20, videoCodecParams=-c:v h264_nvenc -profile:v main -preset:v medium -bf 0, webvtt=true, webvttCueStyle=STYLE
::cue { background-color: blue; color: FFFFFF; font-family: Helvetica; font-size: 15px; }, webvttCueSetting=line:90%  align:center drive fix queue err exit status 1
transcode [-i transcript/355-org-aebe00dc-897f-46d6-9e31-b43c86c9ac3a.ts -vf subtitles=transcript/355-audio-f939a467-5e69-4d1e-b8e2-9a573c85ceed.srt:force_style='Alignment=2,MarginV=20' -c:v h264_nvenc -profile:v main -preset:v medium -bf 0 -c:a aac -copyts -y transcript/355-overlay-47c37aac-5533-44fb-8e8f-43a5c4bace8e.ts]
main.(*TranscriptTask).DriveFixQueue
 /g/platform/transcript.go:1998
main.(*TranscriptWorker).Start.func7
 /g/platform/transcript.go:1174
runtime.goexit
 /usr/local/go/src/runtime/asm_amd64.s:1571

If the command fails, it is retried forever, which blocks the whole queue.

limjoe commented 1 month ago

The log of the failed ffmpeg command is:

root@20e03e30a247:/usr/local/oryx/platform/containers/data# ffmpeg -i transcript/1194-org-ecea8993-3b9c-4ec4-a82d-6d199857f28c.ts -vf subtitles=transcript/1194-audio-1b79ae23-c68e-4096-a5a0-62d0ea3eb403.srt:force_style='Alignment=2,MarginV=20' -c:v libx264 -profile:v main -preset:v medium -tune zerolatency -bf 0 -c:a aac -copyts -y transcript/1194-overlay-8e3364c7-c463-450f-b7a9-be5e8df22dfd.ts
ffmpeg version 5.0.2 Copyright (c) 2000-2022 the FFmpeg developers
  built with gcc 9 (Ubuntu 9.4.0-1ubuntu1~20.04.2)
  configuration: --enable-pthreads --extra-libs=-lpthread --pkg-config-flags=--static --disable-shared --enable-static --enable-gpl --enable-nonfree --enable-postproc --enable-bzlib --enable-zlib --enable-libx264 --enable-libmp3lame --enable-libfdk-aac --enable-libxml2 --enable-demuxer=dash --enable-openssl --enable-protocol=tls --enable-protocol=rtmps --enable-filter=drawtext --enable-libfreetype --enable-libfontconfig --enable-libass --enable-libsrt --enable-nvdec --enable-nvenc --enable-cuda --enable-cuvid --enable-cuda-nvcc --extra-cflags=-I/usr/local/cuda/include --extra-ldflags=-L/usr/local/cuda/lib64
  libavutil      57. 17.100 / 57. 17.100
  libavcodec     59. 18.100 / 59. 18.100
  libavformat    59. 16.100 / 59. 16.100
  libavdevice    59.  4.100 / 59.  4.100
  libavfilter     8. 24.100 /  8. 24.100
  libswscale      6.  4.100 /  6.  4.100
  libswresample   4.  3.100 /  4.  3.100
  libpostproc    56.  3.100 / 56.  3.100
[h264 @ 0x55d3c525d380] illegal POC type 14
[h264 @ 0x55d3c525d380] non-existing PPS 2 referenced
[h264 @ 0x55d3c525d380] missing picture in access unit with size 21319
[extract_extradata @ 0x55d3c52b4440] Invalid NAL unit 0, skipping.
    Last message repeated 11 times
[h264 @ 0x55d3c525d380] Invalid NAL unit 0, skipping.
    Last message repeated 11 times
[h264 @ 0x55d3c525d380] non-existing PPS 2 referenced
[h264 @ 0x55d3c525d380] decode_slice_header error
[h264 @ 0x55d3c525d380] no frame!
[extract_extradata @ 0x55d3c52b4440] Invalid NAL unit 0, skipping.
    Last message repeated 6 times
[h264 @ 0x55d3c525d380] Invalid NAL unit 0, skipping.
    Last message repeated 6 times
[h264 @ 0x55d3c525d380] data partitioning is not implemented. Update your FFmpeg version to the newest one from Git. If the problem still occurs, it means that your file has a feature which has not been implemented.
[h264 @ 0x55d3c525d380] If you want to help, upload a sample of this file to https://streams.videolan.org/upload/ and contact the ffmpeg-devel mailing list. (ffmpeg-devel@ffmpeg.org)
[h264 @ 0x55d3c525d380] no frame!
[h264 @ 0x55d3c525d380] non-existing PPS 3 referenced
[extract_extradata @ 0x55d3c52b4440] Invalid NAL unit 0, skipping.
    Last message repeated 2 times
[h264 @ 0x55d3c525d380] Invalid NAL unit 0, skipping.
    Last message repeated 2 times
[h264 @ 0x55d3c525d380] no frame!
[h264 @ 0x55d3c525d380] non-existing PPS 1 referenced
[h264 @ 0x55d3c525d380] Invalid NAL unit 0, skipping.
    Last message repeated 3 times
[h264 @ 0x55d3c525d380] data partitioning is not implemented. Update your FFmpeg version to the newest one from Git. If the problem still occurs, it means that your file has a feature which has not been implemented.
[h264 @ 0x55d3c525d380] If you want to help, upload a sample of this file to https://streams.videolan.org/upload/ and contact the ffmpeg-devel mailing list. (ffmpeg-devel@ffmpeg.org)
[h264 @ 0x55d3c525d380] no frame!
[h264 @ 0x55d3c525d380] time_scale/num_units_in_tick invalid or unsupported (0/3745513472)
[h264 @ 0x55d3c525d380] Overread VUI by 8 bits
[h264 @ 0x55d3c525d380] FMO is not implemented. Update your FFmpeg version to the newest one from Git. If the problem still occurs, it means that your file has a feature which has not been implemented.
[h264 @ 0x55d3c525d380] missing picture in access unit with size 92262
[h264 @ 0x55d3c525d380] Invalid NAL unit 0, skipping.
    Last message repeated 18 times
[h264 @ 0x55d3c525d380] sps_id 2 out of range
[h264 @ 0x55d3c525d380] data partitioning is not implemented. Update your FFmpeg version to the newest one from Git. If the problem still occurs, it means that your file has a feature which has not been implemented.
[h264 @ 0x55d3c525d380] If you want to help, upload a sample of this file to https://streams.videolan.org/upload/ and contact the ffmpeg-devel mailing list. (ffmpeg-devel@ffmpeg.org)
[h264 @ 0x55d3c525d380] no frame!
[h264 @ 0x55d3c525d380] SEI type 48 size 1712 truncated at 176
[h264 @ 0x55d3c525d380] missing picture in access unit with size 4945
[h264 @ 0x55d3c525d380] Invalid NAL unit 0, skipping.
    Last message repeated 1 times
[h264 @ 0x55d3c525d380] SEI type 48 size 1712 truncated at 147
[h264 @ 0x55d3c525d380] data partitioning is not implemented. Update your FFmpeg version to the newest one from Git. If the problem still occurs, it means that your file has a feature which has not been implemented.
[h264 @ 0x55d3c525d380] If you want to help, upload a sample of this file to https://streams.videolan.org/upload/ and contact the ffmpeg-devel mailing list. (ffmpeg-devel@ffmpeg.org)
[h264 @ 0x55d3c525d380] no frame!
[h264 @ 0x55d3c525d380] missing picture in access unit with size 24043
[h264 @ 0x55d3c525d380] Invalid NAL unit 0, skipping.
    Last message repeated 7 times
[h264 @ 0x55d3c525d380] sps_id 2 out of range
[h264 @ 0x55d3c525d380] no frame!
[h264 @ 0x55d3c525d380] missing picture in access unit with size 1441
[h264 @ 0x55d3c525d380] Invalid NAL unit 0, skipping.
    Last message repeated 2 times
[h264 @ 0x55d3c525d380] no frame!
[mpegts @ 0x55d3c5256740] Could not find codec parameters for stream 1 (Video: h264 ([27][0][0][0] / 0x001B), none): unspecified size
Consider increasing the value for the 'analyzeduration' (0) and 'probesize' (5000000) options
Input #0, mpegts, from 'transcript/1194-org-ecea8993-3b9c-4ec4-a82d-6d199857f28c.ts':
  Duration: 00:00:11.35, start: 0.683000, bitrate: 184 kb/s
  Program 1 
  Stream #0:0[0x101]: Audio: aac (LC) ([15][0][0][0] / 0x000F), 48000 Hz, stereo, fltp, 130 kb/s
  Stream #0:1[0x100]: Video: h264 ([27][0][0][0] / 0x001B), none, 23.50 tbr, 90k tbn
Stream mapping:
  Stream #0:1 -> #0:0 (h264 (native) -> h264 (libx264))
  Stream #0:0 -> #0:1 (aac (native) -> aac (native))
Press [q] to stop, [?] for help
[NULL @ 0x55d3c525d380] illegal POC type 14
[NULL @ 0x55d3c525d380] non-existing PPS 2 referenced
[NULL @ 0x55d3c525d380] missing picture in access unit with size 21319
[h264 @ 0x55d3c5302ac0] Invalid NAL unit 0, skipping.
    Last message repeated 11 times
[h264 @ 0x55d3c5302ac0] non-existing PPS 2 referenced
[h264 @ 0x55d3c5302ac0] decode_slice_header error
[h264 @ 0x55d3c5302ac0] no frame!
[NULL @ 0x55d3c525d380] non-existing PPS 3 referenced
[h264 @ 0x55d3c529c180] Invalid NAL unit 0, skipping.
    Last message repeated 6 times
[h264 @ 0x55d3c529c180] data partitioning is not implemented. Update your FFmpeg version to the newest one from Git. If the problem still occurs, it means that your file has a feature which has not been implemented.
[h264 @ 0x55d3c529c180] If you want to help, upload a sample of this file to https://streams.videolan.org/upload/ and contact the ffmpeg-devel mailing list. (ffmpeg-devel@ffmpeg.org)
[h264 @ 0x55d3c529c180] no frame!
[h264 @ 0x55d3c52ae940] Invalid NAL unit 0, skipping.
    Last message repeated 2 times
[h264 @ 0x55d3c52ae940] no frame!
[NULL @ 0x55d3c525d380] non-existing PPS 1 referenced
[NULL @ 0x55d3c525d380] time_scale/num_units_in_tick invalid or unsupported (0/3745513472)
[NULL @ 0x55d3c525d380] Overread VUI by 8 bits
[NULL @ 0x55d3c525d380] FMO is not implemented. Update your FFmpeg version to the newest one from Git. If the problem still occurs, it means that your file has a feature which has not been implemented.
[NULL @ 0x55d3c525d380] missing picture in access unit with size 92262
[h264 @ 0x55d3c529f980] Invalid NAL unit 0, skipping.
    Last message repeated 3 times
[h264 @ 0x55d3c529f980] data partitioning is not implemented. Update your FFmpeg version to the newest one from Git. If the problem still occurs, it means that your file has a feature which has not been implemented.
[h264 @ 0x55d3c529f980] If you want to help, upload a sample of this file to https://streams.videolan.org/upload/ and contact the ffmpeg-devel mailing list. (ffmpeg-devel@ffmpeg.org)
[h264 @ 0x55d3c529f980] no frame!
[h264 @ 0x55d3c542a680] Invalid NAL unit 0, skipping.
    Last message repeated 18 times
[h264 @ 0x55d3c542a680] sps_id 2 out of range
[h264 @ 0x55d3c542a680] data partitioning is not implemented. Update your FFmpeg version to the newest one from Git. If the problem still occurs, it means that your file has a feature which has not been implemented.
[h264 @ 0x55d3c542a680] If you want to help, upload a sample of this file to https://streams.videolan.org/upload/ and contact the ffmpeg-devel mailing list. (ffmpeg-devel@ffmpeg.org)
[h264 @ 0x55d3c542a680] no frame!
[NULL @ 0x55d3c525d380] SEI type 48 size 1712 truncated at 176
[NULL @ 0x55d3c525d380] missing picture in access unit with size 4945
[h264 @ 0x55d3c5361800] Invalid NAL unit 0, skipping.
    Last message repeated 1 times
[h264 @ 0x55d3c5361800] SEI type 48 size 1712 truncated at 147
[h264 @ 0x55d3c5361800] data partitioning is not implemented. Update your FFmpeg version to the newest one from Git. If the problem still occurs, it means that your file has a feature which has not been implemented.
[h264 @ 0x55d3c5361800] If you want to help, upload a sample of this file to https://streams.videolan.org/upload/ and contact the ffmpeg-devel mailing list. (ffmpeg-devel@ffmpeg.org)
[h264 @ 0x55d3c5361800] no frame!
[NULL @ 0x55d3c525d380] missing picture in access unit with size 24043
[NULL @ 0x55d3c525d380] missing picture in access unit with size 1441
[h264 @ 0x55d3c5377000] Invalid NAL unit 0, skipping.
    Last message repeated 7 times
[h264 @ 0x55d3c5377000] sps_id 2 out of range
[h264 @ 0x55d3c5377000] no frame!
[h264 @ 0x55d3c538b400] Invalid NAL unit 0, skipping.
    Last message repeated 2 times
[h264 @ 0x55d3c538b400] no frame!
Error while decoding stream #0:1: Invalid data found when processing input
    Last message repeated 7 times
Cannot determine format of input stream 0:1 after EOF
Error marking filters as finished
[aac @ 0x55d3c52fbf40] Qavg: 2861.542
[aac @ 0x55d3c52fbf40] 2 frames left in the queue on closing
Conversion failed!

limjoe commented 1 month ago

It seems that when one ffmpeg command fails, it is retried again and again, so the entire queue appears to stop making progress.
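
The log above reports "unspecified size" for the video stream, so a segment like this might be detected before the overlay step by probing it first. The Go sketch below assumes ffprobe is installed next to ffmpeg; the helper names parseVideoSize and probeVideoSize are hypothetical, and nothing here is taken from the Oryx source.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strconv"
	"strings"
)

// parseVideoSize parses ffprobe CSV output such as "1280,720" and reports
// whether the first line contains a positive width and height.
func parseVideoSize(out string) (width, height int, ok bool) {
	line := strings.TrimSpace(out)
	if i := strings.IndexByte(line, '\n'); i >= 0 {
		line = strings.TrimSpace(line[:i])
	}
	parts := strings.Split(line, ",")
	if len(parts) < 2 {
		return 0, 0, false
	}
	w, errW := strconv.Atoi(parts[0])
	h, errH := strconv.Atoi(parts[1])
	if errW != nil || errH != nil || w <= 0 || h <= 0 {
		return 0, 0, false
	}
	return w, h, true
}

// probeVideoSize asks ffprobe for the width and height of the first video
// stream in path. It returns ok=false if ffprobe fails or reports no size.
func probeVideoSize(path string) (width, height int, ok bool) {
	out, err := exec.Command("ffprobe", "-v", "error",
		"-select_streams", "v:0",
		"-show_entries", "stream=width,height",
		"-of", "csv=p=0", path).Output()
	if err != nil {
		return 0, 0, false
	}
	return parseVideoSize(string(out))
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: probe <segment.ts>")
		return
	}
	if w, h, ok := probeVideoSize(os.Args[1]); ok {
		fmt.Printf("video size %dx%d, segment looks decodable\n", w, h)
	} else {
		fmt.Println("no usable video size, overlay would likely fail")
	}
}
```

A check like this only catches segments whose video parameters cannot be determined; a segment could still fail later during decoding, so it complements a retry limit rather than replacing it.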

limjoe commented 1 month ago

There is no problem when streaming with vMix or OBS.

I found that when the AI subtitle stream of a source such as live/livestream is pushed back to the same Oryx instance as the ai/livestream channel, AI transcription uses the segments of the ai/livestream channel for transcription. While it is running, some of the ai/livestream video segments cause ffmpeg to fail, which leads to this problem.