MediaArea / MediaInfo

Convenient unified display of the most relevant technical and tag data for video and audio files.
https://MediaArea.net/MediaInfo
BSD 2-Clause "Simplified" License

EIA-608/DTVCC captions embedded with libcaption are not detected #664

Open bbgdzxng1 opened 1 year ago

bbgdzxng1 commented 1 year ago

libcaption is one of the few open-source tools that can embed 608 captions into SEI side data. The libcaption library is used by OBS, gstreamer and several open-source streaming systems to embed EIA-608 into DTVCC H.264.

tl;dr Would someone from the mediainfo team be able to take a look at the attached and confirm whether they would expect mediainfo to be able to detect the existence of ATSC A/53 | SCTE-128 | DTVCC captions in the stream? I have prepared a test case below.

I use mediainfo to detect the existence of DTVCC/SCTE-128 captions within a stream. Unfortunately, if EIA-608 captions have been embedded into the file with libcaption, they are not being detected in the probe.

Attached is a simple sample captions file, generated from FFmpeg and with the captions embedded via libcaption's flv+srt utility.

Attached is the mediainfo output. The version of mediainfo is:

$ mediainfo --version
MediaInfo Command line, 
MediaInfoLib - v22.09

I would expect this file to contain the following, but unfortunately, it does not.

Text
ID : xxx (0xHH)-CC1
Format : EIA-608
Muxing mode : A/53 / DTVCC Transport
Muxing mode, more info : Muxed in Video #1
Bit rate mode : Constant
Stream size : 0.00 Byte (0%)

I have confirmed that this file contains EIA-608 DTVCC captions with ccextractor:

$ ccextractor -in='mp4' -out='ttxt' -1 -debug -608 -cbraw -stdout testsrc.with608captions.mp4          # Full debug mode

$ ccextractor -in='mp4' -out='ttxt' -1 -quiet -stdout testsrc.with608captions.mp4                                  # quiet mode
00:00:03,871|00:00:10,010|POP|This is to test whether
00:00:03,871|00:00:10,010|POP|mediainfo can detect EIA-608
00:00:03,871|00:00:10,010|POP|in DTVCC which have been
00:00:03,871|00:00:10,010|POP|embedded by libcaption.
00:00:20,521|00:00:28,995|POP|This is a second caption.

$ ccextractor -in='mp4' -out='report' -1 -quiet -stdout testsrc.with608captions.mp4
File: testsrc.with608captions.mp4
Stream Mode: MP4
EIA-608: Yes
XDS: No
CC1: Yes
CC2: No
CC3: No
CC4: No
CEA-708: No
MPEG-4 Timed Text: No

I have confirmed that this file contains EIA-608 DTVCC captions with FFmpeg:

$ ffmpeg -loglevel 'error' -hide_banner -f 'lavfi' -i "movie=filename=testsrc.with608captions.mp4[out+subcc]" -codec:s 'srt' -f 'srt' -
1
00:00:01,969 --> 00:00:09,977
<font face="Monospace">{\an7}This is to test whether
mediainfo can detect EIA-608
in DTVCC which have been
embedded by libcaption.</font>

2
00:00:19,987 --> 00:00:28,962
<font face="Monospace">{\an7}This is a second caption.</font>

$ ffprobe -hide_banner -show_entries side_data=side_data_type -print_format json -i testsrc.with608captions.mp4

        {
            "type": "frame",
            "side_data_list": [
                {
                    "side_data_type": "ATSC A53 Part 4 Closed Captions"
                }
            ]
        },
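For anyone scripting this check, here is a minimal Python sketch that counts caption side-data entries in ffprobe's JSON output. The helper name `count_side_data` and the sample document are hypothetical, and the top-level key that ffprobe wraps frames in may differ from the sample, which is why the function walks the whole structure instead of assuming a layout.

```python
import json

# Side-data label as it appears in the ffprobe excerpt above.
A53_CC = "ATSC A53 Part 4 Closed Captions"


def count_side_data(node, wanted=A53_CC):
    """Recursively count entries whose side_data_type equals `wanted`."""
    if isinstance(node, dict):
        hits = 1 if node.get("side_data_type") == wanted else 0
        return hits + sum(count_side_data(v, wanted) for v in node.values())
    if isinstance(node, list):
        return sum(count_side_data(v, wanted) for v in node)
    return 0


# Hypothetical sample shaped like the excerpt above; real ffprobe output
# may use a differently named top-level key.
sample = json.loads("""
{"frames": [
  {"type": "frame",
   "side_data_list": [{"side_data_type": "ATSC A53 Part 4 Closed Captions"}]},
  {"type": "frame"}
]}
""")

print(count_side_data(sample))  # 1
```

A non-zero count only says that FFmpeg attached caption side data to at least one frame; it says nothing about how the bytes are spread across frames.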

I have verified that the EIA-608 captions are successfully played by mpv.

I have used the MP4 container format because GitHub does not allow the upload of TS files. However, the issue is common to both MP4 and TS files.

As ever, thanks for such a great utility.

bbgdzxng1 commented 1 year ago

For reference, and in case it proves useful, here is how the test case was created...

# create a video with no captions
$ ffmpeg -hide_banner -f lavfi -i testsrc=size=640x480:rate=ntsc -codec libx264 -pix_fmt:v yuv420p -x264opts:v force-cfr=1 -force_key_frames:v 'expr:gte(t,n_forced*2.000)' -keyint_min:v 1 -b-pyramid:v strict -t 00:00:30.000 -f flv testsrc.flv -y

# embed [some captions](https://github.com/MediaArea/MediaInfo/files/10262707/subtitles.srt.zip) with libcaption's flv+srt utility
$ flv+srt testsrc.flv subtitles.srt testsrc.with608captions.flv

# remux into MP4 (or TS)
$ ffmpeg -hide_banner -i testsrc.with608captions.flv -codec:v copy -f mp4 -movflags +faststart testsrc.with608captions.mp4 -y

If any interested party would like to install libcaption (which is not necessary since a sample file is attached), here are the steps. The flv+srt and flv+scc utilities will be installed.

$ git clone https://github.com/szatmary/libcaption.git
$ cd libcaption
$ cmake .
$ make
$ make install

JeromeMartinez commented 1 year ago

608-in-708 in AVC is usually supported, but only when the constant bitrate (4 bytes per frame) is respected. Here, some frames carry a burst of caption bytes while other frames carry no caption data at all. That seems unusual, and VLC does not play the captions either. Anyway, this is something we could support with some adaptation.
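To illustrate the distinction between a constant 4 bytes per frame and a burst, here is a minimal Python sketch. It is not MediaInfo's actual detection logic; the function name `classify_cc_payload` and the per-frame byte counts are made up for illustration.

```python
def classify_cc_payload(bytes_per_frame, expected=4):
    """Label a stream by how its caption bytes are spread over frames.

    `bytes_per_frame` is a list of caption payload sizes, one per video
    frame. `expected` is the constant per-frame size (assumed to be 4).
    """
    if not bytes_per_frame or sum(bytes_per_frame) == 0:
        return "none"
    if all(n == expected for n in bytes_per_frame):
        return "constant"
    return "bursty"


# Hypothetical byte counts: the same 24 bytes delivered two different ways.
steady = [4, 4, 4, 4, 4, 4]
burst = [0, 0, 24, 0, 0, 0]

print(classify_cc_payload(steady))  # constant
print(classify_cc_payload(burst))   # bursty
```

Both lists carry the same total payload, so a check that only looks at the average would not tell them apart.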

Hacked MediaInfo (not usable in production) :

Text
ID                                       : 1-CC1
Format                                   : EIA-608
Muxing mode                              : SCTE 128 / DTVCC Transport
Muxing mode, more info                   : Muxed in Video #1
Duration                                 : 29 s 996 ms
Start time (commands)                    : 830 ms
Start time                               : 830 ms
Bit rate mode                            : Constant
Stream size                              : 0.00 Byte (0%)
Count of frames before first event       : 57
Type of the first event                  : PopOn

(note: A/53 is for MPEG-2 Video, AVC uses SCTE 128)

Please share a zipped TS file (and the FLV?), so we also have it in our automated tests.

bbgdzxng1 commented 1 year ago

@JeromeMartinez thank you for replying, sir.

Here's where the testcase created by libcaption works and does not. With your proposed workaround, mediainfo is in with the majority.

You are correct in highlighting that VLC does not play the subs created by libcaption, and that is exactly why I turned to my trusty old friend mediainfo to try to work out why VLC was not behaving.

"but only when the constant bitrate (4 bytes per frame) is respected"

That is great intel, and explains why mediainfo was not catching it. Unfortunately, libcaption is unmaintained these days (Matt moved on from Twitch to mux.com), and it is now a project that needs a fork/maintainer. But, since it is the main open-source 608 encoder out there, it is the one that OBS and gstreamer use. Even if libcaption is not to spec, your solution will help OBS & gstreamer users.

I'll take a look at the DTVCC specs again. I thought that the 4-byte limit per frame was for analog line 21, and I understood that 608-in-708 (defined in the 708 spec) allowed more payload so long as the 9600 bps data rate was maintained. I'm no developer, just an end user, but your insight is truly helpful. I shall investigate further.

(note: A/53 is for MPEG-2 Video, AVC uses SCTE 128)

Noted regarding A/53 and SCTE-128. I stole the example snippet output from another file that I had to hand; it must have been an H.262/mpeg2video rather than H.264/avc. Yes, libcaption produces SCTE-128, although our friends over at FFmpeg use -a53cc true for H.264, so I am blind to the distinction these days. Apologies for my ambiguity.

Here's a zipfile with the testcases in. I'll be happy to craft a testcase if your test rig has a particular preference for size, framerate or caption text, although I doubt as much. But the offer is there.

testsrc.nocaptions.flv                          # raw file generated by FFmpeg, input to libcaption's flv+srt
subtitles.srt.txt                               # input to libcaption's flv+srt.  Renamed so that VLC & mpv do not autoplay
testsrc.withlibcaption608captions.flv           # output of libcaption's flv+srt
testsrc.withlibcaption608captions.mp4           # remux of testsrc.withlibcaption608captions.flv
testsrc.withlibcaption608captions.ts            # remux of testsrc.withlibcaption608captions.flv
master.m3u8                                     # hls wrapper for testsrc.withlibcaption608captions.ts (for testing in Quicktime)
playlist.m3u8                                   # hls wrapper for testsrc.withlibcaption608captions.ts (for testing in Quicktime)

Totally appreciate your response. Since you so quickly identified a workaround, I assume that my description and write up was clear, polite and constructive. Thanks, again.

Let me know if you need anything from me when you have the time to take a look at a production candidate - and I am in no rush, nor hold any expectations. mediainfo is a daily tool for me, both the command line and Mac App Store version and I'm very grateful for all the work that you have done over the many years.

bbgdzxng1 commented 1 year ago

Interesting snippet from 708...

4.1 DTVCC Transport Channel Data Rate

The DTVCC design allows for caption data to be transmitted at various data rates within a DTV video signal. However, it is often a continuous 9600 bits per second (bps) stream allocated from the DTV video signal capacity. NOTE—This value of 9600 bps is divided by 1.001 when the frame rate is 23.98, 29.97, or 59.94 Hz.

In order to provide such a continuous stream and to ensure that specific caption data will reach receivers as intended (in relationship to the audio and video), this data channel is allocated on a frame-by-frame basis such that 1200 bytes of data (see Note above) are transported per second. For example, for a DTV video signal with fixed-allocated 9600 bps (1200 bytes per second) captioning, a 60 Hz frame rate and no CEA-608 captions, 20 bytes of DTVCC caption data are allocated in each frame.

The allocation for CEA-608 closed-caption encoding within the DTVCC Transport Channel shall be included in the allocated DTVCC Caption Channel bandwidth. That is, when the total DTVCC Transport Channel (consisting of the DTVCC Caption Channel and the CEA-608 Caption Channel) is 9600 bps, on average, a CEA-608 datastream is allocated 960/1.001 bps, and DTVCC captions are allocated at least 8640/1.001 bps.

So within the 20 bytes, the available bandwidth needs to carry:

The 'pooled' bandwidth seems more permissive than old analog line 21 608, which could physically carry only 4 bytes per video frame.

This seems to allow for 'burstiness' in the DTVCC transmission. But I would value your opinion in this matter.
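The figures in the quoted section can be checked with a little arithmetic. The Python sketch below is only my own working from the numbers in the quote (9600 bps for the transport channel, 960 bps for CEA-608); the helper name `bytes_per_frame` is hypothetical, and the per-frame values are averages, since the quote itself says "on average".

```python
from fractions import Fraction


def bytes_per_frame(bits_per_second, frames_per_second):
    """Average caption bytes available per video frame at a given bit rate."""
    return Fraction(bits_per_second) / 8 / Fraction(frames_per_second)


TRANSPORT_BPS = 9600  # whole DTVCC Transport Channel, from the quote
CEA608_BPS = 960      # CEA-608 share of that channel, from the quote

print(bytes_per_frame(TRANSPORT_BPS, 60))  # 20
print(bytes_per_frame(CEA608_BPS, 60))     # 2
print(bytes_per_frame(CEA608_BPS, 30))     # 4

# At 29.97 Hz both the bit rate and the frame rate are divided by 1.001,
# so the per-frame average is unchanged.
ntsc_rate = Fraction(30000, 1001)
ntsc_608_bps = Fraction(CEA608_BPS * 1000, 1001)
print(bytes_per_frame(ntsc_608_bps, ntsc_rate))  # 4
```

The 30 Hz result lines up with the 4 bytes per frame mentioned earlier in the thread. Whether a decoder must accept more than that in a single frame, as long as the average holds, is exactly the open question here.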