gpac / gpac

GPAC Ultramedia OSS for Video Streaming & Next-Gen Multimedia Transcoding, Packaging & Delivery
https://gpac.io
GNU Lesser General Public License v2.1

[Q] Support for CTA-608/708 captioning in SEI messages #2693

Closed bbgdzxng1 closed 3 months ago

bbgdzxng1 commented 10 months ago

I hope to add a reference within an MP4/ISOBMFF file to a CTA-608/708 caption track contained in the video SEI side data, in accordance with ISO/IEC 14496-30:2018/Amendment 1:2022, Part 30: Timed text and other visual overlays in ISOBMFF.

CMAF (ISO/IEC 23000-19) defers to ISO/IEC 14496-30 Amendment 1 for this signaling:

Signaling the presence of CEA-608/708 in SEI messages. The presence of CEA-608/708 data in SEI messages in a video track SHOULD be signaled as specified by MPEG-4 Part 30 Amendment 1.

ISO/IEC 14496-30 Amendment 1 states that referencing 608/708 data contained within the SEI data of a video track requires:

Signalling the presence of CTA-708 in SEI messages

The presence of CTA-708 data in SEI messages in a video track SHOULD be signaled by the presence of one or more Track boxes with Media Type sbtl with a Sample Entry codingname of csei and a track reference of type csei to the video track. This track SHALL NOT reference any media samples.

Each track declares the language of one of the streams in the caption data by using an Extended Language Box (elng) in the Media Data Information Box (mdia).

The CEAServiceNumberBox in the sample entry of each track identifies which service carries that language, and shall be present if more than one csei track is present. If this box is absent, service_number takes the default value of 1. The csei sample entry is derived from the PlainTextSampleEntry, with codingname equal csei. In the CEAServiceNumberBox there are two values from section 4.5 of the CEA708 specification: is_708 reflects the TYPE OF SERVICE and is 1 for 708 and 0 for 608; is_easy_reader reflects the EASY READER value and is 1 for easy reader streams and 0 otherwise.

The track header width and height give the display area for the stream; normally these are equal to the referenced video stream's width and height.

Most of the excellent MP4Box examples at https://github.com/gpac/gpac/wiki/Subtitling-with-GPAC relate to importing a subtitle track from an external file, rather than cross-referencing existing CTA-608/708 data contained in a video track's SEI messages. I have tried MP4Box's built-in -add 'self':hdlr='sbtl':lang='eng' options, but it looks like the standard command-line tool does not support the signaling described in ISO/IEC 14496-30:2018/Amendment 1:2022.

I assume that since it is not directly supported through command-line options, I need to use a boxpatch/patch file. Assuming an MP4 containing a single video track, how can I use boxpatch/patch to add the following to the structure?

altBrand 'ccea'                     # For compatibility with CMAF, Appendix A.4/5, Table 11.  (use alternate brand -ab ?)

'sbtl' > codingName 'csei' > trackReferenceType `csei`

'tkhd' > 'width'=720 'height'=480   # tkhd width/height for 608 data in NTSC 

'mdia' > 'elng'='eng'               # Extended Language Box

'CEAServiceNumberBox'
  if 'is_708'=0                     # analogous to ATSC A/65 digital_cc=0
                                    # ATSC A/65 Line21 field number does not map to an ISOBMFF equivalent field.
  if 'is_708'=1                     # analogous to ATSC A/65 digital_cc=1
    'service_number'=1              # analogous to ATSC A/65 caption_service_number
    'is_easy_reader'=0              # analogous to ATSC A/65 easy_reader
                                    # ATSC A/65 wide_screen does not map to an ISOBMFF equivalent field

In my case, I would like to add CTA-708, service number 1, easy_reader false.
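To make the target concrete, here is a minimal Python sketch of the signaling described in the quoted text. It models only the fields named in the quotation (service_number, is_708, is_easy_reader, language, width/height, and the csei reference to the video track). The class and function names are hypothetical, and no byte layout for the CEAServiceNumberBox is implied, since the quotation does not give one.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CeaServiceNumber:
    """Fields named in the quoted CEAServiceNumberBox text (no byte layout implied)."""
    service_number: int = 1        # default value when the box is absent
    is_708: bool = True            # True for 708, False for 608
    is_easy_reader: bool = False   # True for easy reader streams


@dataclass
class CseiTrack:
    """Hypothetical model of one 'sbtl' track with a 'csei' sample entry."""
    video_track_id: int                         # target of the 'csei' track reference
    language: str                               # carried in the Extended Language Box ('elng')
    width: int                                  # tkhd width, normally the video width
    height: int                                 # tkhd height, normally the video height
    service: Optional[CeaServiceNumber] = None  # None means the box is absent

    def effective_service_number(self) -> int:
        # Per the quoted text, an absent box means service_number defaults to 1.
        return self.service.service_number if self.service else 1


def check_tracks(tracks: List[CseiTrack]) -> List[str]:
    """Return problems against the one presence rule stated in the quoted text."""
    problems = []
    if len(tracks) > 1:
        for index, track in enumerate(tracks):
            if track.service is None:
                problems.append(
                    f"track {index}: CEAServiceNumberBox must be present "
                    "when more than one csei track exists"
                )
    return problems


# The case described above: CTA-708, service number 1, easy_reader false.
wanted = CseiTrack(
    video_track_id=1,
    language="eng",
    width=720,
    height=480,
    service=CeaServiceNumber(service_number=1, is_708=True, is_easy_reader=False),
)
print(wanted.effective_service_number(), check_tracks([wanted]))
```

With a single track the check returns no problems. Adding a second track without a service box would report both the missing box and which track lacks it.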

As someone inexperienced with MP4Box patch files, I hope someone more familiar with box structures can point me in the right direction to how the above quotation from ISO/IEC 14496-30:2018/Amendment 1:2022 can be mapped within an MP4Box Patch File, for compliance with both ISO/IEC 14496-30:2018/Amendment 1:2022 and CMAF Appendix A.5 Table 11.

Unfortunately, I do not have the experience to understand which of the above boxes map to which nodes in a patch file in order to produce a compliant file. I have researched and cross-referenced the standards, but I lack experience with the MP4Box patch file structure. My attempt so far:

$ cat "./patch.xml"

<GPACBOXES>
  <box type="hdlr" action="add">
    <!-- <box type="stbl" action="add"> -->
  </box>
  <box type="trak">
    <box type="tkhd" action="add">
      <!-- Track header fields go here -->
    </box>
    <box type="mdia">
      <box type="mdhd" action="add">
        <!-- Media header fields go here -->
      </box>
      <box type="minf">
        <box type="stbl">
          <box type="stsd" action="add">
            <!-- Sample description fields go here -->
          </box>
        </box>
      </box>
    </box>
  </box>
</GPACBOXES>

$ MP4Box -add "./infile.mp4" -patch "./patch.xml" -new "./outfile.mp4"

It seems that only a few ISOBMFF implementations utilize the standards-compliant boxes defined by ISOBMFF ISO/IEC 14496-30 Amendment 1 and CMAF ISO/IEC 23000-19:2020, Annex A.4/5, Table 11. Of course, in DASH/WAVE, it would be expected that the manifest would declare this same data to the player through the <Accessibility schemeIdUri="urn:scte:dash:cc:cea-608:2015"> tag, but the Box data should still be accurate for unsegmented ISOBMFFs and standalone media segments where no manifest is guaranteed to be present.
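For comparison, the manifest-side declaration can be sketched in Python with the standard library. The schemeIdUri is the one quoted above. The value string "CC1=eng" and the enclosing AdaptationSet attributes are assumptions for illustration only and should be checked against the SCTE/DASH-IF documents before use.

```python
import xml.etree.ElementTree as ET


def accessibility_descriptor(scheme_id_uri: str, value: str) -> ET.Element:
    """Build a DASH <Accessibility> descriptor element."""
    element = ET.Element("Accessibility")
    element.set("schemeIdUri", scheme_id_uri)
    element.set("value", value)
    return element


# Hypothetical video AdaptationSet that carries the embedded captions.
adaptation_set = ET.Element("AdaptationSet", {"contentType": "video"})
adaptation_set.append(
    accessibility_descriptor(
        "urn:scte:dash:cc:cea-608:2015",  # scheme URI taken from the text above
        "CC1=eng",                        # assumed value: channel CC1, English
    )
)
print(ET.tostring(adaptation_set, encoding="unicode"))
```

This only shows where the caption signaling lives when a manifest exists. It does nothing for a standalone file, which is the gap the box-level signaling was meant to close.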

It seems that an MP4Box Patch File would be the appropriate tool to set these boxes in the ISOBMFF.

Can the above be achieved with an MP4Box patch file? If so, could someone point me in the right direction to at least get some of the boxes right? My experience is more in the MPEG-2 TS / HLS domain.

Many thanks in advance.



System Information

# MP4Box installed with brew install mp4box
# Operating System macOS 12.7.1

$ MP4Box -version

MP4Box - GPAC version 2.2.1-revrelease
(c) 2000-2022 Telecom Paris distributed under LGPL v2.1+ - http://gpac.io

Please cite our work in your research:
    GPAC Filters: https://doi.org/10.1145/3339825.3394929
    GPAC: https://doi.org/10.1145/1291233.1291452

GPAC Configuration: --disable-wx --disable-pulseaudio --prefix=/usr/local/Cellar/gpac/2.2.1_1 --mandir=/usr/local/Cellar/gpac/2.2.1_1/share/man --disable-x11
Features: GPAC_CONFIG_DARWIN GPAC_64_BITS GPAC_HAS_IPV6 GPAC_HAS_SSL GPAC_HAS_SOCK_UN GPAC_MINIMAL_ODF GPAC_HAS_QJS  

jeanlf commented 3 months ago

The text you refer to never passed ballot, and the 2nd edition of 14496-30 only indicates that carriage of embedded CC is possible but does not mandate any signaling; instead, it relies on the manifest (for DASH/HLS) or higher-level info.

So no implementation on our side for this.

Regarding your question about box patching: as you noticed, writing the patch for a complete track is a bit tricky. My approach would be to add an empty track (e.g. from an existing DASH init segment) and write the box patch against this new track.
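Whichever way the extra track is added, it helps to check the resulting box tree independently of the tool that wrote it. The following is a minimal Python sketch of an ISOBMFF box walker, not part of GPAC. It relies only on the generic box header (32-bit size, four-character type, optional 64-bit largesize). The set of container boxes it descends into is a deliberately small assumption, and the sample bytes are synthetic with dummy payloads.

```python
import struct
from typing import Iterator, Optional, Tuple

# Boxes whose payload is a plain sequence of child boxes (small, assumed subset).
CONTAINERS = {b"moov", b"trak", b"mdia", b"minf", b"stbl", b"tref"}


def walk_boxes(data: bytes, start: int = 0, end: Optional[int] = None,
               depth: int = 0) -> Iterator[Tuple[int, str, int]]:
    """Yield (depth, fourcc, size) for each box found in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, fourcc = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:                       # 64-bit largesize follows the type
            if pos + 16 > end:
                break
            size = struct.unpack_from(">Q", data, pos + 8)[0]
            header = 16
        elif size == 0:                     # box runs to the end of the enclosing data
            size = end - pos
        if size < header or pos + size > end:
            break                           # malformed or truncated: stop, do not guess
        yield depth, fourcc.decode("latin-1"), size
        if fourcc in CONTAINERS:
            yield from walk_boxes(data, pos + header, pos + size, depth + 1)
        pos += size


def box(fourcc: bytes, payload: bytes = b"") -> bytes:
    """Serialize a box with a 32-bit size header (helper for the synthetic sample)."""
    return struct.pack(">I4s", 8 + len(payload), fourcc) + payload


# Synthetic tree with dummy payloads: a track holding a 'csei' track reference.
sample = box(b"ftyp", b"isom" + b"\x00" * 4) + box(
    b"moov",
    box(b"trak", box(b"tkhd", b"\x00" * 4)
        + box(b"tref", box(b"csei", struct.pack(">I", 1)))),
)
for depth, fourcc, size in walk_boxes(sample):
    print("  " * depth + fourcc, size)
```

To inspect a real file, read it with `open(path, "rb").read()` and pass the bytes to `walk_boxes`. Full boxes such as `stsd` carry version/flags and an entry count before their children, so the sketch does not descend into them.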