gpac / gpac

GPAC Ultramedia OSS for Video Streaming & Next-Gen Multimedia Transcoding, Packaging & Delivery
https://gpac.io
GNU Lesser General Public License v2.1

dash segment fragment(SIDX , mdat) duration does not match, drm changes duration even more #2416

Closed: Murmur closed this issue 1 year ago

Murmur commented 1 year ago

Using the latest MP4Box from git, the fragments in DASH segments do not have equal durations in the SIDX.duration and mdat.duration (samples*dur) fields. Should the SIDX and mdat fragment durations match?

Things go even more wrong after encryption: in the first file, cenc/v1/1.m4s, the totals are SIDX.dur=3.76s and mdat=3.84s.

The encoding target is 3.84s segments (input: AAC at 48 kHz, H.264 at 25 fps, GOP of 96 frames), which should give exactly aligned video and audio segment durations.
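As a quick check of the alignment claim, here is a minimal sketch of the arithmetic. The AAC frame size of 1024 samples is an assumption (the usual AAC-LC value); it is not stated in the post.

```python
from fractions import Fraction

VIDEO_FPS = 25            # from the post
GOP_FRAMES = 96           # from the post
AUDIO_RATE = 48000        # Hz, from the post
AAC_FRAME_SAMPLES = 1024  # assumption: AAC-LC frame size, not stated in the post

# One GOP of video, expressed as an exact fraction of a second.
segment = Fraction(GOP_FRAMES, VIDEO_FPS)

# Number of audio frames that fit in the same time span.
audio_frames = segment * AUDIO_RATE / AAC_FRAME_SAMPLES

print(float(segment))  # segment duration in seconds
print(audio_frames)    # a whole number means audio and video can align exactly
```

Under that assumption the audio frame count comes out as a whole number, which is why the two tracks can end on the same segment boundary.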

4x fragments, without DRM and with DRM

https://refapp.hbbtv.org/videos/dashtest/test1/manifest.mpd
v1/1.m4s, SIDX.earliestPresTime=0, timescale=12800
SIDX.dur=       12800,        12288,        12288,        11776, total=49152 (3,84s)
mdat.dur=24*512=12288, 24*512=12288, 24*512=12288, 24*512=12288, total=49152 (3,84s)

v1/2.m4s SIDX.earliestPresTime=49152(3,84s), timescale=12800
SIDX.dur=       12800,        12288,        12288,        11776, total=49152 (3,84s)
mdat.dur=24*512=12288, 24*512=12288, 24*512=12288, 24*512=12288, total=49152 (3,84s)

https://refapp.hbbtv.org/videos/dashtest/test1/cenc/manifest.mpd
https://test.playready.microsoft.com/service/rightsmanager.asmx?cfg=(kid:header,sl:2000,persist:false,contentkey:EjQSNBI0EjQSNBI0EjQSNw==)
cenc/v1/1.m4s SIDX.earliestPresTime=0, timescale=12800
SIDX.dur=       11776,        12288,        12288,        11776, total=48128 (3,76s) !!!
mdat.dur=24*512=12288, 24*512=12288, 24*512=12288, 24*512=12288, total=49152 (3,84s) !!!

cenc/v1/2.m4s SIDX.earliestPresTime=48128(3,76s)
SIDX.dur=       12800,        12288,        12288,        11776, total=49152 (3,84s)
mdat.dur=24*512=12288, 24*512=12288, 24*512=12288, 24*512=12288, total=49152 (3,84s)
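The totals and per-fragment differences in the listings above can be recomputed with a small sketch. The values are copied from the first segment of each listing; the `compare` helper is only illustrative.

```python
TIMESCALE = 12800  # video timescale reported above

# name: (SIDX durations per fragment, sample-duration sums per fragment)
segments = {
    "v1/1.m4s": ([12800, 12288, 12288, 11776], [24 * 512] * 4),
    "cenc/v1/1.m4s": ([11776, 12288, 12288, 11776], [24 * 512] * 4),
}

def compare(sidx, mdat):
    """Return both totals and the per-fragment SIDX minus mdat differences."""
    return sum(sidx), sum(mdat), [s - m for s, m in zip(sidx, mdat)]

for name, (sidx, mdat) in segments.items():
    sidx_total, mdat_total, diffs = compare(sidx, mdat)
    print(name, sidx_total, mdat_total, diffs,
          f"{sidx_total / TIMESCALE:.2f}s vs {mdat_total / TIMESCALE:.2f}s")
```

In the unencrypted segment the per-fragment differences cancel out, so the totals agree. In the encrypted one they do not, and the SIDX total falls 1024 ticks short.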

1x fragment, without DRM and with DRM

https://refapp.hbbtv.org/videos/dashtest/test3/manifest.mpd
v1/1.m4s, SIDX.earliestPresTime=0, timescale=12800
SIDX.dur=       49152 (3,84s)
mdat.dur=96*512=49152 (3,84s)
v1/2.m4s, SIDX.earliestPresTime=49152
SIDX.dur=       49152 (3,84s)
mdat.dur=96*512=49152 (3,84s)

**1x fragment, drm, test3/cenc/manifest.mpd**
https://refapp.hbbtv.org/videos/dashtest/test3/cenc/manifest.mpd
https://test.playready.microsoft.com/service/rightsmanager.asmx?cfg=(kid:header,sl:2000,persist:false,contentkey:EjQSNBI0EjQSNBI0EjQSNw==)
cenc/v1/1.m4s, SIDX.earliestPresTime=0, timescale=12800
SIDX.dur=       48128 (3,76s)  !!!
mdat.dur=96*512=49152 (3,84s) !!!
cenc/v1/2.m4s, SIDX.earliestPresTime=48128(3,76s)
SIDX.dur=       49152 (3,84s)
mdat.dur=96*512=49152 (3,84s)

Intermediary temp-xx.mp4 input files:
https://refapp.hbbtv.org/videos/dashtest/test1/temp/temp-v1.mp4
https://refapp.hbbtv.org/videos/dashtest/test1/temp/temp-a1.mp4
https://refapp.hbbtv.org/videos/dashtest/test3/temp/temp-v1.mp4
https://refapp.hbbtv.org/videos/dashtest/test3/temp/temp-a1.mp4

Dash command lines

4x fragments
MP4Box -dash 184320 -frag  46080 -dash-scale 48000 -mem-frags -rap -profile dashavc264:live \
  -profile-ext urn:hbbtv:dash:profile:isoff-live:2012 -min-buffer 2000 \
  -bs-switching no -sample-groups-traf -single-traf --tfdt64 --tfdt_traf --noroll=yes --btrt=no \
  --truns_first=yes --cmaf=cmf2 -subsegs-per-sidx 0 \
  -segment-name "$RepresentationID$/$Number$$Init=i$" \
  -out manifest.mpd:dual \
  "temp/temp-v1.mp4#trackID=1:id=v1:period=p0:asID=1:role=main:dur=24:#HLSPL=manifest_v1.m3u8" \
  "temp/temp-a1.mp4#trackID=1:id=a1:period=p0:asID=21:role=main:dur=24:#HLSPL=manifest_a1.m3u8:#HLSGroup=audio"

1x fragment
MP4Box -dash 184320 -frag 184320 -dash-scale 48000 -mem-frags -rap -profile dashavc264:live \
  -profile-ext urn:hbbtv:dash:profile:isoff-live:2012 -min-buffer 2000 \
  -bs-switching no -sample-groups-traf -single-traf --tfdt64 --tfdt_traf --noroll=yes --btrt=no \
  --truns_first=yes --cmaf=cmf2 -subsegs-per-sidx 0 \
  -segment-name "$RepresentationID$/$Number$$Init=i$" \
  -out manifest.mpd:dual \
  "temp/temp-v1.mp4#trackID=1:id=v1:period=p0:asID=1:role=main:dur=24:#HLSPL=manifest_v1.m3u8" \
  "temp/temp-a1.mp4#trackID=1:id=a1:period=p0:asID=21:role=main:dur=24:#HLSPL=manifest_a1.m3u8:#HLSGroup=audio"
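For readers decoding the numbers in these commands, here is a small sketch of how the `-dash` and `-frag` values map to seconds, assuming both are expressed in units of `-dash-scale` (48000 here). The helper name is made up for illustration.

```python
from fractions import Fraction

DASH_SCALE = 48000       # -dash-scale from the commands above
VIDEO_TIMESCALE = 12800  # video timescale reported in the listings

def to_seconds(ticks, scale=DASH_SCALE):
    """Convert a tick count to an exact duration in seconds."""
    return Fraction(ticks, scale)

segment = to_seconds(184320)    # -dash 184320
fragment = to_seconds(46080)    # -frag 46080 (the 4x fragments command)

fragments_per_segment = segment / fragment
video_ticks_per_fragment = fragment * VIDEO_TIMESCALE

print(float(segment), float(fragment))
print(fragments_per_segment, video_ticks_per_fragment)
```

The per-fragment tick count equals 24*512, the mdat.dur value shown for each fragment. The 1x command uses `-frag 184320`, so one fragment spans the whole segment.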

Encryption
MP4Box -crypt drm-cenc.xml temp/temp-v1.mp4 -out temp/temp-v1-cenc.mp4
MP4Box -crypt drm-cenc.xml temp/temp-a1.mp4 -out temp/temp-a1-cenc.mp4
- then run the 4x and 1x fragment MP4Box dash commands with the temp-XX-cenc.mp4 files as input

drm-cenc.xml

<?xml version="1.0" encoding="UTF-8" ?>
<GPACDRM type="CENC AES-CTR">
<!-- 
  kid=0x43215678123412341234123412341237
  key=0x12341234123412341234123412341237
  iv=0x22ee7d4745d3a26a
--> 

<!-- Playready -->
<DRMInfo type="pssh" version="0">
  <BS ID128="9A04F07998404286AB92E65BE0885F95"/>
  <BS bits="32" endian="little" value="518"/>
  <BS bits="16" endian="little" value="1"/>
  <BS bits="16" endian="little" value="1"/>
  <BS bits="16" endian="little" value="508"/>
  <BS data64="PABXAFIATQBIAEUAQQBEAEUAUgAgAHgAbQBsAG4AcwA9ACIAaAB0AHQAcAA6AC8ALwBzAGMAaABlAG0AYQBzAC4AbQBpAGMAcgBvAHMAbwBmAHQALgBjAG8AbQAvAEQAUgBNAC8AMgAwADAANwAvADAAMwAvAFAAbABhAHkAUgBlAGEAZAB5AEgAZQBhAGQAZQByACIAIAB2AGUAcgBzAGkAbwBuAD0AIgA0AC4AMAAuADAALgAwACIAPgA8AEQAQQBUAEEAPgA8AFAAUgBPAFQARQBDAFQASQBOAEYATwA+ADwASwBFAFkATABFAE4APgAxADYAPAAvAEsARQBZAEwARQBOAD4APABBAEwARwBJAEQAPgBBAEUAUwBDAFQAUgA8AC8AQQBMAEcASQBEAD4APAAvAFAAUgBPAFQARQBDAFQASQBOAEYATwA+ADwASwBJAEQAPgBlAEYAWQBoAFEAegBRAFMATgBCAEkAUwBOAEIASQAwAEUAagBRAFMATgB3AD0APQA8AC8ASwBJAEQAPgA8AEMASABFAEMASwBTAFUATQA+AHIAbQBpAFAAYwB6AHYAQwBXAHQAMAA9ADwALwBDAEgARQBDAEsAUwBVAE0APgA8AC8ARABBAFQAQQA+ADwALwBXAFIATQBIAEUAQQBEAEUAUgA+AA=="/>
</DRMInfo>

<!-- Widevine -->
<DRMInfo type="pssh" version="0">
  <BS ID128="EDEF8BA979D64ACEA3C827DCD51D21ED"/>
  <BS data="0x08011210"/>
  <BS ID128="43215678123412341234123412341237"/>
</DRMInfo>

<!-- Marlin -->
<DRMInfo type="pssh" version="0">
  <BS ID128="69f908af481646ea910ccd5dcccb0a3a"/>
  <BS data="0x000000186d61726c000000106d6b69640000000000000000"/>
</DRMInfo>

<!-- CENC -->
<DRMInfo type="pssh" version="1">
  <BS ID128="1077efecc0b24d02ace33c1e52e2fb4b"/>
  <BS bits="32" value="1"/>
  <BS ID128="43215678123412341234123412341237"/>
</DRMInfo>

<CrypTrack trackID="1" IsEncrypted="1" IV_size="8" first_IV="0x22ee7d4745d3a26a" saiSavedBox="senc"   >
  <key KID="0x43215678123412341234123412341237" value="0x12341234123412341234123412341237"/>
</CrypTrack>
</GPACDRM>

PS: ticket #2377 is loosely related. It was also about timing, and the SIDX.earliestPresentationTime value was fixed there.

Murmur commented 1 year ago

It seems --cmaf=cmfc writes 3.84s segments both with and without DRM, and the files have the correct total SIDX.duration of 3.84s. It still happens that a fragment's mdat.dur=24*512 does not match the duration field of the corresponding SIDX entry, but I am probably missing some small detail(?).

Could the timing be related to negctts handling?
cmf2: trun.version=1, negctts (negative composition offsets allowed)
cmfc: trun.version=0, no negctts used

MP4Box -dash 184320 -frag  46080 -dash-scale 48000 -mem-frags -rap -profile dashavc264:live \
  -profile-ext urn:hbbtv:dash:profile:isoff-live:2012 -min-buffer 2000 \
  -bs-switching no -sample-groups-traf -single-traf --tfdt64 --tfdt_traf --noroll=yes --btrt=no \
  --truns_first=yes --cmaf=cmfc -subsegs-per-sidx 0 \
  -segment-name "$RepresentationID$/$Number$$Init=i$" \
  -out manifest.mpd:dual \
  "temp/temp-v1.mp4#trackID=1:id=v1:period=p0:asID=1:role=main:dur=24:#HLSPL=manifest_v1.m3u8" \
  "temp/temp-a1.mp4#trackID=1:id=a1:period=p0:asID=21:role=main:dur=24:#HLSPL=manifest_a1.m3u8:#HLSGroup=audio"

What is the actual reason to choose the cmfc profile over cmf2, or the other way round?

Can I control trun.version=1 with a command-line flag such as --negctts=yes or similar, so that I could run without the CMAF profiles and see how things go in various use cases?

Murmur commented 1 year ago

I think I have found a partial fix: the :negctts flag needs to be specified in the crypt command. Dashing can then use the cmf2 or cmfc profile and include the SIDX box in a segment file. The total sidx duration and sidx.earliestPresentationTime look good in the 1..n.m4s segment files.

The sidx.duration of a fragment still does not match the mdat.duration of that fragment.

MP4Box -crypt drm-cenc.xml temp/temp-v1.mp4 -out temp/temp-v1-cenc.mp4:negctts

MP4Box -dash 184320 -frag 46080 -dash-scale 48000 -mem-frags -rap -profile dashavc264:live \
  -profile-ext urn:hbbtv:dash:profile:isoff-live:2012 -min-buffer 2000 \
  -bs-switching no -sample-groups-traf -single-traf --tfdt64 --tfdt_traf --btrt=no \
  --truns_first=yes --cmaf=cmf2 -subsegs-per-sidx 0 \
  -segment-name "$RepresentationID$/$Number$$Init=i$" \
  -out manifest.mpd \
  "temp/temp-v1-cenc.mp4#trackID=1:id=v1:period=p0:asID=1:role=main" \
  "temp/temp-a1-cenc.mp4#trackID=1:id=a1:period=p0:asID=21:role=main"

**NEW cenc segments, :negctts flag used in the crypt command (ok)**
- the total sidx duration of 1.m4s and 2.m4s is correct
- the sidx.duration of a fragment still does not match the mdat.duration of that fragment !!!
cenc/v1/1.m4s SIDX.earliestPresTime=0, timescale=12800, trun.version=1
SIDX.dur=       12800,        12288,        12288,        11776, total=49152 (3,84s)  !!!
mdat.dur=24*512=12288, 24*512=12288, 24*512=12288, 24*512=12288, total=49152 (3,84s)  !!!
cenc/v1/2.m4s SIDX.earliestPresTime=49152(3,84s)  !!!
SIDX.dur=       12800,        12288,        12288,        11776, total=49152 (3,84s)
mdat.dur=24*512=12288, 24*512=12288, 24*512=12288, 24*512=12288, total=49152 (3,84s)

**OLD cenc segments, no :negctts flag in the crypt command (incorrect)**
- the 1.m4s SIDX.dur is wrong, and the wrong value carries over to the 2.m4s sidx.earliestPresentationTime field.
cenc/v1/1.m4s SIDX.earliestPresTime=0, timescale=12800, trun.version=1
SIDX.dur=       11776,        12288,        12288,        11776, total=48128 (3,76s) !!!
mdat.dur=24*512=12288, 24*512=12288, 24*512=12288, 24*512=12288, total=49152 (3,84s) !!!
cenc/v1/2.m4s SIDX.earliestPresTime=48128(3,76s) !!!
SIDX.dur=       12800,        12288,        12288,        11776, total=49152 (3,84s)
mdat.dur=24*512=12288, 24*512=12288, 24*512=12288, 24*512=12288, total=49152 (3,84s)

jeanlf commented 1 year ago

We indeed had a bug when moving from ctts v0 to v1: the source delay was not ignored when computing the sidx. This is now fixed, thanks for the report.
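One way to read the numbers reported earlier in the thread is sketched below. It assumes the shortfall in the first SIDX entry equals a two-frame source delay counted as the first fragment's earliest cts. That delay value is inferred from the listings (12800 versus 11776), not confirmed here, and the function is an illustration rather than GPAC's code.

```python
FRAME_TICKS = 512        # sample duration at timescale 12800, from the listings
SEGMENT_TICKS = 49152    # expected segment duration (3.84 s), from the listings

def first_entry_duration(next_min_cts, first_min_cts):
    """SIDX-style duration: distance between two earliest presentation times."""
    return next_min_cts - first_min_cts

# With the delay ignored, the first fragment starts at cts 0.
correct = first_entry_duration(12800, 0)

# Assumption: a 2-frame delay wrongly counted as the first fragment's start.
assumed_delay = 2 * FRAME_TICKS
buggy = first_entry_duration(12800, assumed_delay)

# The shortened first entry also shortens the segment total, which then
# becomes the next segment's earliestPresentationTime.
next_earliest_pres_time = SEGMENT_TICKS - assumed_delay

print(correct, buggy, next_earliest_pres_time)
```

Under this assumption the sketch reproduces both the 11776 first entry and the 48128 start time of the second encrypted segment shown in the listings.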

Regarding fragment durations, this is the expected behaviour: mdat.duration is the sum of the durations of all samples in the fragment, but sidx.duration is NOT. It is the difference between the min cts in the next sidx entry (or in the next segment) and the min cts in the fragment. Depending on the GOP structure, this can change the fragment duration.
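The two calculations can be compared on a toy example. This is a minimal sketch: the 8-sample decode order with reordered frames is hypothetical and is not the content from this issue, and the function names are made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class Sample:
    dts: int       # decode time, in timescale ticks
    cts: int       # composition (presentation) time, in timescale ticks
    duration: int  # sample duration, in timescale ticks

def mdat_duration(fragment):
    """Sum of sample durations in one fragment."""
    return sum(s.duration for s in fragment)

def sidx_durations(fragments, next_segment_min_cts):
    """Per fragment: min cts of the following fragment (or of the next
    segment for the last one) minus min cts of this fragment."""
    starts = [min(s.cts for s in frag) for frag in fragments]
    starts.append(next_segment_min_cts)
    return [later - earlier for earlier, later in zip(starts, starts[1:])]

FRAME = 512
# Hypothetical GOP: display index of each sample, listed in decode order.
display_order = [0, 2, 1, 4, 3, 6, 5, 7]
samples = [Sample(dts=i * FRAME, cts=d * FRAME, duration=FRAME)
           for i, d in enumerate(display_order)]
fragments = [samples[:4], samples[4:]]  # two fragments of four samples each

print([mdat_duration(f) for f in fragments])
print(sidx_durations(fragments, next_segment_min_cts=8 * FRAME))
```

In this toy case the per-fragment values differ between the two methods while the segment totals are equal, which is the same pattern as in the unencrypted listings of this issue.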

Murmur commented 1 year ago

Thanks, all looking good. I was not aware that sidx.dur and mdat.dur do not follow the same calculation. Using fps=25, segdur=3.84s, gop=1.92s, frags=2 with AAC at 48 kHz writes two video fragments (dur=24576, scale=12800, both sidx rap=1), and sidx.dur and mdat.dur match in a segment file.
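The fragment duration reported in this last comment can be checked with a short sketch, using only the values quoted there (25 fps, 1.92s GOP, timescale 12800, two fragments per segment).

```python
from fractions import Fraction

TIMESCALE = 12800
FPS = 25
GOP_SECONDS = Fraction("1.92")  # exact decimal, avoids float rounding
FRAGMENTS_PER_SEGMENT = 2

ticks_per_frame = TIMESCALE // FPS         # sample duration in ticks
frames_per_gop = GOP_SECONDS * FPS         # frames in one GOP
fragment_ticks = frames_per_gop * ticks_per_frame
segment_seconds = Fraction(fragment_ticks * FRAGMENTS_PER_SEGMENT, TIMESCALE)

print(ticks_per_frame, frames_per_gop, fragment_ticks, float(segment_seconds))
```

Each fragment here covers exactly one GOP and starts on a random access point, which fits the observation that sidx.dur and mdat.dur agree for this layout.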