Open flavioribeiro opened 10 years ago
I did some tests with my friend @bernardocamilo.
We used Apple's simple stream for our tests:
$ curl -s https://devimages.apple.com.edgekey.net/streaming/examples/bipbop_4x3/gear0/prog_index.m3u8 | head
#EXTM3U
#EXT-X-TARGETDURATION:11
#EXT-X-VERSION:3
#EXT-X-MEDIA-SEQUENCE:0
#EXT-X-PLAYLIST-TYPE:VOD
#EXTINF:9.98458,
fileSequence0.aac
#EXTINF:9.98458,
fileSequence1.aac
#EXTINF:9.98459,
We can see the ID3 tag is present on the first audio segment:
$ curl -s https://devimages.apple.com.edgekey.net/streaming/examples/bipbop_4x3/gear0/fileSequence0.aac | hexdump -C | head
00000000 49 44 33 04 00 00 00 00 00 3f 50 52 49 56 00 00 |ID3......?PRIV..|
00000010 00 35 00 00 63 6f 6d 2e 61 70 70 6c 65 2e 73 74 |.5..com.apple.st|
00000020 72 65 61 6d 69 6e 67 2e 74 72 61 6e 73 70 6f 72 |reaming.transpor|
00000030 74 53 74 72 65 61 6d 54 69 6d 65 73 74 61 6d 70 |tStreamTimestamp|
00000040 00 00 00 00 00 00 0d 99 f4 ff f1 5c 80 01 bf fc |...........\....|
00000050 21 20 03 40 68 1c ff f1 5c 80 16 1f fc 21 4e e7 |! .@h...\....!N.|
00000060 3f 07 e2 ed 72 05 28 32 82 40 ff e2 3f 3c b5 fc |?...r.(2.@..?<..|
00000070 fe 2e 69 fd f9 ef fa 7c ff ff be 2f 44 71 2f 40 |..i....|.../Dq/@|
00000080 2e 24 08 05 c4 81 00 b8 90 20 17 12 04 02 6c 6d |.$....... ....lm|
00000090 8b dd 5e bb f9 2f e2 c1 b3 ef e0 e9 25 a5 12 55 |..^../......%..U|
According to the specification, the header is the first 10 bytes: 49 44 33 04 00 00 00 00 00 3f
49 44 33 -> ID3v2/file identifier "ID3"
04 00 -> ID3v2 version $04 00
00 -> ID3v2 flags %abcd0000
00 00 00 3f -> ID3v2 size int 63
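The header breakdown above can be reproduced in a few lines. This is only a sketch in Python: the helper names `syncsafe_to_int` and `parse_id3v2_header` are made up for illustration and do not come from any library or from this module.

```python
import struct

def syncsafe_to_int(raw: bytes) -> int:
    """Decode an ID3v2 syncsafe integer (7 significant bits per byte)."""
    value = 0
    for byte in raw:
        value = (value << 7) | (byte & 0x7F)
    return value

def parse_id3v2_header(data: bytes) -> dict:
    """Parse the 10-byte ID3v2 header found at the start of `data`."""
    if len(data) < 10:
        raise ValueError("an ID3v2 header is 10 bytes long")
    # 3-byte identifier, major version, revision, flags, 4-byte syncsafe size
    magic, major, revision, flags, raw_size = struct.unpack(">3sBBB4s", data[:10])
    if magic != b"ID3":
        raise ValueError("missing 'ID3' file identifier")
    return {
        "version": (major, revision),
        "flags": flags,
        "size": syncsafe_to_int(raw_size),  # tag size, excluding the header itself
    }

# The first 10 bytes of fileSequence0.aac, copied from the hexdump above.
header = bytes.fromhex("49 44 33 04 00 00 00 00 00 3f")
print(parse_id3v2_header(header))
```

For the bytes shown in the dump this reports version 4.0, no flags, and a size of 63.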
The next 63 bytes are the ID3v2 frame:
00000000 50 52 49 56 00 00 | PRIV..|
00000010 00 35 00 00 63 6f 6d 2e 61 70 70 6c 65 2e 73 74 |.5..com.apple.st|
00000020 72 65 61 6d 69 6e 67 2e 74 72 61 6e 73 70 6f 72 |reaming.transpor|
00000030 74 53 74 72 65 61 6d 54 69 6d 65 73 74 61 6d 70 |tStreamTimestamp|
00000040 00 00 00 00 00 00 0d 99 f4 |......... |
Frame's header is:
50 52 49 56 -> Frame ID $xx xx xx xx (four characters) "PRIV"
00 00 00 35 -> Size 4 * %0xxxxxxx int 53
00 00 -> Flags $xx xx
According to the frames specification, the PRIV frame format is:
<Header for 'Private frame', ID: "PRIV">
Owner identifier <text string> $00
The private data <binary data>
So the owner is com.apple.streaming.transportStreamTimestamp and the data is 00 00 00 00 00 0d 99 f4.
This is exactly what the HLS draft by Pantos (draft-pantos-http-live-streaming) says:
Each Elementary Audio Stream segment MUST signal the timestamp of its
first sample with an ID3 PRIV tag [ID3] at the beginning of the
segment. The ID3 PRIV owner identifier MUST be
"com.apple.streaming.transportStreamTimestamp". The ID3 payload MUST
be a 33-bit MPEG-2 Program Elementary Stream timestamp expressed as a
big-endian eight-octet number, with the upper 31 bits set to zero.
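The frame layout and the owner/payload split described above can be checked with a short parser. This is a rough Python sketch, not the module's code: `parse_priv_frame` is a hypothetical helper, and it assumes an ID3v2.4 frame whose size field is syncsafe, as in the dump.

```python
OWNER = "com.apple.streaming.transportStreamTimestamp"

def parse_priv_frame(frame: bytes) -> tuple:
    """Split an ID3v2.4 PRIV frame into (owner identifier, private data)."""
    if frame[0:4] != b"PRIV":
        raise ValueError("not a PRIV frame")
    # The frame size is a 4-byte syncsafe integer (7 bits per byte).
    size = 0
    for byte in frame[4:8]:
        size = (size << 7) | (byte & 0x7F)
    # frame[8:10] holds the two flag bytes; they are zero in the sample stream.
    body = frame[10:10 + size]
    owner, separator, data = body.partition(b"\x00")
    if not separator:
        raise ValueError("owner identifier is not NUL-terminated")
    return owner.decode("latin-1"), data

# The 63-byte frame from fileSequence0.aac, rebuilt from the hexdump.
frame = (
    b"PRIV"
    + bytes.fromhex("00 00 00 35")              # size: 53
    + bytes.fromhex("00 00")                    # flags
    + OWNER.encode("latin-1") + b"\x00"         # owner identifier + terminator
    + bytes.fromhex("00 00 00 00 00 0d 99 f4")  # eight-octet timestamp
)

owner, data = parse_priv_frame(frame)
print(owner, data.hex())
```

The 53-byte body is the 44-character owner string, one NUL terminator, and the eight-octet timestamp.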
The timestamps of the first four files are:
fileSequence0.aac: 00 0d 99 f4 -> 891380
fileSequence1.aac: 00 1b 50 28 -> 1789992
fileSequence2.aac: 00 29 06 5c -> 2688604
fileSequence3.aac: 00 36 bc 91 -> 3587217
The MPEG-2 timestamp unit is 1/90000 of a second. The delta between files 1 and 0 is 898612 ticks, or 9.98458 s (898612/90000), which is exactly the duration of the first segment.
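The arithmetic can be double-checked with a small script. This is a sketch in Python under one assumption: the list above shows only the low four bytes of each payload, so the upper four bytes are taken to be zero, as they are for fileSequence0.aac. The name `payload_to_ticks` is illustrative.

```python
PTS_CLOCK_HZ = 90_000  # MPEG-2 timestamps count 1/90000 s ticks

def payload_to_ticks(payload: bytes) -> int:
    """Read the 33-bit timestamp from the eight-octet big-endian PRIV payload."""
    if len(payload) != 8:
        raise ValueError("the PRIV payload must be exactly 8 bytes")
    return int.from_bytes(payload, "big") & ((1 << 33) - 1)

# Low four bytes as listed above; upper four bytes assumed to be zero.
low_bytes = ["000d99f4", "001b5028", "0029065c", "0036bc91"]
ticks = [payload_to_ticks(bytes.fromhex("00000000" + h)) for h in low_bytes]

# Duration of each segment = difference between consecutive start timestamps.
durations = [(b - a) / PTS_CLOCK_HZ for a, b in zip(ticks, ticks[1:])]

for name, t in zip(range(len(ticks)), ticks):
    print(f"fileSequence{name}.aac starts at {t} ticks ({t / PTS_CLOCK_HZ:.5f} s)")
print([round(d, 5) for d in durations])
```

Rounded to five decimals, the three deltas come out as 9.98458, 9.98458 and 9.98459, matching the first three #EXTINF values in the playlist.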
The ID3 tag is not present in .ts files, so this module should generate it. That shouldn't be difficult: only the timestamp varies, and it should be possible to read it from the video file.
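To illustrate how little varies between segments, here is a minimal Python sketch that builds the whole 73-byte tag from a timestamp. It is not the module's implementation, and `int_to_syncsafe` and `build_timestamp_tag` are hypothetical names; the byte layout is taken directly from the dump above.

```python
OWNER = b"com.apple.streaming.transportStreamTimestamp"

def int_to_syncsafe(value: int) -> bytes:
    """Encode an integer as a 4-byte ID3v2 syncsafe integer."""
    if not 0 <= value < (1 << 28):
        raise ValueError("value does not fit in 28 bits")
    return bytes((value >> shift) & 0x7F for shift in (21, 14, 7, 0))

def build_timestamp_tag(pts: int) -> bytes:
    """Build an ID3v2.4 tag holding one PRIV frame with a 33-bit timestamp."""
    if not 0 <= pts < (1 << 33):
        raise ValueError("timestamp must fit in 33 bits")
    # Frame body: owner identifier, NUL terminator, eight-octet big-endian value.
    body = OWNER + b"\x00" + pts.to_bytes(8, "big")
    # Frame: ID, syncsafe size, two zero flag bytes, body.
    frame = b"PRIV" + int_to_syncsafe(len(body)) + b"\x00\x00" + body
    # Tag header: "ID3", version 4.0, zero flags, syncsafe size of what follows.
    header = b"ID3" + b"\x04\x00" + b"\x00" + int_to_syncsafe(len(frame))
    return header + frame

tag = build_timestamp_tag(891380)  # timestamp of fileSequence0.aac
print(len(tag), tag[:10].hex(" "))
```

With the timestamp 891380 this should reproduce the first 73 bytes of fileSequence0.aac byte for byte; only the last eight bytes change from one segment to the next.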
Hi @flavioribeiro. I finally implemented it in Lua: https://github.com/jbochi/lua_jit_extract_audio/commit/565bd164c6123f3bd9384877a3af07e9d7ef245d
It should be easy to backport it to C now :)
great! :+1: thank you @jbochi
As per the HLS spec: Each Elementary Audio Stream segment MUST signal the timestamp of its first sample with an ID3 PRIV tag at the beginning of the segment. The ID3 PRIV owner identifier MUST be "com.apple.streaming.transportStreamTimestamp".
https://ffmpeg.org/doxygen/trunk/structplaylist.html#ae271f3bc6020caa11e50ae29f9a10966
https://ffmpeg.org/doxygen/trunk/hls_8c_source.html#l01476