VoidXH closed this issue 1 year ago
@ValZapod, may I ask for your help here? For Edge of Tomorrow, FFmpeg's output actually has extra silence (about 4500 samples of it) prepended to the audio. Where is this coming from?
The entire track is delayed by about 3 AC-3 frames. This is an E-AC-3 + JOC track. Copying it to a plain E-AC-3 track only removed the 1536 samples of priming; about 3000 samples of silence of unknown origin still remain.
Yes, EAE.EXE uses 768 samples as priming.
I was talking about 3 frames (4608 samples), not just 3 blocks (768 samples), so that's not relevant here. When Matroska has no additional codec delay, the default priming of 1536 samples is used. That is also the case here: the 4608 samples of delay were reduced to 3072 after a track copy, but this silence is still present.
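The frame and block arithmetic above can be checked with a short sketch. It assumes the 48 kHz sample rate reported for this track and the usual 6 blocks of 256 samples per E-AC-3 frame; the constant and function names are only illustrative.

```python
SAMPLE_RATE = 48_000        # Hz, as reported for the audio stream in this thread
SAMPLES_PER_BLOCK = 256     # one E-AC-3 audio block
BLOCKS_PER_FRAME = 6        # assumption: the usual 6-block frame
SAMPLES_PER_FRAME = SAMPLES_PER_BLOCK * BLOCKS_PER_FRAME  # 1536


def samples_to_ms(samples: int, rate: int = SAMPLE_RATE) -> float:
    """Convert a sample count to milliseconds at the given sample rate."""
    return samples * 1000.0 / rate


three_blocks = 3 * SAMPLES_PER_BLOCK            # the 768-sample figure
observed_delay = 3 * SAMPLES_PER_FRAME          # the 4608-sample delay
default_priming = SAMPLES_PER_FRAME             # the 1536-sample default priming
unexplained = observed_delay - default_priming  # what is left after the track copy

print("3 blocks:", three_blocks, "samples")
print("3 frames:", observed_delay, "samples,", samples_to_ms(observed_delay), "ms")
print("unexplained:", unexplained, "samples,", samples_to_ms(unexplained), "ms")
```

At 48 kHz the remaining 3072 samples correspond to 64 ms, which is large enough to be audible as a sync offset.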
There may be an editlist (or equivalent for mkv) for video too.
Why is it present in the audio after extracting it? This silence even had to be encoded into the track. Does FFmpeg do this if you export the audio of a track with video delay using -c copy?
Removed? You mean added back into decoded sound.
No, removed. There were 3 frames of delay, now there are only 2.
It really looks like a video delay; that track is way longer. Thanks.
Stream #0:0: Video: hevc (Main 10), yuv420p10le(tv, bt2020nc/bt2020/smpte2084), 3840x1600 [SAR 1:1 DAR 12:5], 23.98 fps, 23.98 tbr, 1k tbn (default)
  Metadata:
    BPS : 8018868
    DURATION : 01:53:27.801000000
    NUMBER_OF_FRAMES: 163224
    NUMBER_OF_BYTES : 6823857865
    _STATISTICS_WRITING_APP: mkvmerge v70.0.0 ('Caught A Lite Sneeze') 64-bit
    _STATISTICS_WRITING_DATE_UTC: 2022-09-05 20:42:02
    _STATISTICS_TAGS: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
Stream #0:1(eng): Audio: eac3, 48000 Hz, 5.1(side), fltp, 576 kb/s (default)
  Metadata:
    title : English DDP Atmos 5.1
    BPS : 576000
    DURATION : 01:53:26.720000000
    NUMBER_OF_FRAMES: 212710
    NUMBER_OF_BYTES : 490083840
    _STATISTICS_WRITING_APP: mkvmerge v70.0.0 ('Caught A Lite Sneeze') 64-bit
    _STATISTICS_WRITING_DATE_UTC: 2022-09-05 20:42:02
    _STATISTICS_TAGS: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
EAC3 frame is 256 samples.
That's a block. Dolby and FFmpeg both call the 6-block superblock (1536 samples) a frame, and the specification uses the same terminology.
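The audio stream statistics posted above are consistent with 1536 samples per frame. The following sketch derives the frame size from the DURATION, NUMBER_OF_FRAMES, and NUMBER_OF_BYTES tags; it assumes the mkvmerge DURATION tag covers exactly the listed frames, and the helper name is illustrative.

```python
from fractions import Fraction

# Values copied from the audio stream metadata in this thread.
SAMPLE_RATE = 48_000
NUMBER_OF_FRAMES = 212_710
NUMBER_OF_BYTES = 490_083_840
DURATION = "01:53:26.720000000"


def duration_to_seconds(text: str) -> Fraction:
    """Parse an HH:MM:SS.fraction tag into an exact number of seconds."""
    hours, minutes, seconds = text.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + Fraction(seconds)


total_samples = duration_to_seconds(DURATION) * SAMPLE_RATE
samples_per_frame = total_samples / NUMBER_OF_FRAMES
bytes_per_frame = Fraction(NUMBER_OF_BYTES, NUMBER_OF_FRAMES)
bitrate = bytes_per_frame * 8 * SAMPLE_RATE / samples_per_frame

print("samples per frame:", samples_per_frame)
print("bytes per frame:", bytes_per_frame)
print("bitrate (bit/s):", bitrate)
```

The result is 1536 samples (6 blocks of 256) per frame, and the derived bitrate matches the BPS tag of 576000.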
@ValZapod For all content I could find, PTS = DTS = 0, so timestamps started at 0. Checking the EBML, this is correct: the frames had no offset. What EBML entry should I check for time alignment, and where else could these extra 3000 samples of silence come from?
This isn't just affecting that movie. While many are fine, Cocaine Bear, for example, has 512 samples of added delay instead of 3072. If there's no other way to offset a file, I might have to close this issue and just work on the muxer.
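One way to repeat the timestamp check described above is to ask ffprobe for each stream's start_time and express the differences in samples. This is only a sketch: the function names are hypothetical, the 48 kHz default is taken from the track in this thread, and a container-level start time will not reveal silence that is encoded into the bitstream itself.

```python
import json
import subprocess


def parse_start_times(ffprobe_json: str) -> dict:
    """Map (stream index, codec type) to start time in seconds."""
    streams = json.loads(ffprobe_json).get("streams", [])
    result = {}
    for stream in streams:
        start = stream.get("start_time")
        if start in (None, "N/A"):
            continue  # some streams report no start time
        key = (stream["index"], stream.get("codec_type", "unknown"))
        result[key] = float(start)
    return result


def probe_start_times(path: str) -> dict:
    """Run ffprobe on a file and return the per-stream start times."""
    cmd = [
        "ffprobe", "-v", "error",
        "-show_entries", "stream=index,codec_type,start_time",
        "-of", "json", path,
    ]
    output = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_start_times(output)


def offsets_in_samples(start_times: dict, sample_rate: int = 48_000) -> dict:
    """Express each start time relative to the earliest stream, in samples."""
    if not start_times:
        return {}
    base = min(start_times.values())
    return {key: round((value - base) * sample_rate) for key, value in start_times.items()}
```

Calling offsets_in_samples(probe_start_times(path)) on a file would show whether any stream starts later than the others at the container level.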
MKV audio tracks have a CodecDelay field, which might contain the optional E-AC-3 priming that's rarely present in content. This will fix audio sync in these cases.
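A minimal sketch of how a CodecDelay value could be turned into a number of samples to skip. Matroska stores CodecDelay in nanoseconds; falling back to 1536 samples when the field is absent follows the default priming mentioned earlier in this thread and is an assumption here, as are the function names.

```python
from typing import Optional

NANOSECONDS_PER_SECOND = 1_000_000_000
DEFAULT_PRIMING_SAMPLES = 1536  # assumption: default priming quoted in this thread


def codec_delay_to_samples(codec_delay_ns: int, sample_rate: int) -> int:
    """Convert a Matroska CodecDelay (nanoseconds) to samples, rounded to nearest."""
    scaled = codec_delay_ns * sample_rate
    return (scaled + NANOSECONDS_PER_SECOND // 2) // NANOSECONDS_PER_SECOND


def samples_to_skip(codec_delay_ns: Optional[int], sample_rate: int = 48_000) -> int:
    """Samples to drop from the start of the decoded audio track."""
    if codec_delay_ns is None:
        # The field is absent from the track entry: use the default priming.
        return DEFAULT_PRIMING_SAMPLES
    return codec_delay_to_samples(codec_delay_ns, sample_rate)


print(samples_to_skip(None))        # no CodecDelay present
print(samples_to_skip(16_000_000))  # a 16 ms CodecDelay at 48 kHz
```

Integer arithmetic is used for the conversion so that delays that are exact multiples of a sample period do not pick up floating-point rounding errors.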