ietf-wg-cellar / matroska-specification

Matroska specification.
http://ietf-wg-cellar.github.io/matroska-specification

Tags: specify tags for ITU-R BS.1770 programme loudness (LUFS) and true-peak level (dBTP) #831

Open bb010g opened 4 months ago

bb010g commented 4 months ago

The Matroska Media Container Tag Specifications should specify tags for programme loudness and programme true-peak level as specified by Rec. ITU-R BS.1770 and as used by EBU R 128. The programme loudness tag's value should be recorded in LUFS (called LKFS by the ITU), and the programme true-peak level tag's value should be recorded in dBFS (BS.1770 writes true-peak readings as dBTP, which is dB relative to full scale as measured by a true-peak meter). I propose the tag names ITU_BS_1770_LUFS and ITU_BS_1770_DBFS for these tags. These tags could appear at any TargetType level (track, album, etc.).
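To make the proposal concrete, here is a rough sketch of how the two tags might be written at both track and album level. It builds an XML tag file in the layout MKVToolNix uses for tag import (`Tags` / `Tag` / `Targets` / `Simple`). The helper names, the example measurements, and the one-decimal value formatting are my assumptions; the proposal itself does not fix a value format.

```python
import xml.etree.ElementTree as ET

# Matroska TargetTypeValue numbers: 30 = TRACK/SONG level, 50 = ALBUM level.
TRACK, ALBUM = 30, 50


def loudness_tag(target_type_value, lufs, dbfs):
    """Build one <Tag> element carrying the two proposed loudness tags."""
    tag = ET.Element("Tag")
    targets = ET.SubElement(tag, "Targets")
    ET.SubElement(targets, "TargetTypeValue").text = str(target_type_value)
    for name, value in (("ITU_BS_1770_LUFS", lufs), ("ITU_BS_1770_DBFS", dbfs)):
        simple = ET.SubElement(tag, "Simple")
        ET.SubElement(simple, "Name").text = name
        # Assumption: plain decimal string with one fractional digit.
        ET.SubElement(simple, "String").text = f"{value:.1f}"
    return tag


def build_tags(track_values, album_values):
    """Return a <Tags> root holding one track-level and one album-level tag."""
    root = ET.Element("Tags")
    root.append(loudness_tag(TRACK, *track_values))
    root.append(loudness_tag(ALBUM, *album_values))
    return root


# Made-up measurements: (programme loudness, true-peak level).
root = build_tags(track_values=(-9.8, -0.3), album_values=(-11.2, -0.1))
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

The same measurement pair is simply repeated per TargetType level, which is why one tag name per quantity should be enough.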

Currently, the Tags specification defines two tags for loudness information, REPLAYGAIN_GAIN and REPLAYGAIN_PEAK. These tags are based on the ReplayGain specification. However, ITU BS.1770 and EBU R 128 are increasingly being used in production & playback of both audio & video, so being able to encode that metadata in Matroska containers without a custom tag would be nice.

Note that I chose an absolute loudness in LUFS, rather than a gain in dB or a loudness in LU relative to some target, because target loudness levels (in LUFS) can be chosen arbitrarily. For example, EBU R 128 targets -23.0 LUFS, Qobuz targets -18 LUFS, Apple Music targets -16 LUFS, and YouTube targets -14 LUFS. REPLAYGAIN_GAIN is specified to always target 89 dB SPL, but we have an opportunity to be less opinionated with ITU_BS_1770_LUFS. By being clear that this is not a relative loudness, we can hopefully avoid people changing tag values when their target changes. (Audio players can support user-specified target values for ReplayGain that aren't 89 dB SPL, but specifying an objective LUFS value makes it clearer to implementers that target loudness should be controllable by the user.) Similarly, ITU_BS_1770_DBFS is an objective measurement that an audio player can easily use to meet an arbitrary dBFS target.
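As a sketch of what a player would do with the two stored values: the playback gain is just the chosen target minus the stored loudness, optionally capped so the stored peak does not exceed a peak ceiling. The function name, parameter names, and the example numbers are mine, not part of the proposal.

```python
def playback_gain_db(measured_lufs, target_lufs,
                     peak_dbfs=None, peak_ceiling_dbfs=0.0):
    """Gain in dB that moves a programme from its stored loudness to a target.

    measured_lufs:     value of the proposed ITU_BS_1770_LUFS tag
    target_lufs:       target picked by the player or the user
    peak_dbfs:         value of the proposed ITU_BS_1770_DBFS tag, if present
    peak_ceiling_dbfs: highest peak allowed after gain (assumed default: 0.0)
    """
    gain = target_lufs - measured_lufs
    if peak_dbfs is not None:
        # Never apply more gain than the headroom below the ceiling allows.
        gain = min(gain, peak_ceiling_dbfs - peak_dbfs)
    return gain


def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude factor."""
    return 10 ** (gain_db / 20)


# The same stored loudness serves any target without rewriting the tag.
for target in (-23.0, -18.0, -16.0, -14.0):
    print(target, round(playback_gain_db(-9.8, target), 1))

# A quiet programme with little headroom: the peak ceiling limits the gain.
limited = playback_gain_db(-20.0, -14.0, peak_dbfs=-2.0, peak_ceiling_dbfs=-1.0)
```

Because the tag holds the measurement and not the gain, changing the target only changes `target_lufs` at playback time.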

If it's desired to have a BS.1770-based equivalent for ReplayGain Track/Album Range tags (not appearing in this specification, but somewhat commonly used), then I propose the tag EBU_TECH_3342_LRA which would record the (programme) loudness range in LU as specified by EBU Tech 3342 (PDF). I'd like to choose a more objective measurement, but calculating reasonable loudness ranges requires decisions about sliding analysis-window length and loudness thresholds.
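To show where those decisions enter, here is a minimal sketch of the loudness range calculation as I understand it from EBU Tech 3342: short-term loudness values are gated at an absolute threshold, then at a threshold relative to the mean of the surviving values, and the range is the spread between a low and a high percentile. The gate values (-70 LUFS, -20 LU) and percentiles (10th, 95th) are the ones I believe the document uses; the function name and the nearest-rank percentile rule are my assumptions, and the input is assumed to be precomputed short-term loudness values (measuring those from audio is not shown).

```python
import math


def loudness_range_lu(short_term_lufs, abs_gate=-70.0, rel_gate=-20.0,
                      low_pct=10, high_pct=95):
    """Loudness range in LU from a sequence of short-term loudness values."""
    # Absolute gate: drop near-silent blocks.
    abs_gated = [l for l in short_term_lufs if l >= abs_gate]
    if not abs_gated:
        return 0.0
    # Relative gate: threshold sits rel_gate LU below the power-domain mean.
    mean_power = sum(10 ** (l / 10) for l in abs_gated) / len(abs_gated)
    threshold = 10 * math.log10(mean_power) + rel_gate
    gated = sorted(l for l in abs_gated if l >= threshold)

    def percentile(p):
        # Nearest-rank style lookup (an assumption, see lead-in).
        return gated[round((len(gated) - 1) * p / 100)]

    return percentile(high_pct) - percentile(low_pct)


# Made-up short-term values: a ramp from -30 to -10 LUFS, one very quiet
# block (-60, removed by the relative gate) and one below the absolute gate.
values = [float(v) for v in range(-30, -9)] + [-60.0, -80.0]
lra = loudness_range_lu(values)
```

Every default argument here is one of the choices that makes the result less objective than a plain LUFS or dBFS measurement.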