w3c / media-capabilities

Media Capabilities API
https://w3c.github.io/media-capabilities/

DolbyVision HDR Metadata #136

Open chcunningham opened 4 years ago

chcunningham commented 4 years ago

This is a fork of the comments at the bottom of #118.

@rdoherty0 wrote:

...it may be important to note that Dolby Vision is a superset of SMPTE 2094-10, particularly when it comes to OTT video distribution. See https://www.dolby.com/us/en/technologies/dolby-vision/dolby-vision-profiles-levels_v1.3.2.pdf

I believe this is why the vendor strings were chosen for Android: https://developer.android.com/reference/android/view/Display.HdrCapabilities.html

@jernoble replied:

@rdoherty0, could you clarify: I don’t see any reference to SMPTE 2094-10 in that document, only SMPTE 2086.

When you say “superset”, do you mean that the bitstream carries multiple metadata formats at the same time? Or that the bitstream is capable of carrying one out of a defined set of metadata formats? The “BL signal cross-compatibility ID” section seems to indicate the latter.

@rdoherty0 replied:

There is a lot to unpack here, unfortunately. Your second statement is closer to the truth: there is one complete metadata set per stream. There is more documentation from Dolby here which documents the inclusion of Dolby Vision streams into various formats (DASH, for example): https://www.dolby.com/us/en/technologies/dolby-vision/dolby-vision-for-creative-professionals.html#5

The 2094-10 metadata is used in several standards-based efforts, including ATSC and DVB, and is specified in the DASH-IF IOP spec. But most Dolby Vision profiles extend this metadata, including the composing metadata specified in the ETSI specification (https://www.etsi.org/deliver/etsi_gs/CCM/001_099/001/01.01.01_60/gs_CCM001v010101p.pdf), which does reference SMPTE 2094-10.

Most online distribution is using Dolby Vision profiles 5 or 8.1.

I would suggest that none of this complexity needs to be exposed at this API layer; the simple existence bit as proposed is OK. But it would not be accurate to label the Dolby Vision "family" of HDR metadata as SMPTE 2094-10.

I bolded the last sentence. I think the "simple existence bit" refers to Screen.video.hdrSupported? I'm not sure. We have a separate attribute to query supported metadata types, with "smpteSt2094-10" being one of the values. Do we need a separate string to reflect the extensions Dolby added to this, or is 2094-10 good enough?

Related questions: are DV's extensions backward compatible with 2094-10? If there are devices out there that support only the 2094-10 subset, should sites still pick Dolby Vision over other metadata formats that might be fully supported?

rdoherty0 commented 4 years ago

I bolded the last sentence. I think the "simple existence bit" refers to Screen.video.hdrSupported? I'm not sure. We have a separate attribute to query supported metadata types, with "smpteSt2094-10" being one of the values. Do we need a separate string to reflect the extensions Dolby added to this, or is 2094-10 good enough?

I think the precise point I'm trying to make is that "2094-10" is not good enough. Many large existing streaming services provide Dolby Vision content that is not the same as 2094-10, and I think they would rightly be concerned that setting this bit for such content would be neither correct nor interoperable.

Related questions: are DV's extensions backward compatible with 2094-10? If there are devices out there that support only the 2094-10 subset, should sites still pick Dolby Vision over other metadata formats that might be fully supported?

Yes to all these questions: some profiles of Dolby Vision are cross-compatible with 2094-10, and some are not. I do not think there are any devices today that support only 2094-10, but it's certainly permitted and increasingly likely moving forward given the specifications in ATSC and DVB. All Dolby Vision devices should play 2094-10 content, since Dolby Vision is a superset.

vi-dot-cpp commented 4 years ago

@chcunningham:

Do we need a separate string to reflect the extensions Dolby added to this, or is 2094-10 good enough?

To @rdoherty0's point, SMPTE 2094-10 does not sufficiently represent Dolby Vision. However, Dolby Vision has registered mime types that can instead be used to query support. Thus, we don't need a new HdrMetadata string.

dva1 | AVC-based Dolby Vision derived from avc1 | Video | Dolby Vision
dvav | AVC-based Dolby Vision derived from avc3 | Video | Dolby Vision
dvh1 | HEVC-based Dolby Vision derived from hvc1 | Video | Dolby Vision
dvhe | HEVC-based Dolby Vision derived from hev1 | Video | Dolby Vision

are DV's extensions backward compatible with 2094-10?

No, Dolby Vision is not always backward compatible with 2094-10, but the opposite is true: Dolby Vision devices should play 2094-10 content.

should sites still pick Dolby Vision over other metadata formats that might be fully supported?

Sites should use the registered Dolby Vision mime types to query support instead of 2094-10 metadata.

vi-dot-cpp commented 4 years ago

@rdoherty0 unless you object, I will close this issue at the end of the week. Thanks everyone.

rdoherty0 commented 4 years ago

The mime types are helpful for client identification and completely workable for identified Dolby Vision streams, but are not sufficient in cases of cross-compatible DV streams. Cross-compatible streams (notably, profile 8.1 in use today) allow for a mime type that is consistent with the underlying compression codec, such as 'hev1'. This is to allow compliant HDR10 implementations to play back these compatible streams, and they are therefore not identified by the mime type. In all other scenarios the mime type can be used.
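To illustrate the limitation described above, here is a minimal TypeScript sketch (not taken from any specification) that classifies a codec string by its leading sample-entry code, using the four Dolby Vision entries quoted earlier in this thread. The function name `baseCodecIfLabelledDolbyVision` and the example HEVC string `hev1.2.4.L120.B0` are illustrative assumptions.

```typescript
// Dolby Vision sample-entry codes quoted in this thread, mapped to the
// base codec entry each one is derived from.
const DV_SAMPLE_ENTRIES = new Map<string, string>([
  ["dva1", "avc1"],
  ["dvav", "avc3"],
  ["dvh1", "hvc1"],
  ["dvhe", "hev1"],
]);

// Hypothetical helper: returns the base codec when the codec string is
// explicitly labelled as Dolby Vision, or undefined when it is not.
function baseCodecIfLabelledDolbyVision(codecString: string): string | undefined {
  // The sample-entry code is everything before the first dot.
  const fourCC = codecString.trim().split(".")[0] ?? "";
  return DV_SAMPLE_ENTRIES.get(fourCC);
}

// An explicitly labelled Dolby Vision stream is recognized.
console.log(baseCodecIfLabelledDolbyVision("dvhe.08.01"));
// A stream signaled with a plain HEVC codec string is not, even if it
// happens to carry cross-compatible Dolby Vision metadata.
console.log(baseCodecIfLabelledDolbyVision("hev1.2.4.L120.B0"));
```

The second call returns `undefined`, which is the gap in question: a cross-compatible stream delivered under `hev1` looks like ordinary HEVC to any check based on the codec string alone.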

vi-dot-cpp commented 4 years ago

Does "cross-compatible" refer to 1) HDR10-DV compatibility or 2) various codec compatibility?

rdoherty0 commented 4 years ago

In practice today, it is typically HDR10-DV compatibility. But it could be others moving forward, including SDR compatibility.

vi-dot-cpp commented 4 years ago

Please correct where I am wrong, but here is my understanding on how sites may query for profile 8.1 for DV and HDR10, respectively:

if DV 8.1: query mimetype "dvhe.08.01"

else if HDR10 8.1: query HdrMetadataType "smpteSt2086" + relevant ColorGamut, TransferFunction

This presumes that Dolby Vision is a superset of SMPTE 2094-10, but the HDR10 standard is adequately defined, even for profile 8.1.

rdoherty0 commented 4 years ago

There are corner cases where you wish to specify the codec as normal hevc (e.g. 'hev1') so that a non-DV client can correctly understand the stream, and so it would be better to have an alternate additional signaling convention (such as 'DV' on the hdr capabilities). But those cases are currently not widely used, so we can close this if you wish for expediency and perhaps reexamine in the future.

kdcloudy commented 3 years ago

Dolby Vision is a very complex envelope format but all different filetypes are governed by its MIME string.

if DV 8.1: query mimetype "dvhe.08.01"

else if HDR10 8.1: query HdrMetadataType "smpteSt2086" + relevant ColorGamut, TransferFunction

This is partially correct but Dolby Vision 8.1 isn't "dvhe.08.01". It's a common misconception but upon reading Dolby's documentation you can find out that 01 is a Level ID.

dvhe.08.01 basically means a 720p at 24fps Profile 8 file with a maximum supported bitrate 20Mbps, those are Level 01's specifications.

We can use the isTypeSupported method for Dolby Vision MIME type strings and, if it returns true, regex can be used to identify resolution, bitrate and framerate from the Level ID, and instantiate a new VideoConfiguration object from there.

Right now only Safari on macOS 10.15 or later will support Dolby Vision decoding. I guess sites (Netflix) right now just use NavigatorUserAgent to determine whether to stream Dolby Vision content or not.

rgalv-Dolby commented 3 years ago

Dolby Vision is a very complex envelope format but all different filetypes are governed by its MIME string.

if DV 8.1: query mimetype "dvhe.08.01"

else if HDR10 8.1: query HdrMetadataType "smpteSt2086" + relevant ColorGamut, TransferFunction

This is partially correct but Dolby Vision 8.1 isn't "dvhe.08.01". It's a common misconception but upon reading Dolby's documentation you can find out that 01 is a Level ID.

dvhe.08.01 basically means a 720p at 24fps Profile 8 file with a maximum supported bitrate 20Mbps, those are Level 01's specifications.

@kdcloudy is correct: the codec string used in a MIME type, or in the codec tag for HLS and DASH, uses a dot notation where the fourCC is followed by the profile and then by the level. You can find details about our profiles and levels here: https://dolby.force.com/professionalsupport/s/article/What-is-Dolby-Vision-Profile. The confusion occurs when we refer to the different cross-compatibility IDs that are part of certain profiles. Profile 8 can be cross-compatible with HDR10, SDR or HLG, so we refer to the different versions as profile 8.x; this can also be seen in the PDF I linked. The cross-compatibility ID is not signaled in the MIME type.
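The dot notation described above can be sketched as a small parser. This is an illustration only: the function and type names are hypothetical, and it assumes that the profile and level fields are each exactly two decimal digits, as in the `dvhe.08.01` example from this thread.

```typescript
// Hypothetical shape for the three fields of a Dolby Vision codec string.
interface DvCodecParts {
  fourCC: string;  // sample-entry code, e.g. "dvhe"
  profile: number; // bitstream profile, e.g. 8
  level: number;   // level ID, e.g. 1 (this is not the "x" in "profile 8.x")
}

const DV_FOURCCS = ["dva1", "dvav", "dvh1", "dvhe"];

// Splits "<fourCC>.<profile>.<level>" into its parts, or returns null
// when the string does not have that form.
function parseDolbyVisionCodec(codecString: string): DvCodecParts | null {
  const parts = codecString.trim().split(".");
  if (parts.length !== 3) return null;
  const [fourCC, profileText, levelText] = parts as [string, string, string];
  if (DV_FOURCCS.indexOf(fourCC) === -1) return null;
  // Assumption: both numeric fields are exactly two decimal digits.
  const twoDigits = /^\d{2}$/;
  if (!twoDigits.test(profileText) || !twoDigits.test(levelText)) return null;
  return {
    fourCC,
    profile: parseInt(profileText, 10),
    level: parseInt(levelText, 10),
  };
}

// "dvhe.08.01" parses as profile 8, level 1.
console.log(parseDolbyVisionCodec("dvhe.08.01"));
```

Note that nothing in the parsed result says whether the stream is profile 8.1, 8.2 or 8.4, since the cross-compatibility ID is not part of the string.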

We can use the isTypeSupported method for Dolby Vision MIME type strings and, if it returns true, regex can be used to identify resolution, bitrate and framerate from the Level ID, and instantiate a new VideoConfiguration object from there.

I would recommend that you use the Media Capabilities API to determine whether Dolby Vision can be played back. I have seen isTypeSupported return false positives for some profiles of Dolby Vision on Safari.
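A rough sketch of that recommendation, with an HDR10 fallback along the lines discussed earlier in the thread, might look like the following. It is written against a small local interface (`DecodingInfoProvider`, a hypothetical name) that mirrors the shape of `decodingInfo()` so that it can run outside a browser. The resolution, frame rate and bitrate are the level 01 figures quoted above, and `rec2020` with `pq` are assumed here as the "relevant" color gamut and transfer function for HDR10.

```typescript
// Local stand-ins for the Media Capabilities types used by this sketch.
interface VideoConfigSketch {
  contentType: string;
  width: number;
  height: number;
  bitrate: number;
  framerate: number;
  hdrMetadataType?: string;
  colorGamut?: string;
  transferFunction?: string;
}

interface DecodingInfoProvider {
  decodingInfo(config: { type: string; video: VideoConfigSketch }): Promise<{ supported: boolean }>;
}

type HdrChoice = "dolby-vision" | "hdr10" | "none";

// Base configuration: 720p at 24 fps, 20 Mbps (level 01 figures from the thread).
function baseVideoConfig(codecs: string): VideoConfigSketch {
  return {
    contentType: `video/mp4; codecs="${codecs}"`,
    width: 1280,
    height: 720,
    bitrate: 20000000,
    framerate: 24,
  };
}

// HDR10 configuration: static SMPTE ST 2086 metadata; gamut and transfer
// function are assumptions made for this sketch.
function hdr10VideoConfig(codecs: string): VideoConfigSketch {
  return {
    ...baseVideoConfig(codecs),
    hdrMetadataType: "smpteSt2086",
    colorGamut: "rec2020",
    transferFunction: "pq",
  };
}

// Asks for Dolby Vision first and falls back to HDR10, then to "none".
async function chooseHdrVariant(
  provider: DecodingInfoProvider,
  dvCodecs: string,
  hevcCodecs: string,
): Promise<HdrChoice> {
  const dv = await provider.decodingInfo({ type: "media-source", video: baseVideoConfig(dvCodecs) });
  if (dv.supported) return "dolby-vision";
  const hdr10 = await provider.decodingInfo({ type: "media-source", video: hdr10VideoConfig(hevcCodecs) });
  return hdr10.supported ? "hdr10" : "none";
}
```

In a browser, `navigator.mediaCapabilities` would be the natural object to pass as `provider` (possibly with a type cast). Whether a given browser actually understands the HDR fields is outside what this sketch can guarantee.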

Right now only Safari on macOS 10.15 or later will support Dolby Vision decoding. I guess sites (Netflix) right now just use NavigatorUserAgent to determine whether to stream Dolby Vision content or not.

Safari and Microsoft's Edge browser can support Dolby Vision on appropriate combinations of OS and hardware.