w3c / media-source

Media Source Extensions
https://w3c.github.io/media-source/

Fix #180, include vp09... codecs parameter string #182

Closed wolenetz closed 6 years ago

wolenetz commented 7 years ago

@jdsmith3000 Please review. Bulk of text in the update comes from @chcunningham, who also has an associated web-platform-tests update pending this spec change, and Chromium is implementing the updated isTypeSupported() and addSourceBuffer() support to include the vp09... codec parameter string format.

@plehegar Please comment on the correct process for updating MSE byte stream specs. (This PR, IIUC, only updates the github.io editor's draft; the TR Note is not updated by this. Should we also update the Note?) Also, note that respec regeneration automatically updated SOTD text, including especially updating the governing document from Sept 2015 to Mar 2017 -- is this ok?

@paulbrucecotton, @mwatson2 FYI

To assist review, 2 w3c diffs:
1) This PR versus the currently published Note: http://services.w3.org/htmldiff?doc1=http%3A%2F%2Fwww.w3.org%2FTR%2Fmse-byte-stream-format-webm%2F&doc2=http%3A%2F%2Frawgit.com%2Fwolenetz%2Fmedia-source%2Ffix_180%2Fwebm-byte-stream-format.html
2) This PR versus the (older than the Note) editor's draft: http://services.w3.org/htmldiff?doc1=https%3A%2F%2Fw3c.github.io%2Fmedia-source%2Fwebm-byte-stream-format.html&doc2=http%3A%2F%2Frawgit.com%2Fwolenetz%2Fmedia-source%2Ffix_180%2Fwebm-byte-stream-format.html

wolenetz commented 7 years ago

Also, @frankgalligan and @foolip FYI

jdsmith3000 commented 7 years ago

I believe it's been common for implementations to ignore information in the codec string beyond the first decimal IF it was unrecognized. The decimal encodings of video capabilities for VP09... assume strict responses on all requirements. That can be accomplished going forward, but existing UA behavior will take time to update.

jdsmith3000 commented 7 years ago

On second thought, this is contained clearly by the "VP09" primary codec name. A response to it cannot be "supported" unless the details of the string are interpreted and supported as well.
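
To illustrate the difference between the two behaviors discussed here, a minimal TypeScript sketch follows. It is not taken from any user agent: `lenientSupport`, `strictVp09Support`, and `profileOnly` are hypothetical names, and the validator is deliberately simplistic (it only checks that the first field is one of VP9's four profiles, 00 to 03).

```typescript
// Hypothetical validator type: receives the dot-separated fields that
// follow the codec name and says whether all of them are supported.
type FieldValidator = (fields: string[]) => boolean;

// Lenient behavior: only the text before the first "." is examined,
// so any unrecognized detail after it is silently ignored.
function lenientSupport(codec: string, knownNames: string[]): boolean {
  const name = codec.split(".")[0];
  return knownNames.indexOf(name) !== -1;
}

// Strict behavior: a "vp09" name is only "supported" when the
// remaining fields are interpreted and accepted as well.
function strictVp09Support(codec: string, supportsFields: FieldValidator): boolean {
  const parts = codec.split(".");
  return parts[0] === "vp09" && supportsFields(parts.slice(1));
}

// Illustrative validator (an assumption, not real UA logic):
// VP9 defines profiles 0 through 3, written as two-digit decimals.
const profileOnly: FieldValidator = (fields) =>
  ["00", "01", "02", "03"].indexOf(fields[0]) !== -1;
```

With these definitions, `lenientSupport("vp09.99.10.08", ["vp09"])` answers yes even though profile 99 does not exist, while `strictVp09Support("vp09.99.10.08", profileOnly)` answers no.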

plehegar commented 7 years ago

re process, it's a simple update for the Note. We can trigger that at will as long as Paul is ok with it.

wolenetz commented 7 years ago

@jdsmith3000 (https://github.com/w3c/media-source/pull/182#issuecomment-305918353), IIUC, there is no objection from you, right? Please mark this PR approved too (unless I misunderstand and you have a concern still.) Thanks!

jdsmith3000 commented 7 years ago

@wolenetz I do have questions about whether it is wise to express format attributes in decimal fields of the codec string. This is the first I've heard of this approach. What do you know about its background?

wolenetz commented 7 years ago

@jdsmith3000, @chcunningham can comment further. Also, from the referenced document (http://www.webmproject.org/vp9/mp4/#codecs-parameter-string), Kilroy Hughes (@ MSFT) may also be able to assist resolving your concerns.

chcunningham commented 7 years ago

@FrankGalligan, @tinkskip for more authoritative answer

I do have questions about whether it is wise to express format attributes in decimal fields of the codec string.

I think this trend starts with AVC, which used just a single decimal to prefix its profile and level info (e.g. avc3.42C01E). HEVC's codec strings add more fields and more decimals (e.g. hvc1.1.6.L93.B0).
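
For readers unfamiliar with the AVC form, here is a rough TypeScript sketch of how the hex suffix in `avc3.42C01E` breaks down. The three-byte layout (profile_idc, constraint flags, level_idc) follows the usual RFC 6381 convention; the function and interface names are made up for illustration.

```typescript
// Decoded view of an AVC codecs value such as "avc3.42C01E".
interface AvcCodecInfo {
  sampleEntry: string;     // "avc1" or "avc3"
  profileIdc: number;      // first byte, e.g. 0x42 = 66 (Baseline)
  constraintFlags: number; // second byte, the constraint_set flags
  levelIdc: number;        // third byte, e.g. 0x1E = 30 (level 3.0)
}

// Hypothetical helper: returns null when the string is not in the
// "avcN.PPCCLL" form (six hex digits after a single dot).
function parseAvcCodec(codec: string): AvcCodecInfo | null {
  const match = /^(avc[13])\.([0-9A-Fa-f]{6})$/.exec(codec);
  if (match === null) {
    return null;
  }
  const hex = match[2];
  return {
    sampleEntry: match[1],
    profileIdc: parseInt(hex.slice(0, 2), 16),
    constraintFlags: parseInt(hex.slice(2, 4), 16),
    levelIdc: parseInt(hex.slice(4, 6), 16),
  };
}
```

Everything after the single dot is one opaque hex blob here, whereas the HEVC example above already splits into four dot-separated fields after `hvc1`.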

If you're more concerned about the value/meaning of these extra fields, I know Chrome is already finding these valuable to make the is-supported decision. For instance, Chrome will reject support for wide gamut EOTFs and color spaces under certain conditions (missing/experimental support).

jdsmith3000 commented 7 years ago

I'm thinking ahead to how this approach will integrate with the WICG effort to expose media capabilities directly (Media Capabilities API). The VP09 approach can work with today's capabilities queries, though, and perhaps the two aren't exclusive. Should the Media Capabilities API express the same attributes, might we then transition to it?

chcunningham commented 7 years ago

Agree, the two approaches are not exclusive. There isn't a plan at this point to transition MediaSource.isTypeSupported -> MediaCapabilities. Will probably consider this more as the incubation progresses, but I would expect the two APIs to exist side-by-side for some time.

MediaCapabilities will require users to use this new "vp09..." string, because the other string is too ambiguous about profile information to give a firm capabilities answer.

Not all codec strings will provide these details about EOTF and color (e.g. HEVC's string), so Media Capabilities is likely to have some separate mechanism for optionally querying these things. For vp09, Media Capabilities users will have the option of just specifying via the codec string. If they want to additionally set the EOTF and color information in the separate fields, that's fine too, as long as they're consistent with what's in the string.

The mechanism for querying these things with Media Capabilities is still being decided. Can't say for sure what color info will be available for querying (some context here and here).

paulbrucecotton commented 7 years ago

@wolenetz and @plh: I have no problem with republishing this WG Note once this PR has been merged.

paulbrucecotton commented 7 years ago

Kilroy Hughes (@ MSFT) may also be able to assist resolving your concerns.

As HME WG Chair, I have obtained permission to post the following authored by Kilroy Hughes:

"Yes, I can try to help.

For instance, capability discovery needs to be improved for WAVE, and the VPx “codecs” string provides a much better solution with available canPlayType APIs than those specified for AVC, HEVC, etc.

But, additional information such as user preferences (language, accessibility, etc.), and configuration (display color volume? , surround amp/speak array?) need to be exposed to HTML5 apps in a standard way.

CMAF Media Profiles were designed to reduce hundreds of pages of encoder/decoder specs to a single 4CC code point sufficient to determine media interop in an adaptive streaming environment conforming to the CMAF object model, and players conforming to the CMAF hypothetical application model. Since it is protocol agnostic, the same content works for DASH, HLS, ATSC3, MBMS, hybrid cable, etc. delivery.

When the goal is to encode one set of content that will play on billions of devices, the encoding constraints have to be pre-determined; not re-encoded based on the low level capabilities each device reports. But selection from the available content alternatives, e.g. 5.1 channel or stereo or 3D audio, 2K or 4K video resolution, AVC or HEVC or ? codec, SDR or HDR video characteristics … requires basic device capability and user preference info available to the player app."

In addition Kilroy added:

"I support the parameter/value string format Google is using.

It addresses the application requirement I described below from groups specifying OTT video interop using HTML5 presentation apps.

In contrast, AVC and HEVC quote bitfields in the elementary stream that weren’t designed for this purpose, are huge compared to the small number of enumerations that are valid (i.e. a short list of tiers, profiles, and levels), they don’t separate each parameter, and omit several necessary parameters for interop determination, such as bit-depth, spatial subsampling, color space, transfer function, etc. The approach of hex encoding part of the bitstream started with MPEG-4 video a decade ago has been carried to an illogical extreme with AVC and HEVC, with no participation by the codec designers to optimize parameter storage in the elementary stream for the purpose of quoting it in a “codecs” string. As a result, every query comes back “maybe I can play this, but you haven’t given me enough information to know”.

In various places, we are discussing how to start over with an intentional design for the codecs string for ISOBMFF files. VPx started with a clean slate, and is a possible prototype for a redesign."

/paulc

jdsmith3000 commented 7 years ago

Two aspects of this solution seem less than ideal:

How have these aspects been addressed in VP09?

chcunningham commented 7 years ago

The optional parameters must be added all or none to the query....

I agree on the pitfalls of relative position. While there are designs that might alleviate this, I think in practice it's not likely to cause much pain. The defaults are really there to shorten the string for basic uses. Sophisticated users who need to supply some non-default value will likely not feel burdened by filling in other parts of the string. Also, "all or none" is overstating it somewhat. Users do need to fill in any field to the left of a specified value, but fields that follow may be left defaulted if the default values match what they aim to describe.
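
A small TypeScript sketch of the positional rule described in this comment may help. It is only an illustration under stated assumptions: the field order and the default values (01 for chroma subsampling, colour primaries, transfer characteristics and matrix coefficients, 00 for the full-range flag) are my reading of the referenced webmproject document and should be checked against it, and `parseVp09` is a hypothetical helper, not an API from any browser. It implements the "trailing fields may be omitted" reading given here, not a strict all-or-none rule.

```typescript
interface Vp09Params {
  profile: number;
  level: number;
  bitDepth: number;
  chromaSubsampling: number;
  colourPrimaries: number;
  transferCharacteristics: number;
  matrixCoefficients: number;
  videoFullRangeFlag: number;
}

// Hypothetical parser for strings such as "vp09.02.10.10.01.09".
// Returns null when the string does not follow the positional format.
function parseVp09(codec: string): Vp09Params | null {
  const parts = codec.split(".");
  // "vp09" plus 3 mandatory fields, followed by at most 5 optional ones.
  if (parts[0] !== "vp09" || parts.length < 4 || parts.length > 9) {
    return null;
  }
  const values: number[] = [];
  for (const field of parts.slice(1)) {
    // Every field is a two-digit decimal. An empty field is rejected,
    // so a value cannot be skipped to reach one further to the right.
    if (!/^\d{2}$/.test(field)) {
      return null;
    }
    values.push(parseInt(field, 10));
  }
  // Trailing optional fields that were left off take the assumed defaults.
  const [
    profile,
    level,
    bitDepth,
    chromaSubsampling = 1,
    colourPrimaries = 1,
    transferCharacteristics = 1,
    matrixCoefficients = 1,
    videoFullRangeFlag = 0,
  ] = values;
  return {
    profile,
    level,
    bitDepth,
    chromaSubsampling,
    colourPrimaries,
    transferCharacteristics,
    matrixCoefficients,
    videoFullRangeFlag,
  };
}
```

Under this sketch, `parseVp09("vp09.02.10.10.01.09")` has to spell out chroma subsampling in order to reach colour primaries, but the three fields after it fall back to the defaults, while `parseVp09("vp09.02.10.10..09")` is rejected because a field to the left was left empty.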

Having said all that, feel free to submit feedback on the string to its designers. For my part, I think this string is a big step forward for improving the accuracy of MediaSource.isTypeSupported. Knowing the profile especially is a huge boost.

It doesn't seem tied to whether the format can be adequately displayed...

I think this part is up to the UA. The proposed spec text should leave room for the user agent to reject things it thinks it can't display. Happy to amend the text if you think that isn't clear. Browser support for HDR and WCG features is pretty new. Chrome's support for these is currently behind a runtime flag, and Chrome will currently reject strings with color properties and EOTFs that would require this flag to be enabled to look nice. This doesn't have to be strictly tied to the display, however. For instance, tone-mapping of WCG content to standard colors (for non-WCG monitors) is being explored. If tone-mapped WCG content can be made to look better than equivalent standard color content, Chrome could consider signalling WCG support. Again, all very experimental at this point, but I think the UA should decide to support whatever it thinks is best for its users.
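
As a rough illustration of the kind of UA-side policy described above, the TypeScript sketch below rejects wide-gamut or HDR colour signalling unless a flag is on. It does not reflect Chrome's actual code: `acceptsColor` and the `hdrEnabled` parameter are hypothetical stand-ins for whatever the UA uses internally. The numeric code points are the standard CICP values (9 for BT.2020 primaries, 16 for SMPTE ST 2084 "PQ", 18 for ARIB STD-B67 "HLG").

```typescript
// Colour-related fields as they might come out of a parsed vp09 string.
interface ColorFields {
  colourPrimaries: number;
  transferCharacteristics: number;
}

// CICP code points treated as "needs WCG/HDR support" in this sketch.
const WIDE_GAMUT_PRIMARIES = [9];  // BT.2020
const HDR_TRANSFERS = [16, 18];    // PQ (ST 2084), HLG (ARIB STD-B67)

// Hypothetical policy: accept standard colour always, and accept
// wide-gamut or HDR signalling only when the UA has HDR enabled.
function acceptsColor(fields: ColorFields, hdrEnabled: boolean): boolean {
  const needsHdr =
    WIDE_GAMUT_PRIMARIES.indexOf(fields.colourPrimaries) !== -1 ||
    HDR_TRANSFERS.indexOf(fields.transferCharacteristics) !== -1;
  return hdrEnabled || !needsHdr;
}
```

The decision sits entirely in the UA: a tone-mapping implementation could pass `true` for `hdrEnabled` even on a non-WCG display if it judges the result good enough.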

wolenetz commented 7 years ago

@jdsmith3000 @chcunningham is there an alternative approach that makes more sense than this? It seems the proposed approach has received approval from Kilroy (and the SOTD / document process updates have been approved by @plehegar). I'd like to get this change out of limbo soon :)

chcunningham commented 7 years ago

Are we in limbo? I hope my last response addresses @jdsmith3000's comments to his satisfaction :).

wolenetz commented 7 years ago

@jdsmith3000 do you have any remaining concerns over landing this change?

chcunningham commented 6 years ago

@jdsmith3000 friendly ping

jdsmith3000 commented 6 years ago

@chcunningham. Apologies, but I've been focused on other work. I want to solicit a few opinions here and then will reply. Please allow me a few days...

chcunningham commented 6 years ago

@jdsmith3000 friendly ping.

wolenetz commented 6 years ago

@jdsmith3000 Hi there :). Is there anything remaining that blocks this bytestream spec change from landing? We'd really like to get this off of our plate. Thanks in advance!

chcunningham commented 6 years ago

@jdsmith3000, the Media Capabilities API is launching in blink soon and, as mentioned above, requires usage of this new string. It would be great to get this pull landed to document compatibility and spread awareness of the new string format.

wolenetz commented 6 years ago

@jdsmith3000 @plehegar @paulbrucecotton @chcunningham - Without further objecting responses, it seems to me that this issue shouldn't remain in a blocked state. Can you assist with helping me understand if this PR is something that could be landed at this point? Note, the Media Capabilities API has been proceeding with this understanding of the vp09.... format string for a while now. It would be unfortunate if lack of documentation of the same format for MSE isTypeSupported led to developer confusion. Perhaps we could establish a timeline for any objections to be raised? What do you think? Thanks!

wolenetz commented 6 years ago

@mwatson2 Mark, can you assist by reviewing this PR please? I understand that Netflix was part of the effort resulting in the referenced document (https://www.webmproject.org/vp9/mp4/#codecs-parameter-string).

If you wish, we could also add you accordingly to mseContributors in an updated version of this PR.

wolenetz commented 6 years ago

Update: I've queried new contacts at MSFT (Angelo and John) today to take a look at this issue.

wolenetz commented 6 years ago

Note that I might need to resubmit a clone of this PR for review as I move my fork of w3c/media-source from w3c to wicg (which itself is a fork of w3c/media-source): unfortunately, GitHub doesn't allow me to have two simultaneous forks that share a common root repo. I'll probably do this over the next week; hopefully this PR can be approved by MSFT and we can merge it before then.

wolenetz commented 6 years ago

Since this logic is already shipping in Chrome and there has been no recent objection and no recent responses to multiple pings to both old and new MSFT contacts, I plan to land the PR this week.

wolenetz commented 6 years ago

@plehegar @paulbrucecotton It looks like I need to update the webm bytestream spec editor's draft date/time since https://github.com/w3c/media-source/commit/b1ea68d32db22048c6c7267762228e5c5b45734c was so old and I forgot to do that before merging it. Once I do that, I'll request the Note be updated so the new bytestream format spec is available via the .../TR/... URL.