w3ctag / design-principles

A small-but-growing set of design principles collected by the TAG while reviewing specifications
https://w3ctag.github.io/design-principles

Considerations and precautions when adding new media formats #171

Closed cynthia closed 3 years ago

cynthia commented 4 years ago

It occurred to us that adding a new format would most likely need consideration in a bunch of places on the platform, and we didn't have any good guidelines on this when we were reviewing https://github.com/w3ctag/design-reviews/issues/495

Here is a brief list (there are probably more)

annevk commented 4 years ago

(FWIW, these do not apply to AVIF as far as I know.)

domenic commented 4 years ago

There's a separate-but-related question of "new ways of fetching". E.g. AVIF is a new format through an existing way of fetching (<img>), but <script type=module> is a new way of fetching.

For new ways of fetching, best practices these days require:

torgo commented 4 years ago

Discussed in a breakout today. We need to come up with a list of places in the web platform that are implicated when a new media type is added. E.g. canvas...

cynthia commented 4 years ago

Due to the large backlog of unlanded (+large) PRs we have, I'm pasting this first draft as a comment.

It feels like the point that @domenic brought up should be a separate principle, which somewhat relates to the issue raised here: https://github.com/w3ctag/design-principles/issues/157 (which we closed, because we don't quite know what to write just yet - not enough antipatterns to work with. Thoughts?)

(First draft doesn't mention ORB as it seems like it's still WIP, working on incorporating good practices from that but with simpler language.)


New Data Formats

Always define a corresponding MIME type and extend existing APIs to support this type for any new data format.

Sometimes a new capability on the web involves adding a new data format. This can be an image, video, audio, or any other type of data that a browser is expected to ingest. New formats should have a standardized MIME type that is strictly validated. Legacy media formats do not always enforce MIME types strictly (and sometimes rely on peeking at headers to work around this), but this is mostly for legacy compatibility reasons and should not be expected or implemented for new formats.
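As a rough illustration of "strictly validated", the sketch below decides whether to hand a response to the new decoder based only on the declared Content-Type, without looking at the bytes. The MIME type `image/example-new` and both function names are made up for this example, and the parsing is deliberately simpler than what the MIME Sniffing standard specifies.

```typescript
// Hypothetical MIME type for an imaginary new image format (not a registered type).
const NEW_FORMAT_MIME = "image/example-new";

// Return the lowercased "type/subtype" part of a Content-Type header value,
// ignoring parameters. Returns null when the header is missing or malformed.
// This is a simplified parser for illustration only.
function mimeEssence(contentType: string | null): string | null {
  if (contentType === null) return null;
  const [first = ""] = contentType.split(";");
  const essence = first.trim().toLowerCase();
  const token = /^[a-z0-9!#$&^_.+-]+\/[a-z0-9!#$&^_.+-]+$/;
  return token.test(essence) ? essence : null;
}

// Strict check: the response is treated as the new format only when the
// declared type matches exactly. There is no fallback to inspecting bytes.
function shouldDecodeAsNewFormat(contentType: string | null): boolean {
  return mimeEssence(contentType) === NEW_FORMAT_MIME;
}
```

Under this sketch, a response labelled `image/png` or sent without a Content-Type header is never decoded as the new format, even if its bytes happen to look like it.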

Spec authors are also expected to integrate the new format into existing APIs, so that it is allowlisted at both ingress (e.g. decoding from a ReadableStream) and egress (e.g. encoding to a WritableStream) points from a browser's perspective.

For example, if you are adding an image format to the web platform, first define a new MIME type for the format. You would then naturally add a decoder (and presumably an encoder) for the format so that it can be decoded in HTMLImageElement. On top of this, you are also expected to add support at egress points such as HTMLCanvasElement.toBlob() and HTMLCanvasElement.toDataURL().
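From a web developer's side, egress support can be probed roughly as follows. This is a minimal sketch that relies on `toDataURL()` falling back to `image/png` when the requested type is not supported. `CanvasLike` and `canEncode` are names invented for this example so that the check can run without a DOM, and `image/example-new` is again a hypothetical type.

```typescript
// Minimal structural type covering the one method used here.
// A real HTMLCanvasElement satisfies it.
interface CanvasLike {
  toDataURL(type?: string, quality?: number): string;
}

// Ask the canvas to serialize as `mimeType` and check which type it actually
// produced. Unsupported types come back as a PNG data URL instead.
function canEncode(canvas: CanvasLike, mimeType: string): boolean {
  const prefix = `data:${mimeType.toLowerCase()}`;
  const url = canvas.toDataURL(mimeType);
  // Require a delimiter after the type so "image/x" does not match "image/xy".
  return url.indexOf(`${prefix};`) === 0 || url.indexOf(`${prefix},`) === 0;
}
```

In a browser this would be called as `canEncode(document.createElement("canvas"), "image/example-new")`. Note that a zero-sized canvas serializes to `data:,`, so the probe should use a canvas with non-zero dimensions.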

torgo commented 4 years ago
  • Enforcing MIME types on the response if the file signature is different from established media formats to not make CORB / https://github.com/annevk/orb worse.

Hi @annevk – can you let us know what the status / trajectory of the linked doc is? Is this going to land in HTML at some point?

alice commented 4 years ago

Re the draft comment: it doesn't include the added MIME type for the hypothetical new image format?

annevk commented 4 years ago

@torgo it should probably become part of Fetch as an extension of the same-origin policy. I hope that we can make progress on it in 2021.

cynthia commented 4 years ago

@annevk I think we should move away from adding more stuff to the pattern matching algorithms (e.g. https://mimesniff.spec.whatwg.org/#image-type-pattern-matching-algorithm) in best practices, and enforce strict MIME types for newer formats. Do you think that's going too far?

annevk commented 4 years ago

I don't and I agree that's what we should do. The caveat is that new media formats (e.g., AVIF) can reuse container formats in which case we might as well continue sniffing as the pattern is the same. Bit of a delicate line to balance though and requires careful review, but fortunately it does not happen that often. (Part of what might help here long term is better testing for MIME Sniffing.)
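To make the container caveat concrete, here is a simplified sketch of reading the major brand from a file whose first box is an ISO base media `ftyp` box, which is the layout AVIF shares with other formats built on that container. It is not the MIME Sniffing algorithm: it ignores box size and compatible brands, and the function names are invented for illustration.

```typescript
// Read four bytes starting at `offset` as an ASCII string.
function fourCC(bytes: Uint8Array, offset: number): string {
  let s = "";
  for (let i = offset; i < offset + 4; i++) {
    s += String.fromCharCode(bytes[i] ?? 0);
  }
  return s;
}

// Return the major brand if the data starts with an `ftyp` box, else null.
// Assumed layout: bytes 0-3 box size, bytes 4-7 "ftyp", bytes 8-11 major brand.
function majorBrand(bytes: Uint8Array): string | null {
  if (bytes.length < 12 || fourCC(bytes, 4) !== "ftyp") return null;
  return fourCC(bytes, 8);
}
```

Every format in this container family matches the same leading `ftyp` pattern, so a sniffer can only tell them apart by brand. That is why reusing an existing pattern is cheap, while still leaving the declared MIME type as the more reliable signal.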