Scenario: video playback

tomByrer commented 7 years ago

One of the frequent uses of an iframe is for video (does anyone have time for an HTTPArchive query?). So what happens if someone wants higher-quality video playback? Or if they travel to another 'suggested' video afterwards?

jkarlin commented 7 years ago

Sounds like it may not be a good use-case for size policy. Generally size-policy would be used for content that you can imagine an upper bound for.

igrigorik commented 7 years ago

Hmm, this is interesting because it hints that we'd be excluding certain types of fetches from global quota.. and conversely, perhaps it makes sense to set different limits for different types of content?

E.g. global limit is 5MB, and within that my budget is ~3MB for images, ~500KB for JS, etc?
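A minimal sketch of how such nested budgets could be checked, assuming a global cap plus optional per-type caps. The names `Budget` and `BudgetTracker` and the category list are hypothetical and not part of any proposal in this thread.

```typescript
// Hypothetical sketch: a global cap with optional per-category caps inside it.
type Category = "image" | "script" | "other";

interface Budget {
  totalKB: number; // global cap across all categories
  perTypeKB: Partial<Record<Category, number>>; // optional per-category caps
}

class BudgetTracker {
  private budget: Budget;
  private usedByType = new Map<Category, number>();
  private usedTotalKB = 0;

  constructor(budget: Budget) {
    this.budget = budget;
  }

  // Records a transfer and returns the names of the limits exceeded after it.
  record(category: Category, sizeKB: number): string[] {
    this.usedTotalKB += sizeKB;
    const usedForType = (this.usedByType.get(category) ?? 0) + sizeKB;
    this.usedByType.set(category, usedForType);

    const exceeded: string[] = [];
    const cap = this.budget.perTypeKB[category];
    if (cap !== undefined && usedForType > cap) exceeded.push(category);
    if (this.usedTotalKB > this.budget.totalKB) exceeded.push("total");
    return exceeded;
  }
}

// 5MB overall, of which at most ~3MB images and ~500KB JS.
const tracker = new BudgetTracker({
  totalKB: 5000,
  perTypeKB: { image: 3000, script: 500 },
});
```

With this shape, a category without its own cap is limited only by the global total.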

jkarlin commented 7 years ago

Can you give a more specific use case for that?

igrigorik commented 7 years ago

I'm embedding a widget and I want to allow at most X KB of "creative" content within it, and up to Y KB of JavaScript. I don't want my creative content to consume and block my JavaScript quota, or to allow X+Y KB of JavaScript, as that would make it very slow.

igrigorik commented 7 years ago

Another, more concrete real-world example: AMP limits the amount of CSS to <50KB. One could imagine setting a CSS policy on themselves to advertise that they're ~AMP compatible, as a signal to embedders and other third parties: <50KB CSS, total size <1MB, etc.

jkarlin commented 7 years ago

Interesting. I wonder how we'd attribute resources to their type. Based on the element it's from? That doesn't help for XHR, which is where most video comes from. We could use the response content-type, but of course that could be a lie.

I'm also curious if we'd want to be able to specify limits for everything but some type.

igrigorik commented 7 years ago

> Interesting. I wonder how we'd attribute resources to their type. Based on the element it's from?

I think we could (re)use Request#destination, which we introduced to address similar case for Preload: https://w3c.github.io/preload/#link-element-extensions.

With the above, the policy would be "video: X KB, image: Y KB, fetch: Z KB"... Yes, if you're using XHR to stream video, then it would be counted towards the fetch quota, but I think that's OK and correct.
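A rough sketch of attribution by request destination, assuming the browser can observe each request's destination and transfer size. In the Fetch spec, requests made by fetch() and XHR have an empty-string destination. The "fetch" bucket name follows the policy example above; `bucketFor`, `tally` and `ObservedRequest` are hypothetical names.

```typescript
// Hypothetical sketch: charge each request to a bucket named after its
// Fetch request destination.
interface ObservedRequest {
  destination: string; // e.g. "video", "image", "script", or "" for fetch()/XHR
  transferKB: number;
}

// Maps a request destination to the policy bucket it is charged to.
// fetch() and XHR have an empty destination, so they land in "fetch".
function bucketFor(destination: string): string {
  return destination === "" ? "fetch" : destination;
}

// Sums transferred KB per bucket.
function tally(requests: ObservedRequest[]): Record<string, number> {
  const usage: Record<string, number> = {};
  for (const r of requests) {
    const bucket = bucketFor(r.destination);
    usage[bucket] = (usage[bucket] ?? 0) + r.transferKB;
  }
  return usage;
}

const usage = tally([
  { destination: "video", transferKB: 800 }, // <video src=...>
  { destination: "image", transferKB: 120 }, // <img>
  { destination: "", transferKB: 400 },      // video segment streamed over XHR
]);
// usage: { video: 800, image: 120, fetch: 400 }
```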

tomByrer commented 7 years ago

Limits on specific content types are a great idea!

But what if the video is replayed, e.g. tutorials, lie-fi playback....? Sometimes videos are cached, sometimes not, like if one has to restart under a different bitrate.

igrigorik commented 7 years ago

@tomByrer this API is for limiting "network transfer bytes", i.e. bytes fetched from the network. If the user retrieves a resource from local cache, without hitting the network, it won't be counted against the specified quota. On the other hand, if you're streaming and rewind.. and re-fetch some section of the video, then it will be counted as long as there is network activity behind the scenes.
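To illustrate the accounting rule described here, a small sketch that counts only bytes that actually crossed the network. The `TransferRecord` shape and the URLs are made up for the example.

```typescript
// Hypothetical sketch: only network transfers count against the quota.
interface TransferRecord {
  url: string;
  transferKB: number;
  servedFromCache: boolean; // true when no network activity was needed
}

// Sums the KB that count against the quota: cache hits are skipped.
function countedKB(records: TransferRecord[]): number {
  return records
    .filter((r) => !r.servedFromCache)
    .reduce((sum, r) => sum + r.transferKB, 0);
}

const session: TransferRecord[] = [
  { url: "/video/seg1", transferKB: 500, servedFromCache: false },             // first play
  { url: "/video/seg1", transferKB: 500, servedFromCache: true },              // replay from cache: not counted
  { url: "/video/seg1?bitrate=low", transferKB: 200, servedFromCache: false }, // re-fetch at a new bitrate: counted
];

const total = countedKB(session); // 700
```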

jkarlin commented 6 years ago

I'm game for this. We need some syntax. First, any objections to defaulting to KB as the unit? Then we use integers instead of strings. Perhaps:

<iframe transfer-threshold="video: w, image: x, default: y, total: z">

For convenience,

<iframe transfer-threshold=200>

is shorthand for:

<iframe transfer-threshold="total: 200">

Would the same syntax be used in the response headers?

Transfer-Thresholds: {"default":"total: 200", "self":"video: 100, default: 500", "*.example.com":"100"}
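A sketch of how the attribute and header values above might be parsed, assuming integer KB values and the shorthand rule described in this comment. `parseThreshold` and `parseThresholdsHeader` are hypothetical helpers, not part of any spec.

```typescript
// Hypothetical sketch of a parser for the proposed syntax.
type Thresholds = Record<string, number>;

// Parses "video: 100, default: 500" or the shorthand "200" (meaning "total: 200").
function parseThreshold(value: string): Thresholds {
  const trimmed = value.trim();
  if (/^\d+$/.test(trimmed)) return { total: Number(trimmed) };

  const result: Thresholds = {};
  for (const part of trimmed.split(",")) {
    const [key = "", raw = ""] = part.split(":").map((s) => s.trim());
    if (key === "" || !/^\d+$/.test(raw)) {
      throw new Error(`Invalid threshold entry: "${part.trim()}"`);
    }
    result[key] = Number(raw);
  }
  return result;
}

// Parses the JSON header form, where each scope maps to an attribute-style string.
function parseThresholdsHeader(headerValue: string): Record<string, Thresholds> {
  const json = JSON.parse(headerValue) as Record<string, unknown>;
  const out: Record<string, Thresholds> = {};
  for (const scope of Object.keys(json)) {
    const value = json[scope];
    if (typeof value !== "string") {
      throw new Error(`Expected a string for scope "${scope}"`);
    }
    out[scope] = parseThreshold(value);
  }
  return out;
}
```

For example, `parseThreshold("200")` yields `{ total: 200 }`, the same result as `parseThreshold("total: 200")`.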

styfle commented 6 years ago

I think video and image are too generic. Instead, use a MIME type for thresholds such as:

text/html
image/png
audio/ogg

Then if you want to set a threshold for all images, you can use the image/*.
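A minimal sketch of the wildcard matching this would need, assuming patterns of the form `type/subtype` or `type/*` and that an exact entry takes precedence over a wildcard. `matchesMimePattern` and `limitFor` are hypothetical names.

```typescript
// Hypothetical sketch: match a response Content-Type against MIME patterns.
function matchesMimePattern(pattern: string, contentType: string): boolean {
  // Drop parameters such as "; charset=utf-8" and normalise case.
  const essence = (contentType.split(";")[0] ?? "").trim().toLowerCase();
  const [type = "", subtype = ""] = essence.split("/");
  const [pType = "", pSubtype = ""] = pattern.trim().toLowerCase().split("/");
  if (type === "" || subtype === "") return false;
  return pType === type && (pSubtype === "*" || pSubtype === subtype);
}

// Returns the limit (in KB) that applies to a content type, if any.
// An exact entry such as "image/png" wins over a wildcard such as "image/*".
function limitFor(
  contentType: string,
  limits: Record<string, number>,
): number | undefined {
  let wildcardLimit: number | undefined;
  for (const pattern of Object.keys(limits)) {
    if (!matchesMimePattern(pattern, contentType)) continue;
    const limit = limits[pattern];
    if (!pattern.trim().endsWith("/*")) return limit;
    wildcardLimit = limit;
  }
  return wildcardLimit;
}
```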

jkarlin commented 6 years ago

It's unclear to me how we would get the mime type. We can't rely on the response headers to tell the truth. If all the script does is xhr a resource and store it on disk, how can we know what to attribute it to?

styfle commented 6 years ago

> It's unclear to me how we would get the mime type.

How do you plan on getting the network traffic for the embedded iframe in order to enforce the thresholds? Surely if you are reading network traffic, you can also read response headers.

> We can't rely on the response headers to tell the truth.

That's okay. Chrome won't execute javascript if it is served with the wrong MIME type. See this question.

> If all the script does is xhr a resource and store it on disk, how can we know what to attribute it to?

I think you would do the same thing I wrote in part A...inspect the response headers to check the MIME type, regardless of the source (XMLHttpRequest, <script>, <link>). You just need a hook into the iframe's network I/O.

jkarlin commented 6 years ago

Right, the response content-type header could be a lie. The server could say it's all text/html when in fact it's delivering audio/ogg. This happens all the time. I'd much rather have something truthful, which we get by attributing the resource to the type of the element that requested it.

styfle commented 6 years ago

Are you suggesting that the contents of the iframe only be audio/ogg and no html would be emitted from the iframe's host server? Something like the following?

<iframe src="http://example.com/video.ogg" width="100px" height="100px" />

jkarlin commented 6 years ago

No, I'm suggesting that a frame might want to violate its per-mime-type resource limit, by purposefully falsifying the content-type headers of its responses.

E.g., the frame might have a limit of 100KB for audio/ogg, imposed by the embedder. To get around the limit, the server could respond with audio/ogg resources that have content-type set to something else, such as text/html.
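The bypass can be sketched as follows, assuming the frame still consumes the mislabelled bytes and that the browser records which kind of element made each request. The `ObservedResponse` shape and the numbers are illustrative only.

```typescript
// Hypothetical sketch: compare attribution by declared Content-Type with
// attribution by request destination when the server mislabels responses.
interface ObservedResponse {
  declaredContentType: string; // Content-Type header, controlled by the server
  destination: string;         // set by the browser from the requesting element
  transferKB: number;
}

// Sums transferred KB grouped by one of the two attribution keys.
function tallyBy(
  responses: ObservedResponse[],
  key: "declaredContentType" | "destination",
): Record<string, number> {
  const usage: Record<string, number> = {};
  for (const r of responses) {
    usage[r[key]] = (usage[r[key]] ?? 0) + r.transferKB;
  }
  return usage;
}

// Ogg audio requested by an <audio> element, but labelled as HTML by the server.
const responses: ObservedResponse[] = [
  { declaredContentType: "text/html", destination: "audio", transferKB: 300 },
  { declaredContentType: "text/html", destination: "audio", transferKB: 300 },
];

const audioLimitKB = 100;
const byMime = tallyBy(responses, "declaredContentType"); // { "text/html": 600 }
const byDestination = tallyBy(responses, "destination");  // { audio: 600 }

// The MIME-based check sees no audio/ogg bytes at all, so the limit is never hit.
const mimeCheckTrips = (byMime["audio/ogg"] ?? 0) > audioLimitKB;       // false
// The destination-based check charges all 600 KB to "audio".
const destinationCheckTrips = (byDestination["audio"] ?? 0) > audioLimitKB; // true
```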

styfle commented 6 years ago

I see. But I don't understand how your suggested syntax above would solve this problem:

<iframe transfer-threshold="video: w, image: x, default: y, total: z">

How is video or image enforced if not by the MIME type?

jkarlin commented 6 years ago

It's instead categorized based on the source of the request. See the following table: https://fetch.spec.whatwg.org/#concept-request-destination

styfle commented 6 years ago

I understand now, thanks! That seems to be the way to go. My concern was that video or image is not standardized but I see now that it is standardized as "concept request destination". CSP labels video as media-src which is what the table in your link shows.

So my next question is, why not use CSP directives instead of the concept request destination?

nbeloglazov commented 6 years ago

Having per-type limits doesn't address the "lazy loaded" issue mentioned in the first comment, where the user might play another video afterwards. Basically, an iframe can initially load X KB and then load an additional Y KB once the user interacts with it (for example, plays the next video, or loads more images if the iframe is an image gallery and the user starts swiping). It should be possible to differentiate between X and Y. I don't know how but one random suggestion is to include "is iframe focused/selected/being interacted with" boolean in an event object.

jkarlin commented 6 years ago

> So my next question is, why not use CSP directives instead of the concept request destination?

Mostly because it's defined in the Fetch spec.

> It should be possible to differentiate between X and Y. I don't know how but one random suggestion is to include "is iframe focused/selected/being interacted with" boolean in an event object.

I worry about the privacy implications of that. Is there any way today to determine if a user interacted with a frame? I guess they could track the mouse entering and leaving it if the page surrounds it?

nbeloglazov commented 6 years ago

Yes, I see your point. Mouse detection would be insufficient as it works only on desktop (right?) and mobile is more important wrt bandwidth.

I just wanted to raise this issue as I believe it's important to differentiate bytes that are always loaded from bytes that are loaded only when the user interacts with the iframe. It's important to keep always-loaded bytes to a minimum, but lazy-loaded bytes can have looser limits. For example, precache a few minutes of a video and load the rest only when the user starts watching.
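One way to picture the distinction is a two-phase budget, sketched below under the assumption that the browser can tell when the first interaction happened (how to expose that safely is an open question in this thread). `PhasedPolicy`, `PhasedTracker` and `markInteraction` are hypothetical names.

```typescript
// Hypothetical sketch: a tight limit before interaction, a looser one after.
interface PhasedPolicy {
  initialKB: number;          // cap for bytes loaded before any interaction
  afterInteractionKB: number; // cap for bytes loaded after the first interaction
}

class PhasedTracker {
  private policy: PhasedPolicy;
  private interacted = false;
  private initialUsedKB = 0;
  private laterUsedKB = 0;

  constructor(policy: PhasedPolicy) {
    this.policy = policy;
  }

  // Called once the user starts interacting with the frame.
  markInteraction(): void {
    this.interacted = true;
  }

  // Records a transfer; returns true while the current phase is within its limit.
  record(transferKB: number): boolean {
    if (this.interacted) {
      this.laterUsedKB += transferKB;
      return this.laterUsedKB <= this.policy.afterInteractionKB;
    }
    this.initialUsedKB += transferKB;
    return this.initialUsedKB <= this.policy.initialKB;
  }
}

// Tight limit for the player and preview, looser limit once the user presses play.
const phased = new PhasedTracker({ initialKB: 300, afterInteractionKB: 5000 });
```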

tomByrer commented 6 years ago

> it's important to differentiate bytes that always loaded... but lazy-loaded can have less tight limits

I'm sure there are some scenarios where this ability would be helpful, but I think there will be many scenarios where total bytes is important, perhaps more so? I'm wondering if the extra complexity is worth it?

I do like the idea of a separate limit for the initial load and for 'load more if interacted with'. E.g.: preload the video player & preview; then [ Play ], "See next scene?", or simply moving the time-scrub knob are friendlier prompts to keep loading the video than a standard browser prompt: Site is exceeding your bandwidth threshold, [ proceed ] or [ stop ]?

tomByrer commented 5 years ago

Any progress with this spec? Thought of this in a thread about "importance" & this semi-related abuse scenario.

jkarlin commented 5 years ago

The hard problem with TransferSizePolicy, and the reason it hasn't seen the light of day, is the cross-origin size leak. We don't have a solution that's particularly satisfying as it reveals new information (the size of an entire page) and not just the size of a single resource.

WICG / transfer-size

Scenario: video playback #8