whatwg / html

HTML Standard
https://html.spec.whatwg.org/multipage/
Other
7.99k stars 2.61k forks source link

Figure out what sort of special-casing IE and Chrome do for data: (and what other URIs)? #656

Open bzbarsky opened 8 years ago

bzbarsky commented 8 years ago

It looks like the "Check the usability of the image argument" algorithm from the spec returns "good" for data: images in IE and Chrome even when per current spec it should not. Testcase goes like so:

var img = new Image();
img.src = 'data:stuff'; // Need actual valid image data: URI here
var canvas = document.createElement('canvas');
var pattern = canvas.getContext("2d").createPattern(img, 'no-repeat');
canvas.getContext("2d").drawImage(img, 0, 0)

Per spec, pattern should be null here (modulo #655) and there should be nothing painted to the canvas. See https://bugzilla.mozilla.org/show_bug.cgi?id=1164458#c9 for the spec analysis. It sounds like IE and Chrome are doing something else instead. It would be good to understand what, exactly, this "something else" is...

@domenic @travisleithead @jacobrossi any idea who would have that info?

domenic commented 8 years ago

I'll summon @junov as our Blink canvas expert...

junov commented 8 years ago

Blink's implementation is meant to throw an exception in createPattern if the image is undecodable. It seems that we have a bug here. Our tests only verify that the exception gets throw using an actual file, not a dataURL. Let me debug a bit to see what is going on....

junov commented 8 years ago

In blink the state of the image is incomplete at the time createPattern is called, because the implementation has not yet attempted to "load" it. I tried this:

var canvas = document.createElement('canvas');
var img = new Image();
img.onload = function () {
    var pattern = canvas.getContext("2d").createPattern(img, 'no-repeat');
    canvas.getContext("2d").drawImage(img, 0, 0);
}
img.src = 'data:stuff'; // Need actual valid image data: URI here

But the onload handler never gets called because the load fails. Then I tried an arbitrary delay to allow the decode attempt to pass:

var canvas = document.createElement('canvas');
var img = new Image();
img.src = 'data:stuff'; // Need actual valid image data: URI here
window.setTimeout(function () {
    var pattern = canvas.getContext("2d").createPattern(img, 'no-repeat');
    canvas.getContext("2d").drawImage(img, 0, 0);
}, 1000);

This threw the expected exception.

I guess the real difference between implementations is that in Firefox the img transitions synchronously to the loaded state when setting .src to a data URL?

bzbarsky commented 8 years ago

@junov Did you use an actual valid image data: URI like the comments said?

For example (from https://bugzilla.mozilla.org/show_bug.cgi?id=1217031 but with slight modifications), try http://codepen.io/anon/pen/rxQmWM?editors=001 -- in Blink this alerts something that's not null and draws a little red asterisk on the canvas. Per spec, it should alert null and not draw a red asterisk, as far as I can tell.

This threw the expected exception.

Why would it throw an exception at all? I see nothing in the spec that would make any of this throw an exception.

is that in Firefox the img transitions synchronously to the loaded state when setting .src to a data URL

That's not what happens at all in Firefox. As far as I can tell it is what happens in Blink, at least for purposes of "Check the usability of the image argument"...

Note that you probably need to restart the browser between loads of the above testcase, since after the first load the data: URI will be in the document's list of available images.

junov commented 8 years ago

Sorry, I was using "data:stuff" (a bad data URL). I thought that was what this issue was about. Okay, so I can repro with the codepen. Why exactly is "Check the usability of the image argument" not supposed to return 'good'?

junov commented 8 years ago

Is your point that the image from the dataURL should not be available until the load event has been dispatched?

bzbarsky commented 8 years ago

Why exactly is "Check the usability of the image argument" not supposed to return 'good'?

Did you read https://bugzilla.mozilla.org/show_bug.cgi?id=1164458#c9? It walks through the spec step by step explaining why.

But the short summary of it is that the image can only become "available" via an async step in the spec, unless it's it's already in the document's "list of available images". Images don't have to wait for onload to become available (esp. because onload fires off a separate task anyway), but they do have to wait for at least a stable state afaict.

junov commented 8 years ago

Okay, I read the spec analysis in the Mozilla bug. I am fully caught up now.

This "list of available images" business seems kind of fuzzy to me. From the page's standpoint, whether or not an image is to be expected to be in the "list of available images" should be somewhat unpredictable. So if an implementation always treated data: URIs as if they were in the "list of available images", in theory, that should be Okay. But I can see how in practice this could be a source of incompatibility between browsers. For example, for an app generates unique images every time.

Treating "data:" URI images always as immediately available seems logical and predictable IMHO, though that is not what the spec requires.

bzbarsky commented 8 years ago

From the page's standpoint, whether or not an image is to be expected to be in the "list of available images" should be somewhat unpredictable.

The main web compat constraint I'm aware of is that once onload has fired for some image URL in a document, it must be in that list and needs to stay there, more or less. That's "image preloading".

Treating "data:" URI images always as immediately available seems logical and predictable

It does require quite a number of sync steps on the main thread, though (converting the data: URI to bytes, extracting the image header information from those bytes, being ready to do a sync decode of the actual image data), right? Forcing that on browsers in general imposes some serious restrictions on possible image library architectures and abilities to take advantage of parallelism...

bzbarsky commented 8 years ago

Anyway, I'm particularly interested in what it is exactly that Blink does here, and whether it's limited to data: URIs or also happens for other URIs... and if so for which ones.

natechapin commented 8 years ago

Yeah, blink's behavior here is to parse data: URIs synchronously, put it in blink's in-memory cache, and emulate a cache hit, which can complete synchronously depending on the data type (images return the data synchronously, among others).

I don't know the full history of why it was implemented sync, but I'm guessing it was perf-related; I tried to make all cache hits asynchronous several years ago and had it rolled out do to perf regressions. Maybe we should try again to make it all async.

bzbarsky commented 8 years ago

OK. Is it just data: URIs? What about blob: URIs?

zcorpan commented 8 years ago

@yoavweiss any insight here? You were looking at data: URL for img and sync/async loading at some point, right?

Related https://github.com/ResponsiveImagesCG/picture-element/issues/223

zcorpan commented 8 years ago

https://codereview.chromium.org/1492303003/ was the fix for Chromium and it was reverted.

The reason for reverting is: Perf regressions: https://code.google.com/p/chromium/issues/detail?id=569023.