Image decode() algorithm has odd behavior

bzbarsky commented 5 years ago

Consider this testcase, loosely based on https://github.com/web-platform-tests/wpt/blob/a19211f85d6aa4cf7a88f30833a800ccb24126c4/html/semantics/embedded-content/the-img-element/decode/image-decode-path-changes.html#L62-L83. Assume all image loads succeed.

var first_promise, second_promise;
var img = new Image();
img.onload = function() {
  img.onload = null;
  first_promise = img.decode();
  img.src = "/images/green.png";
  second_promise = img.decode();
});
var img2 = new Image();
img2.onload = t.step_func(function() {
  img.src = "/images/red.png";
});
img2.src = "/images/green.png";

What should happen here?

Let's start in the onload handler for img. The first decode() call happens, lands in https://html.spec.whatwg.org/multipage/embedded-content.html#dom-img-decode, which creates a primise, queues a microtask, and returns the promise. Nothing else happens so far.

Then the src set happens. This is a https://html.spec.whatwg.org/multipage/images.html#relevant-mutations so per https://html.spec.whatwg.org/multipage/images.html#when-to-obtain-images we immediately jump to https://html.spec.whatwg.org/multipage/images.html#update-the-image-data. We have an src attribute and no srcset or picture, so in step 4 our source is set to the "green.png" url. Then in step 6 substep 3, we get a cache hit on that url (because img2 has already loaded it), which sets the current request of img to a new completely available request with an empty "current URL" string (???), queues a task to set the current URL to the actual string involved, and returns.

The second decode() call happens, returns a promise, queues a microtask.

Now we enter microtask processing. We process those queued microtasks and they start the parallel sections. As soon as those sections start, we see that the current request's state is "completely available", so we start the image decode. Note that we don't start observing changes to the current request until the microtask is running, so the earlier change of current request while obtaining image data has NOT rejected either promise.

At this point it's a race between the decode and the task that was queued while updating the image data. If the decode finishes first, both promises will be resolved with undefined. If the task runs first, the "This img element's current request changes or is mutated" clause of the parallel section of decode() runs (because the "current URL" changes) and rejects both promises. Presumably. It's hard to tell, because this stuff is all unsynchronized racy code which is poking promises from potentially many different threads, etc. Unless the "resolve promise with undefined" does an implicit task to mess with the promise, which would make sense. In which case there is no raciness and the decode always loses (assuming it's using the same task queue; is it?) and the promises should both be rejected.

Note that in none of that did it matter whether the first url loaded in img was "red.png" or "green.png". The behavior, per spec, should be the same.

@vmpstr, @domenic, it doesn't look like Chrome is implementing what the spec says here. In particular, in Chrome first_promise ends up rejected and second_promise resolved in the above testcase, afaict, and if I change the first url in img to "green.png" both end up resolved. The test linked above has comments about there being something special about the "same url" case, but there is nothing special about that case in the spec.

I suspect the spec only starting to observe changes to the current request after the decode() microtask runs is a spec bug, by the way. That behavior doesn't make sense to me; it would make more sense to start observing that at the point when the decode() call happens, unless the intent is in fact for a decode() call before an src set to apply to the result of the src set. Fixing that would mean that src sets reject any promises from previous decode() calls, no matter whether the URL is changing, again afaict.

//cc @aosmond who is working on implementing decode() in Gecko and ran into this.

vmpstr commented 5 years ago

At this point it's a race between the decode and the task that was queued while updating the image data. If the decode finishes first, both promises will be resolved with undefined. If the task runs first, the "This img element's current request changes or is mutated" clause of the parallel section of decode() runs (because the "current URL" changes) and rejects both promises.

Can you clarify this part a bit? This is my reading of the spec: If we have two microtasks for the decode scheduled with a task in between the two to mutate the current state (by setting the current url), then we have something like D1 M1 D2. If they are to run in sequence then we enter the parallel section of D1 waiting for any of the stated conditions. At this point D1 decode can begin based on (what I think is a bit awkward phrasing of) update the image data. While the decode is happening, M1 runs to modify the url of the current state, thus mutating the current state. Here, we have to reject the promise generated by D1, but note that D2 is yet to run, so we're not in the parallel section (or any of the steps) of D2 yet. After M1, we can run D2 and observe a completely available image. With no other tasks scheduled to mutate the current state, the decode succeeds and the promise resolves.

This is of course based on the assumption that the task scheduled in steps 6.3.7 of 4.8.4.3.5 Updating the image data puts a task in the microtask queue (@domenic to confirm).

Note that in theory I agree that this is racy in that the decode can finish before M1 runs, resolving the promise. However, since the promise resolution callback happens on a microtask (afaik), we know that M1 will run before script can conceptually observe the fact that the decode finished, which kind of makes it inconsistent to resolve the promise?

Also as a side note about changing the url to the same value, the reason that doesn't impact the above reasoning is that the "current state" is not mutated (which is a word that is not well defined in the spec, afaict). So promises resolve.

bzbarsky commented 5 years ago

This is of course based on the assumption that the task scheduled in steps 6.3.7 of 4.8.4.3.5 Updating the image data puts a task in the microtask queue

It doesn't. It queues a task, not a microtask. It's explicitly linking to https://html.spec.whatwg.org/multipage/webappapis.html#queue-a-task which puts the task in a "task queue", not a microtask queue.

decode can finish before M1 runs, resolving the promise

This come back to whether resolving the promise happens off a task. Per https://www.w3.org/2001/tag/doc/promises-guide#shorthand-manipulating I believe it does, so when the decode finishes it actually queues a task to resolve the promise. So there's no race between the decode and the "change the current URL" task: the latter always wins the race, against both of the decodes here.

the reason that doesn't impact the above reasoning is that the "current state" is not mutated

It is, because M1 runs after D2.

bzbarsky commented 5 years ago

Also, I don't see any mention of "current state" anywhere in https://html.spec.whatwg.org/multipage/embedded-content.html#dom-img-decode. It talks about object identity of the "current request" changing, which is a well-defined concept, and the "current request" being mutated. The latter presumably means any of the members of https://html.spec.whatwg.org/multipage/images.html#image-request changing value, which is also a well-defined concept. In this case, the spec is changing the object identity sync during the src set and changing the "current URL" member from a task (which is a bit questionable, imo).

The spec is also not terribly consistent about whether setting src to the same value as the current request's current URL changes the current request or not. If https://html.spec.whatwg.org/multipage/images.html#updating-the-image-data step 6.3 is a hit, then it does. But if that's a miss, step 14 just resets the animation without changing the current request. That seems pretty dubious to me; maybe something like step 14 should happen before step 6? That seems to be the mental model you're using, but isn't what the spec says; I don't know what implementations do, off the top of my head.

vmpstr commented 5 years ago

The non-normative note on https://html.spec.whatwg.org/multipage/embedded-content.html#dom-img-decode seems to imply that updating the image data also happens in a microtask, but the spec doesn't agree. I agree that we need to fix this (the decode spec that is).

The basic intent of the API should be that

img.src = "foo.png";
img.decode(); // 1
img.src = "bar.png";
img.decode(); // 2

decode 1 rejects due to the fact that the src has changed and so it is ambiguous to resolve the promise (what has resolved? which image is done decoding?). However, the second decode should succeed because it's not ambiguous which src it's acting on. (Not ambiguous from the code perspective; possibly ambiguous from the spec perspective)

maybe something like step 14 should happen before step 6? That seems to be the mental model you're using, but isn't what the spec says; I don't know what implementations do, off the top of my head.

Yes, what I mean to say is that if the current request of the image is that of "foo.png" and the state is completely available, then steps 6.3.2-4 are effectively no-op if the new url is also "foo.png". But clearly from a strict spec perspective, we do need to abort the current request (meaning forget the current image data from https://html.spec.whatwg.org/multipage/images.html#abort-the-image-request) and then set it back to the same data from the cache.

So if in the above example both srcs are the same, it would be nice to resolve both decodes since conceptually no mutation has happened. However, we should either make this case explicit in the decode spec, or update the "update the image data" spec to allow for this situation where the "current state" is not mutated.

bzbarsky commented 5 years ago

So if in the above example both srcs are the same, it would be nice to resolve both decodes since conceptually no mutation has happened

This is, strictly speaking, not true. If we do a load of "foo.png" in one image, then do something that ends up clearing out the cache, then do a load of "foo.png" on a different image (which refills the cache, with possibly different data) and then set src to "foo.png" on the first image, if that reads from the cache it will in fact change the image data, right?

vmpstr commented 5 years ago

To rephrase (please correct me if I misinterpreted what you mean),

img.src = "foo.png";
img.decode();
// do something here that causes
// 1. the browser image cache to evict "foo.png", and
// 2. "foo.png" data changes on the server, then
img.src = "foo.png";

Yes, I think this should probably reject the decode.

It is also certainly the case that if there's a base tag that's injected with a different base url, then the content would be different although the string "foo.png" doesn't change, but I guess that's easier since the resolved full url would be different.

I do think that any subsequent decodes should succeed though (decoding the latest available data).

bzbarsky commented 5 years ago

I meant something more like this:

img.src = "foo.png";
// Do something here that causes:
// 1. The browser image cache to evict "foo.png", and
// 2. "foo.png" data changes on the server, and
// 3.  The new data is placed in the browser image cache.
img.decode();
img.src = "foo.png";

What should happen? The first question, even ignoring the "decode()" thing, is what data should end up being shown?

The base tag situation is, as you note, handled by the fact that all the actual url compares happen on fully resolved urls.

vmpstr commented 5 years ago

Ah, yeah I'm not really sure what happens in that situation if you just append the img to the DOM without decode(). Presumably if the new data is loaded, then the new data would display. It also seems to mean that doing the second img.src = "foo.png" doesn't conceptually change anything.

Do you agree that in the case of foo/decode/bar/decode the first decode should fail and the second succeed though? If so, I think we can start by figuring out how to fix the spec to represent that case.

For the src being changed to the same value, I think we also should mention something, but it becomes a bit trickier since it's not just the string we're talking about, as you brought up here, but the underlying data.

bzbarsky commented 5 years ago

It also seems to mean that doing the second img.src = "foo.png" doesn't conceptually change anything.

Why not? Until that happens, the image is pointing to the old data.

Do you agree that in the case of foo/decode/bar/decode the first decode should fail and the second succeed though?

I think that would make the most sense, yes.

At the same time, we should get the same-url case figured out, because right now the test are making assertions about it that are just not supported by the spec.

vmpstr commented 5 years ago

It also seems to mean that doing the second img.src = "foo.png" doesn't conceptually change anything.

Why not? Until that happens, the image is pointing to the old data.

I meant to say that if If you have that situation but instead of doing decode() you simply do something like document.body.appendChild(img); and the result is that you see the new image, then whether or not you set the src to the same value doesn't seem to change anything (ie you'd still see the same image). Reading through this again, I don't see how that would be the case though.

I agree if the image itself points to the old data in a way that, when inserted into the document, shows the old contents then we should reject the decode in the same way we would if the url was different.

I think we mostly agree conceptually on what should happen here, but I have a feeling that it might be hard to say in a spec language. Maybe I'm wrong.

I'm still unsure if you think that setting the src to the same value (without the cache manipulation example you provided; ie it really is the same data) should or should not reject any pending promises. What do you think in that situation?

bzbarsky commented 5 years ago

you simply do something like document.body.appendChild(img); and the result is that you see the new image

Why would that be the result? Being appended to the body is not a https://html.spec.whatwg.org/multipage/images.html#relevant-mutations unless there's <picture> involved. It doesn't change the image data being pointed to. Per spec, at least. What browsers do... I suspect Gecko does try to do a load here. But it allows it to short-circuit in some cases before reaching the "check the image cache" step. I can't speak for other browsers.

What do you think in that situation?

I don't have a strong opinion, actually. I mostly want the spec to make sense and browsers to actually be following it. Right now we have neither... What behavior makes the most sense here is entirely unclear to me, especially once srcset and picture start getting involved.

vmpstr commented 5 years ago

I mostly want the spec to make sense and browsers to actually be following it. Right now we have neither...

Yes, I agree. We should fix the spec to be more precise regarding all of the situations we've discussed here.

gterzian commented 4 months ago

I would like to re-start this discussion from a sightly different angle and try to focus on two problems I have noticed:

The in-parallel steps break the guidelines from Dealing with the event loop from other specifications, because they "wait" on a state update on objects from a specific global, and then they also resolve or reject the promise directly from those steps(no task is queued).
The micro-task logic seems to imply that by the time it runs, the current request's state can only be either broken, or partially available, and that any further changes in state will be noted by the parallel steps that are launched from that microtask. But the request can already be unavailable(see step 2 of update-the-image-data, and completely available(see step 7.4 of that same algorithm, when the image is found in the list of available ones).

I propose the following (high-level) fix:

Expand the microtask logic to reject(or resolve) the promise based on an exhaustive check of the relevant state at that point(so include the two points mentioned above, maybe others).
If the relevant state at that point does not allow the rejection or resolving of the promise, then store the promise in an list of "pending decode promises" tied to the element itself.
Whenever another algorithm, such as update-the-image-data makes a relevant mutation, apply that mutation to the current list of "pending decode promises". This would mean either rejecting them then and there, or launching a parallel "decode" operation, which would then have to queue a task to either resolve or reject the promises.
Add a note(where?), that every time a document stops being fully-active, all relevant "pending decode promises" must reject.

One last thing I wanted to clarify based on the discussion above.

In the example below:

img.src = "foo.png";
// Do something here that causes:
// 1. The browser image cache to evict "foo.png", and
// 2. "foo.png" data changes on the server, and
// 3.  The new data is placed in the browser image cache.
img.decode();
img.src = "foo.png";

From your discussion on what should happen with the promise(reject because the request is mutated), it seems to me that you are assuming that what currently are the parallel steps, for example "This img element's current request changes or is mutated", will immediately affect the promise. I don't think it should, because as per the current spec, those "parallel steps" have not been launched yet(they launch from the microtask, which hasn't run yet). So the point of the microtask, I think, is really to ignore this kind of intermediary mutation, and to only take into account the state as it is when the microtask runs(as well as state changes from then on).

whatwg / html

Image decode() algorithm has odd behavior #4217