whatwg / html

HTML Standard
https://html.spec.whatwg.org/multipage/
Other
8.03k stars 2.63k forks source link

Lazy load intersection-checking should not be done in-parallel #5236

Closed domfarolino closed 4 years ago

domfarolino commented 4 years ago

This is a follow-up issue for https://github.com/whatwg/html/pull/3752#discussion_r356604500. Basically it seems not great that we keep running the will lazy load image steps, which access main-thread state, from an in-parallel context. I see two ways around this:

  1. Observe the image with an intersection observer, so that the image intersection-checking steps are run as part of the update-the-rendering-steps, specifically as part of the run-the-update-intersection-observations-steps
    • We'd then need a way to react to "positive" intersections, which would entail resuming the suspended updating-the-image-data algorithm associated with the now-intersecting image
  2. Create something new in the update-the-rendering-steps that checks if any lazy-loaded images associated with a document intersect with the veiwport, and if so, "resume" the suspended updating-the-image-data algorithm for the image
    • This seems to require some per-document(?) list of lazy-loaded images, which would be populated when the decision of load deferral is made in the updating-the-image-data steps
    • This list would be consulted in the update-the-rendering-steps, and intersecting images would be resumed, and removed from the list
    • We could generalize this by making the list of lazy-loaded images instead a list of lazy-loaded elements. Each item in the list could be a tuple <element, lazy condition, resumption steps>, and we could reuse this infrastructure for lazy-loaded iframes as well

Creating something new (the second proposal above) is probably preferable, but might have a very small compat impact. Chrome currently implements lazy-loaded images and iframes with an intersection observer under the hood. Since the "new" steps that we'd add to the update-the-rendering-steps would run at a different time than the run-the-update-intersection-observations-steps, I wonder if it would be possible to observe this (that is, observe the ordering between IO callbacks and image load resumption).

@annevk What do you think about (2) above?

annevk commented 4 years ago

Why do you think 2 is preferable? Firefox will also do something akin to 1 as I understand it.

cc @hiikezoe

domfarolino commented 4 years ago

The reason I preferred (2) is because I remember there was some discussion on IRC about whether we should reuse intersection observer, or "piggy-back off of the async machinery" separately, and from what I remember Domenic preferred the latter, so I gathered that not depending on Intersection Observer might be the best way forward.

If you think we should do (1), I can focus on that after we get the main Lazyload PR merged.

annevk commented 4 years ago

Using intersection observer as our building block for all things "intersect the viewport" makes the most sense to me, but would be great to get @domenic's perspective too.

domenic commented 4 years ago

My original semi-objection to using intersection observer is that it contains both more and less than I was hoping it would.

In particular, the actual "does it intersect" step is vague: this algorithm step 4 delegates to this algorithm step 5: "intersecting it with the root intersection rectangle". That's no better than HTML's current "intersect the viewport", so when we initially we were trying to call out to the IO spec just to do a one-time determination of whether it intersects, I said we should stop doing that. We can just use the 1 core step, or its HTML equivalent, insteed of the surrounding 21 steps.

This leaves the infrastructure for tracking and notifying about interesections. That consists of another ~30 steps that are mostly not useful for our use case; they do a lot of calculation and DOMRectReadOnly construction and getBoundingClientRect() calls and so on, culminating in queue an IntersectionObserverEntry which then does another ton of work all just to call a callback, with a bunch of computed values which in our case we don't need.

If we were writing this from scratch in the manner of (2) I think we would end up with a much shorter and cleaner algorithm, which does not manipulate a lot of platform objects, in order to queue the relevant task. If we go with (1) we're going to have a fun time creating synthetic IntersectionObserver objects, DOMRects, and callback functions. It's messy.

So, with that all out of the way: probably (1) is still better, especially if that's the implementation plan. Creating two parallel systems just to avoid using the messy one is not a great strategy.

And, there are some steps in the IO spec that are good to reuse. For example, the handling of clip-path, or nested browsing contexts if we extend this to iframes. So that's another argument for reuse, i.e. for (1).

annevk commented 4 years ago

I see, I do agree with that. Perhaps IO should have a way for spec-level intersections that do not require as much non-spec infrastructure, so to say. We mostly care about there being a single place to define and expand on the intersection primitive and being in charge of relative ordering of algorithms to be invoked.

domfarolino commented 4 years ago

Thanks a lot for the input. It looks (1) is the way to go, which means the will lazy load image steps will create and register some synthetic IntersectionObserver (even if this is unwieldy) which will be responsible for queueing a resumption task when the image finally does come into view. We still need a way to perform an on-the-spot intersection test to even know if we should create and register the IO. It seems the only way to do this is use HTML's intersects the viewport dfn, which should be fine here. However, it sounds like this on-the-spot check will not cut it for iframes when that is introduced eventually, given:

For example, the handling of clip-path, or nested browsing contexts if we extend this to iframes. So that's another argument for reuse, i.e. for (1).

Regarding:

Perhaps IO should have a way for spec-level intersections that do not require as much non-spec infrastructure, so to say

This would be great. Maybe once we get this done, we can summarize specifically what would be nice for HTML to not have to worry about, and file a bug on IntersectionObserver so it can expose something cleaner that meets our needs, and potentially the needs of future primitives that require the same thing.

domfarolino commented 4 years ago

One complication came to mind if we try to do refactoring via (1), reusing the run-the-update-intersection-observations-steps in update-the-rendering-steps. The resumption of #updating-the-image-data is not just dependent on whether the image intersects the viewport. As currently specced, resumption is a function of (a) viewport intersection (b) loading attribute state (c) script-enabled-ness.

It seems tricky to me to spec (1) in a way that also considers (b) and (c), because I think we'd end up having to add something to the update-the-rendering-steps anyways, that checks for conditions (b) and (c) every time. Any thoughts?

annevk commented 4 years ago

I think (c) is constant so you can move that out. And when (b) changes you'd manipulate whether update-the-rendering checks for intersection or not, presumably?

domfarolino commented 4 years ago

I think (c) is constant so you can move that out

I was hoping that was the case.

And when (b) changes you'd manipulate whether update-the-rendering checks for intersection or not, presumably?

Originally I was thinking we wouldn't directly respond to setting the loading attr, but each iteration of update-the-rendering would see the latest attribute value, and determine whether the load should remain deferred, or be resumed via queuing a task. But I think you're saying we can just directly respond to loading=lazy => loading=eager by removing the associated intersection observer (if one exists, i.e., the image load is deferred), and queuing a task to resume loading?

annevk commented 4 years ago

Yeah. Probably have to double check that covers all code paths and what we want, but that's what I was thinking.

domfarolino commented 4 years ago

From https://github.com/whatwg/html/issues/5236#issuecomment-582338337:

We still need a way to perform an on-the-spot intersection test to even know if we should create and register the IO.

Currently the decision to defer an image load takes into consideration both the loading attribute value, and the in-viewport-ness of the image, via "intersect the viewport". The plan was to maintain this through the refactoring, and make the attachment of an intersection observer conditional on both the loading attribute value, and a one-time consultation of "intersect the viewport".

This is nice, because it means an in-viewport loading=lazy image would effectively load eagerly, at the same time it would if it were loading=eager. However, it's come to my attention that Chrome does not take into consideration the in-viewport-ness of an image when the load deferral decision is made. Only the loading attribute value is considered, because the browser is not likely able to make an accurate determination as to whether an image is in-viewport or not during #updating-the-image-data steps, because it is possible that the first render has not taken place yet.

With this, I'd like to know if it seems reasonable to make the attachment of a synthetic intersection observer conditional only on the loading attribute value, and stop consulting "intersect the viewport" altogether. Does this seem reasonable from a spec POV? It might also be nice to hear from other implementers (@hiikezoe / @smaug---- ?) to see if they'd naturally do this, and if it makes sense?

zcorpan commented 4 years ago

I think that is very reasonable. So the effect would be that a loading=lazy image will always defer loading until "update the rendering", and at that point make a decision about whether to load lazy images?

domfarolino commented 4 years ago

Yep

emilio commented 4 years ago

That also matches our implementation as far as I can tell.