WICG / turtledove

TURTLEDOVE
https://wicg.github.io/turtledove/
Other
519 stars 219 forks source link

Update interest group algorithm sometimes runs on main thread, and hangs it #1221

Open domfarolino opened 1 month ago

domfarolino commented 1 month ago

This instance of "update interest groups" runs on the main thread (in response to an abort signal aborting). But "update the interest groups" hangs whatever thread it is on, by fetching in parallel and then manually waiting for the response. This technically hangs the main thread which is really bad. I'm positive the implementation doesn't do this, so we should fix the spec to match whatever the implementation does here. Maybe queue a task to run the interest group update algorithm, for example?

MattMenke2 commented 1 month ago

Nothing ever waits on "update interest groups" to complete. Can we have the entire update itself run off of the main thread, making this a moot point? I see the fetch spec talks about "waiting" for, e.g., "all the HTTP headers [to be] transmitted" - hrm, not actually seeing anything that makes the entire network stuff async, actually.

I think it would be better to just pretend this has nothing to do with the main thread, than having to worry about this in all the update code. We don't parse the update JSON in the main process, anyways, so that's another non-blocking hop. I don't think we want that in the spec, so I think we should invest in accurately reflecting every point of asynchrony, but should instead make the spec flexible enough that this does not matter.

domfarolino commented 1 month ago

Nothing ever waits on "update interest groups" to complete.

It's on the call stack which originates from the main thread, so the main thread waits on it implicitly.

Can we have the entire update itself run off of the main thread, making this a moot point?

Yep, that works. You'd either go "in parallel" in each of "update interest groups" callers, or just at the very top of the algorithm itself.

hrm, not actually seeing anything that makes the entire network stuff async, actually

https://fetch.spec.whatwg.org/#main-fetch:~:text=Note%20that%20the,set%20to%20true.

I think it would be better to just pretend this has nothing to do with the main thread, than having to worry about this in all the update code. [...]

I'm not sure I understand the proposal there... but I do know that the current spec does a ton of stuff very clearly on the main thread that it shouldn't, and we should probably branch off in parallel before, if that's what we really want, and if that's what the implementation does. I think that's a pretty straightforward solution.

MattMenke2 commented 1 month ago

Think of the update as behaving a bit like a DNS prefetch - it does work that may be useful in the future, but the update itself doesn't affect anything except what happens in the future. It doesn't affect the auction that triggered it, it instead may affect the next auction (If the update has completed in time). It's not an optimization, unlike a DNS prefetch, though.

Anyhow, I think an "in parallel" for the whole body of "update interest groups" makes sense, though I guess there's an update of the browser's centralized database at the end that we may have to queue a task for or something.

I'll take a swing at fixing this in the spec later this week.

FLEDGE does currently manage auctions mostly run on the browser's main thread (though does fetching and running JS in a separate process), though I'd like to change that, and the spec probably shouldn't present it as doing that.

domfarolino commented 1 month ago

though I guess there's an update of the browser's centralized database at the end that we may have to queue a task for or something.

This part is definitely important to get right. If multiple instances of this task have to be ordered with respect to some central mechanism, then you'll need to not use transient, ephemeral "in parallel" invocations, but you'll want a named parallel queue, like https://html.spec.whatwg.org/C#tn-session-history-traversal-queue, which is referenced from various places etc.

MattMenke2 commented 1 month ago

None of the updates are for the same interest group - if there are multiple update for the same IG, we have logic to only update once, so their order with respect to each other doesn't matter, though their behavior if they're trying to update an IG that had been left and/or rejoined since the update started is certainly relevant.

Looks like the spec doesn't currently cover how this is handled, so will need to spec out that, too, which should presumably happen before we fork off a thread. Anyhow, thanks for bringing that up! Lots of stuff to fix.

dmdabbs commented 1 month ago

so their order with respect to each other doesn't matter

As I understand current post-auction behavior, Chrome will update all of my update-eligible IGs provided it receives each updateURL's reply within the 30 second limit. Is that correct? If there is any limit to post-auction IG updates you can issue (or even if there is not), you might consider using IG priority to order them. B&A is ordering by priority when sellers limit buyers' IGs for shippping off-device.

MattMenke2 commented 1 month ago

I'm not sure about a 30 second time limit - I thought we just updated the IGs of all owners that were in the auction (with update rate limiting to prevent always updating every IG). Fetches themselves have a 30 second timeout, but I'm not aware of a 30 second total limit for updates. That's not to say there isn't one, just if there is, I'm not familiar with it.

I don't believe there's a limit on number. We order them by owner, and priorities don't apply between owners.

dmdabbs commented 1 month ago

Fetches themselves have a 30 second timeout, but I'm not aware of a 30 second total limit for updates.

Sorry if my wording implied a total limit. I know its only a per-fetch timeout:

The updates are rate limited to running at most daily to conserve resources, and timeout after 30 seconds;

Matt We order them by owner, and priorities don't apply between owners.

Yes, if priority were to be applied it would only make sense within the owner.