whatwg / fetch

Fetch Standard
https://fetch.spec.whatwg.org/
Other
2.1k stars 325 forks source link

5.6 garbage collection makes GC observable via service worker & resource timing #810

Open rniwa opened 5 years ago

rniwa commented 5 years ago

5.6. Garbage collection says "Observable through script" means observable through fetch()’s arguments and return value. Other ways, such as communicating with the server through a side-channel are not included."

However, this would reveal GC timing via service worker and resource timing. The resource timing is particularly problematic because it would make it easier to depend on GC timing.

rniwa commented 5 years ago

@youennf @yoavweiss

annevk commented 5 years ago

Sigh. Gotta love GC.

So once you've observed a fetch through resource timing or service workers, GC for objects associated with that fetch ideally wouldn't happen. For resource timing this seems somewhat doable since it's all in the same agent. Crossing agent cluster boundaries though...

How does detection through service workers work? I can think of two channels:

  1. Did it get a fetch event for the fetch.
  2. Was the fetch canceled (I'm not entirely sure if we always report canceled fetches, especially those due to GC, through the cancelation API though).

cc @igrigorik @jakearchibald @jungkees @wanderview

annevk commented 5 years ago

This also affects EventSource and XMLHttpRequest I guess? And perhaps most other APIs that do any kind of fetching.

rniwa commented 5 years ago

The problem with resource timing is that it gets enququed when the request gets canceled or when it completes. Canceling it early because the fetch object is unobservable via GC would make this observable. This is particularly problematic because some scripts might be expecting resource timing entries to be added and act on them upon insertion using PerformanceObserver.

annevk commented 5 years ago

Ah great, so the mere existence of resource timing prevents all early termination of fetches due to GC. (This is somewhat similar to the service worker case though.)

wanderview commented 5 years ago

If I understand correctly this is about fetch() calls where the result is not observed by script and the spec wants to allow the browser to detect this condition to auto-cancel the request. Is that correct?

Does any browser currently try to implement this?

As a side note, it seems like fetch() calls could reasonably have side effects on the server that the browser can't reason about. So I think this kind of auto-cancel might be surprising to developers. (I'm afraid past-me argued for this feature in the past, though.)

annevk commented 5 years ago

@wanderview you're correct and we have similar requirements for WebSocket, XMLHttpRequest, etc. I suspect browsers implement this at least for WebSocket, but that wouldn't be impacted here much as I don't think it's logged in the same way.

youennf commented 5 years ago

AFAIK, WebKit does not implement this (fetch, XMLHttpRequest, EventSource). Quickly looking at WebSocket implementation, it does not seem to be implemented this way either, though that might be interesting in that case.

ricea commented 5 years ago

Chrome won't abort a fetch due to garbage collection. We also aren't able to GC XMLHttpRequest when it is in action.

However, with future refactorings, cancel-on-GC might become the natural semantics.

We also don't aggressively implement the WebSocket GC semantics described in https://html.spec.whatwg.org/#garbage-collection-3, although we probably could without too much trouble.

yutakahirano commented 5 years ago

While it's not implemented I think auto cancellation is useful for users. It is very easy to forget to read all data or abort the pending fetch.

fetch(url); // This will leak!

Please note that, even if you get a response, you need to read all the response body.

let resp = await fetch(url); const headers = resp.headers; resp = null; // This will leak!

I've seen this pattern many times in WPTs.

Also, the backpressure mechanism makes the situation worse, as it will keep the pending fetch indefinitely.

youennf commented 5 years ago

I would believe that a fetch would be collectable at least when the load finishes, no matter whether all data is read or not. I wonder whether some applications actually rely on the current behavior (no GC until load finishes), especially if it is consistent across browsers. I could also see some web apps use 'fetch(url)' in a similar way to beacon.

yutakahirano commented 5 years ago

I would believe that a fetch would be collectable at least when the load finishes, no matter whether all data is read or not.

@youennf, That is what I was mentioning at the last sentence.

Also, the backpressure mechanism makes the situation worse, as it will keep the pending fetch indefinitely.

rniwa commented 5 years ago

Yeah, I'm skeptical that GC stopping a network load is web compatible if all major browsers are currently not exhibiting that behavior.

Is any browser actively pursuing & intending to ship that behavioral change? Perhaps we should remove this clause from the spec until such a time is reached that at least one major browser has shipped this behavior change, and at least one other vendor is moving forward with such a behavior.

annevk commented 4 years ago

So this issue is asking for removal of GC allowances, but Chrome per https://bugs.chromium.org/p/chromium/issues/detail?id=805069 already doesn't always add things to the resource timing registry.

annevk commented 3 years ago

So @noamr and I landed https://github.com/whatwg/fetch/commit/73f01f7b7e51626231e1911390b8dee669090b6d and unfortunately I searched for related open issues afterwards so I didn't see this in time.

It seems that the Chrome bug above suggests that unless you actively do res.blob() or equivalent, no PerformanceResourceTiming object is created. When I talked with @noamr about this I think he said the opposite was happening so maybe that changed in the interim? If it hasn't changed in the interim we need to change the approach for fetch() here.