whatwg / fetch

Fetch Standard
https://fetch.spec.whatwg.org/

Consider adding load balancing / failover to Fetch? #775

Open dcreager opened 6 years ago

dcreager commented 6 years ago

For background, the Reporting API defines a load balancing / failover mechanism for uploading reports to a collector. This mechanism is implemented client-side: the user agent receives a prioritized and weighted list of URLs (inspired by DNS SRV records) and chooses one according to those instructions at upload time. The Network Error Logging spec, in particular, needs the load balancing / failover to happen on the client instead of relying on existing server-side techniques, since its signal is most useful in the presence of partial network connectivity issues that would prevent report uploads from reaching a single canonical upload URL. Because the user agent has already received the list of failover URLs before the connectivity issue begins, it has several options that hopefully let it upload the reports "around" the connectivity issues.
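
As a rough illustration of the selection step described above, here is a minimal Python sketch of SRV-style selection: take the lowest-numbered priority tier that still has a usable endpoint, then pick within that tier at random in proportion to weight. This is not the Reporting API's normative algorithm; the `Endpoint` type, the `choose_endpoint` function, the `failed` set, and the example URLs are all hypothetical names introduced for illustration.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    url: str
    priority: int  # lower value is tried first, as in DNS SRV
    weight: int    # relative share of traffic within one priority tier


def choose_endpoint(endpoints, failed=frozenset(), rng=random):
    """Return one endpoint, or None if every endpoint is marked as failed."""
    # Skip endpoints the caller has already seen fail (assumed bookkeeping).
    candidates = [e for e in endpoints if e.url not in failed]
    if not candidates:
        return None

    # Failover: only the best remaining priority tier is eligible.
    best = min(e.priority for e in candidates)
    tier = [e for e in candidates if e.priority == best]

    # Load balancing: weighted random choice within the tier.
    # random.choices rejects all-zero weights, so fall back to a uniform pick.
    if sum(e.weight for e in tier) == 0:
        return rng.choice(tier)
    return rng.choices(tier, weights=[e.weight for e in tier], k=1)[0]


endpoints = [
    Endpoint("https://a.example/reports", priority=1, weight=1),
    Endpoint("https://b.example/reports", priority=1, weight=3),
    Endpoint("https://backup.example/reports", priority=2, weight=1),
]

first = choose_endpoint(endpoints)
# If both priority-1 uploads fail, the backup tier becomes eligible.
fallback = choose_endpoint(
    endpoints,
    failed={"https://a.example/reports", "https://b.example/reports"},
)
```

In this sketch the caller owns the `failed` set, so the policy for when an endpoint is retried again is deliberately left out; that retry policy is a large part of the complexity a general-purpose version would have to specify.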

@mnot suggested in w3c/reporting#93 that this logic might be more generally useful. For now, we're keeping the logic defined in Reporting, but if other people / specs would find this logic useful, we could consider adding it to Fetch proper.

DNS SRV records offer exactly the semantics we need, although communicating this information through DNS might not be secure enough; we'd probably want to limit this to secure origins. @mnot also proposed communicating the list of load balancer / failover URLs via Origin Policy.

sleevi commented 6 years ago

As noted, expressing this via SRV is undesirable for a host of reasons - most notably, in practice, it does not work with any degree of fidelity from clients. This is something on which Chrome has "strong negative signals" - we've closed those feature requests as WontFix rather intentionally.

In general, the cost of making something "generally useful" is that it introduces a host of other concerns. I would think a very clear explainer doc covering the use cases beyond NEL should be written before having more concrete discussions on this, in order to guide the discussions and the priorities. The state machines that implementations and developers would need to maintain to handle this effectively on the client side would be quite complex (for both parties), and it is not clear that users would concretely benefit from this - indeed, if the edge cases are ample enough, the user experience may be actively harmed in the aggregate.