marcoscaceres opened this issue 4 years ago
On initial read, the Mozilla standards position makes a persuasive case that this is harmful. It creates fingerprinting risk and yet provides info that is not sensibly actionable for a webpage. @hober can you help collect feedback internally so we can come to an official position?
<chair-hat>
Generally, I don't think we should be archiving repos for specifications of shipped features, even if they are only shipped in a single engine.
Software's never "done". Web developers need a place to file issues against such specifications. Those specifications can evolve over time, even if the pace of change can be slowed down once the feature is shipped. We need a place to maintain those specifications. I think it'd be great if it can be here.
At the same time, you're correct that such specifications are no longer being actively incubated. Maybe we can label them as such ("Shipped in a single engine") without archiving?
</chair-hat>
<spec-proponent-hat>
More specifically, reading through the Mozilla position, it seems that Mozilla finds value in at least some of the use-cases that this specification is solving, but disagrees about the way in which it solves them.
I'd have preferred if this feedback came in the form of issues against the spec.
FWIW, I'm more than happy to discuss with Mozilla and others ways in which we can slim down the specification to its core value (IMO, the low-entropy EffectiveConnectionType signal) and potentially add back a signal for metered connections, if we have line of sight to implementing it in a user-meaningful way.
</spec-proponent-hat>
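For context, a minimal sketch of how this signal surfaces to script today, assuming a Chromium-based browser (the only engine family shipping it). The NetworkInformation shape below is hand-written, since the standard TypeScript DOM lib does not declare navigator.connection:

```ts
// Feature-detect and read the Network Information API (Chromium-only).
type NetworkInformation = {
  effectiveType?: 'slow-2g' | '2g' | '3g' | '4g'; // the ECT signal
  rtt?: number;       // estimated round-trip time, in ms
  downlink?: number;  // estimated downlink, in Mbit/s
  saveData?: boolean; // user has opted into reduced data usage
};

const connection =
  (navigator as unknown as { connection?: NetworkInformation }).connection;

if (connection?.effectiveType) {
  console.log(`effective connection type: ${connection.effectiveType}`);
} else {
  console.log('Network Information API not available in this engine');
}
```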
We can certainly try, but we seriously need to scale this back. Can we maybe just try metered (https://github.com/WICG/netinfo/issues/84)?
metered and effectiveConnectionType? :)
(I wrote a long comment on the Mozilla position thread that can be summed up to that)
I don't know... the effectiveConnectionType seems like something we will constantly need to keep adding to... maybe if it was just "slow", "average", "fast" and we can update what those mean from time to time.
Adding new values once every decade (when new cellular technology is introduced) seems easier than changing the semantics of existing values. But let's discuss in #85
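Whatever the enum ends up looking like, the difference matters less if consumers treat unknown values conservatively. A hypothetical sketch (the tier names and the default choice are made up):

```ts
// Map effectiveType values to coarse app behavior so that a newly
// added enum value (say, a future '5g') degrades gracefully instead
// of breaking existing consumers.
function qualityTier(effectiveType: string | undefined): 'low' | 'high' {
  switch (effectiveType) {
    case 'slow-2g':
    case '2g':
    case '3g':
      return 'low';
    case '4g':
      return 'high';
    default:
      return 'high'; // unknown or missing value: pick a sane default
  }
}
```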
@marcoscaceres wrote:
[This incubation] hasn't successfully gained the cross browser support we had hoped for in the last 7 years. Thus, I'd like to propose archiving this incubation.
I agree.
@othermaciej wrote:
On initial read, the Mozilla standards position makes a persuasive case that this is harmful.
Indeed.
It creates fingerprinting risk and yet provides info that is not sensibly actionable for a webpage. @hober can you help collect feedback internally so we can come to an official position?
Our position is that this is harmful as specced, due to privacy and fingerprinting concerns.
Also, this API is probably not what developers want most of the time anyway. Developers want to understand what effective bandwidth is available, and that's best determined by measuring actual bandwidth instead of asking the browser to guess. I suppose developers might also want to know if the connection is or isn't metered.
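As a sketch of the measure-it-yourself approach described here, a page can estimate throughput passively from resources it already loads, via the standard Resource Timing API (no extra probe traffic; the numbers are rough and per-resource):

```ts
// Estimate effective bandwidth from resources the page already loads.
const observer = new PerformanceObserver((list) => {
  for (const entry of list.getEntries() as PerformanceResourceTiming[]) {
    const seconds = (entry.responseEnd - entry.responseStart) / 1000;
    // transferSize is 0 for cache hits and for cross-origin resources
    // served without Timing-Allow-Origin, so skip those.
    if (entry.transferSize > 0 && seconds > 0) {
      const mbps = (entry.transferSize * 8) / seconds / 1_000_000;
      console.log(`${entry.name}: ~${mbps.toFixed(2)} Mbit/s`);
    }
  }
});
observer.observe({ type: 'resource', buffered: true });
```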
@yoavweiss wrote:
Generally, I don't think we should be archiving repos for specifications of shipped features, even if they are only shipped in a single engine.
At the very minimum, specifications of single-engine features should be clearly marked as such. Ideally with a big red modal like WHATWG Review Drafts, stating that the spec documents the implementation of a feature that is implemented by only one engine, that it's unlikely to be implemented elsewhere, and that developers should refrain from using the features defined in it.
Maybe I should file a followup to WICG/admin#64 that suggests this approach more generally.
Developers want to understand what effective bandwidth is available
That's effectiveType.
that's best determined by measuring actual bandwidth instead of asking the browser to guess
If you're suggesting active bandwidth measurements, that's typically harmful both for performance and for users' bandwidth costs.
@hober, are you suggesting a server-side bandwidth measurement? i.e. "how fast am I pushing traffic to this particular client?"
@hober, are you suggesting a server-side bandwidth measurement? i.e. "how fast am I pushing traffic to this particular client?"
Yes.
Server-side bandwidth measurements are impractical for various reasons: there's no reporting infrastructure among all the different sender-side components (application servers, web servers, load balancers, CDN edge servers) that could funnel bandwidth measurements to a single point. Beyond that, some of the use cases call for client-side decisions based on that information. How do you expect to do that with server-side measurement?
In summary, it's extremely hard, if not impossible, to correctly measure bandwidth from the sender(s) on today's Internet. It's significantly easier and cleaner to measure it from the (single) receiver, and act on that measurement from there.
I do not think we should archive repos from WICG that have shipped in browsers - primarily because archiving a repo locks it down as read-only. Issues cannot be created or commented on - and for example, we couldn't have this kind of conversation about what might make this a better, cross-platform-capable API.
It's unfortunate that GitHub doesn't allow labeling (tagging) of repos, because I think that would work pretty well here. We could: 1) put a standard template header in the README.md of all WICG repos that states current status (possibly vendor signals and references?); 2) revamp the WICG.io index to include signals.
I am opposed to archiving as a matter of course.
Maybe there could be a separate WICG-Attic org for specs that are no longer in incubation (and thus not really in scope for WICG any more), but also with no path to get on the standards track? That plus a prominent message along the lines suggested by @hober would allow continued evolution with less potential for creating confusion.
(For this specific spec, if we can change it to something likely to see broader implementation, then of course that would be even better.)
Reading closer, it seems to me that connectionType and effectiveConnectionType are not likely to be useful to apps, or at least won't achieve a positive user outcome on net.

Regarding connectionType, the cellular type spans everything from EDGE to 5G, which is such a wide span of bandwidth and latency characteristics that it's basically useless. Likewise wifi and ethernet, which could cover a very wide range depending on the specific tech and characteristics of the uplink.
The spec suggests that effectiveConnectionType should be "determined using a combination of recently observed rtt and downlink values". If it's meant to be rtt and downlink to the same server, then it's not clear why the server couldn't measure it. Most of the issues raised by @yoavweiss would impact the client's measurement too. If it's meant to be rtt and downlink to any server, then there's a risk it creates an unacceptable side channel. The spec doesn't address side channel risk.
downlinkMax doesn't seem useful as written; no decision should be made based only on the theoretical characteristics of the first hop.
Overall, these things don't seem worth the privacy cost, and the Privacy Considerations section is dismissive about the relevant privacy issues.
Overall, it does not seem like a good idea for web apps (or native apps) to make decisions based on guesses about the network path between client and server. For example, adaptive streaming works without the need for APIs like this because it observes the actual bandwidth/latency and adapts.
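To make the adaptive-streaming comparison concrete, a hypothetical sketch of that pattern: the client picks the next rendition from throughput it has actually observed, with no network API involved (rendition names and thresholds here are invented):

```ts
// Pick a rendition from measured throughput, adaptive-streaming style.
const renditions = [
  { name: '240p', minMbps: 0 },
  { name: '480p', minMbps: 1.5 },
  { name: '1080p', minMbps: 5 },
];

function pickRendition(measuredMbps: number): string {
  let chosen = renditions[0].name;
  for (const r of renditions) {
    if (measuredMbps >= r.minMbps) chosen = r.name; // highest tier we can afford
  }
  return chosen;
}
```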
It does seem like knowing if the connection is metered or not can help a web app make informed decisions that benefit the user. (Assuming the underlying platform reliably knows this info and can share it with the browser.) It's wrong to assume that all cellular connections are metered or that all wifi connections aren't, so that feature isn't actually provided by the spec as it stands.
Reading closer, it seems to me that connectionType and effectiveConnectionType are not likely to be useful to apps, or at least won't achieve a positive user outcome on net.
Can you elaborate on why you think effectiveConnectionType is not likely to be useful?
Evidence suggests otherwise.
Regarding connectionType, the cellular type spans everything from EDGE to 5G, which is such a wide span of bandwidth and latency characteristics that it's basically useless. Likewise wifi and ethernet, which could cover a very wide range depending on the specific tech and characteristics of the uplink.
I don't disagree, and I'm willing to work on removing that from the spec.
The spec suggests that effectiveConnectionType should be "determined using a combination of recently observed rtt and downlink values". If it's meant to be rtt and downlink to the same server, then it's not clear why the server couldn't measure it. Most of the issues raised by @yoavweiss would impact the client's measurement too.
That's not true. It's significantly easier to measure effective throughput on the receiver side than on the sender side. There are multiple layers of "senders": application server, web server, load balancer, network "traffic shapers", CDN edge servers. On top of that, the decoupling of userland code from the TCP stack in the kernel means extra buffering happens at every one of those nodes along the way. Contrary to that, there is only one meaningful "receiver" we want to measure - the browser.
If it's meant to be rtt and downlink to any server, then there's a risk it creates an unacceptable side channel. The spec doesn't address side channel risk.
Currently it is meant as an aggregate of past visited origins, and the risk of cross-origin leaks is mitigated. Do you see risks that those mitigations do not cover?
Aside: We really should outline those mitigations as part of the spec. Apologies for that.
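For illustration only, a hypothetical sketch of the kind of mitigation being alluded to: rounding plus deterministic per-origin, per-session noise, so that repeated reads don't expose raw values and two origins can't line their readings up. The constants and hash below are invented, not what any engine actually ships:

```ts
// Hypothetical rounding + per-origin noise for a reported rtt value.
function noisyRtt(trueRttMs: number, origin: string, sessionSeed: number): number {
  // Derive a stable per-origin, per-session hash (toy implementation).
  let hash = sessionSeed >>> 0;
  for (const ch of origin) hash = (hash * 31 + ch.charCodeAt(0)) >>> 0;
  const noise = ((hash % 21) - 10) / 100; // deterministic, in [-0.10, +0.10]
  const noisy = trueRttMs * (1 + noise);
  return Math.round(noisy / 25) * 25;     // quantize to 25 ms buckets
}
```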
downlinkMax doesn't seem useful as written; no decision should be made based only on the theoretical characteristics of the first hop.
I don't disagree.
Overall, these things don't seem worth the privacy cost, and the Privacy Considerations section is dismissive about the relevant privacy issues.
I believe we can lower the "cost" (by removing the less useful parts). Agree that the Privacy Considerations section can be improved.
Overall, it does not seem like a good idea for web apps (or native apps) to make decisions based on guesses about the network path between client and server. For example, adaptive streaming works without the need for APIs like this because it observes the actual bandwidth/latency and adapts.
Adaptive streaming is indeed an excellent example, as the client is responsible for requesting the adapted stream, based on bandwidth measurements. Its adoption shows that it is useful to adapt the content to available network conditions.
ECT enables something extremely similar for the very different medium of websites: the browser performs bandwidth measurements, and those can inform client- or server-side logic as to what "bitrate level" of experience the user should be provided with.
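On the server-side half of that: Chromium also exposes this signal to servers as the ECT client hint, which a server opts into via Accept-CH. A minimal sketch (Node.js; the tier choice is made up):

```ts
import { createServer } from 'node:http';

createServer((req, res) => {
  // Ask the browser to send the ECT hint on subsequent requests,
  // and tell caches that the response varies on it.
  res.setHeader('Accept-CH', 'ECT');
  res.setHeader('Vary', 'ECT');

  const ect = req.headers['ect']; // e.g. 'slow-2g' | '2g' | '3g' | '4g'
  const quality = ect === 'slow-2g' || ect === '2g' ? 'low' : 'high';
  res.end(`serving the ${quality}-bitrate experience\n`);
}).listen(8080);
```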
It does seem like knowing if the connection is metered or not can help a web app make informed decisions that benefit the user. (Assuming the underlying platform reliably knows this info and can share it with the browser.) It's wrong to assume that all cellular connections are metered or that all wifi connections aren't, so that feature isn't actually provided by the spec as it stands.
I agree. This is being discussed in https://github.com/WICG/netinfo/issues/84
Can you elaborate on why you think effectiveConnectionType is not likely to be useful? Evidence suggests otherwise.
I may be underestimating how accurately it can be computed. But it seems like the computation involves looking at communication to multiple servers (so side channel risk) and is thus potentially inaccurate if the network paths to different servers have different characteristics. A website saying that they want to use it is not very good evidence that it's good for the job.
Currently it is meant as an aggregate of past visited origins, and the risk of cross-origin leaks is mitigated. Do you see risks that those mitigations do not cover?
Aside: We really should outline those mitigations as part of the spec. Apologies for that.
Yeah, I don't think you can expect readers of the spec to know that an open issue with no PR describes crucial mitigations for a privacy problem with the spec. If you're saying that Chrome implemented this and considers it essential, then it should definitely go in the spec.
Unfortunately, the issue itself dives right into describing some mitigations without describing what problem it is trying to solve or how those mitigations address it, which makes it hard to evaluate whether they are enough.
I may be underestimating how accurately it can be computed. But it seems like the computation involves looking at communication to multiple servers (so side channel risk) and is thus potentially inaccurate if the network paths to different servers have different characteristics. A website saying that they want to use it is not very good evidence that it's good for the job.
What we have is web developers saying that they are using it (in browsers where this is shipping) to provide improved analytics and differential content serving and experiences based on it. So I think it's safe to assume it's doing a reasonable job.
Regarding measurements from different servers, that's definitely prone to be skewed (e.g. if bandwidth to one server is significantly lower than to others), but the underlying assumption is that the last-mile is typically the bottleneck, at least in the cases we care about (slow networks).
I'd love to dive into the side-channel risk you mentioned and better understand it. What exactly is the concern? Outlining the threat model would help us assess whether the current mitigations are sufficient.
Mozilla has pref'ed off Netinfo on Android: https://groups.google.com/a/mozilla.org/g/dev-platform/c/u1QiOGUIUfk/m/B1MnwUyuCAAJ
I opened issue #91 to push for one or more of the non-archiving, document-in-place options mentioned here to actually be done for this repo.
But after thinking about it a while, I am warming to the suggestion in https://github.com/WICG/netinfo/issues/82#issuecomment-618628457 of having a separate org for WICG specs that are no longer in incubation.
I think it dilutes the incubation intent of WICG to have it host single-implementation things that no longer have a path to standardization. In the same way that WICG specs that successfully incubate get handed off to a standards group, WICG specs that do not gather multi-vendor interest should be handed off somewhere else to document the implementation. WICG should have a narrow focus on specs that still have a chance to succeed as standards.
Looks like we have at least a tendency toward agreement on proceeding with limiting the spec to just effectiveConnectionType (with modifications to how it works today, see #85) and the introduction of a metered flag (#84).
The Save-Data header and the prefers-reduced-data CSS user preference media feature seem to solve for the other use cases for both client and server.
I see no harm in leaving saveData in the spec. It is different from metered in that you may just want to be a good citizen and not tax a shared Wi-Fi too much, e.g., in a shared apartment.
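A sketch of checking both signals mentioned above; note that prefers-reduced-data is still experimental and may not match in any current browser, and navigator.connection remains Chromium-only:

```ts
// Respect both the Save-Data flag and the prefers-reduced-data
// media feature when deciding how heavy an experience to load.
const prefersReducedData =
  window.matchMedia('(prefers-reduced-data: reduce)').matches;
const saveData =
  (navigator as unknown as { connection?: { saveData?: boolean } })
    .connection?.saveData === true;

if (prefersReducedData || saveData) {
  // Load lighter assets, skip prefetch/preload, reduce image quality, etc.
}
```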
I think this incubation might have run its course as it hasn't successfully gained the cross browser support we had hoped for in the last 7 years. Thus, I'd like to propose archiving this incubation.
Mozilla remains fairly opposed to this work (with the rationale being that it is "harmful"): https://github.com/mozilla/standards-positions/issues/117
I'm not sure WebKit folks have taken a stand on it. @othermaciej?
cc @WICG/chairs