Analytics mode - Githubissues

pes10k commented 4 years ago

There are lots of APIs that both (i) carry concrete privacy risk, and (ii) are useful in a minority of cases for debugging site, network, or client issues.

This has caused lots of disagreement, heat and problems in privacy reviews.

Having an explicit "I am in debug mode, and I'd like to enable all the privacy-risky features to help fix this problem, but just for a bit" option would help cut the knot.

othermaciej commented 4 years ago

Can you give some examples of the kinds of APIs you’re thinking of?

pes10k commented 4 years ago

Sure! The two off the top of my head (I'm sure there are more, will keep braining):

- Everything behind the Reporting API (some of it should just not be in the platform, but some seems like it could be useful to include w/ consent, like out-of-memory reports, CSP reports, maybe intervention reports)
- WebRTC stats API

othermaciej commented 4 years ago

For some APIs like this, we've argued that the feature should be provided by developer tools, not as part of the web platform. But it's argued that they are needed in the field for telemetry purposes. Do you think Debug Mode will overcome such objections?

pes10k commented 4 years ago

I think a debug mode would allow us to split the knot. I am 100% certain that some parties will want every bit of info they can get, for a variety of purposes, but I think an explicit "debug" mode:

(i) is something users could actually understand
(ii) could actually help unbreak sites and help developers
(iii) wouldn't harm the privacy of the platform (e.g., the web w/ the defaults checked)

So yes, I think having an "on in debug mode" option would flip the majority in a number of standards conversations.

pes10k commented 4 years ago

Other APIs that would make sense here are basically everything that hangs off performance.*

othermaciej commented 4 years ago

Yeah, performance.* is exactly where I’m not sure we can get agreement, because the stated use case is gathering of field data across the whole user population, not lab testing or debugging an individual user’s problem.

(There might be solutions to mass telemetry that don’t leak individual user data, but asking lots of users to turn on debug mode probably wouldn’t cut it.)

pes10k commented 4 years ago

For my two cents, gathering of field data across the whole user population (w/o user consent or knowledge) should explicitly be a thing PrivacyCG / PING works to prevent. Turning every browser on the web into your debugging agent is not user-serving, and is tangled up w/ all sorts of privacy harm.

If we had a debugging mode, though, we could at least have a standard way for automation, consenting users, etc. to interact with these features.

othermaciej commented 4 years ago

Hmm. I'm thinking about how this is usually done for native apps or operating systems. Usually there is a one-time user choice to opt in or out of analytics. I could imagine a similar one-time user choice that gates websites' access to all performance APIs and similar things. Or browsers could choose to make it a per-site preference.

An opt-in analytics/telemetry mode might be a different thing than a debug mode, if some web platform features truly exist for the purpose of web developers debugging live, rather than analyzing broad field data.
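
To make the one-time versus per-site distinction concrete, here is a minimal sketch of how such a gate might be modeled. Everything in it is hypothetical: `TelemetryPrefs`, `Choice`, and `mayExposePerformanceData` are illustrative names that do not exist in any browser or spec, and the rule that a per-site choice overrides the global one is an assumption, not something settled in this thread.

```typescript
// Hypothetical model of an opt-in gate. None of these names exist in any
// browser or specification; they only illustrate the idea discussed above.
type Choice = "granted" | "denied" | "unset";

interface TelemetryPrefs {
  // One-time, browser-wide choice (similar to the analytics prompt in native apps).
  global: Choice;
  // Optional per-site overrides, keyed by origin.
  perSite: { [origin: string]: Choice | undefined };
}

// Returns true only when the user has explicitly opted in.
// Assumption: an explicit per-site choice takes precedence over the global one.
function mayExposePerformanceData(prefs: TelemetryPrefs, origin: string): boolean {
  const site = prefs.perSite[origin];
  if (site === "granted" || site === "denied") {
    return site === "granted";
  }
  // "unset" is treated the same as "denied": the mode is opt-in, not opt-out.
  return prefs.global === "granted";
}
```

With this model, a user who never answered the prompt exposes nothing, and a user who opted in globally can still deny an individual site.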

pes10k commented 4 years ago

I don’t have strong feelings about whether it should be one toggle or two, as long as:

1) either / both are opt-in
2) we can keep widespread data collection out of the common path
3) we can enable cases where the data is highly useful

TanviHacks commented 4 years ago

@pes10k @othermaciej - Would you like time on the teleconference this week to discuss this proposal?

pes10k commented 4 years ago

Sure, I think that'd be fine. I think the conversation could be very brief too, fwiw.

jameshartig commented 4 years ago

We use performance.entries to collect script timing data from a small (random) percent of traffic in order to understand real-world performance of our platform. I think a way for a user to opt out of (or opt in to) telemetry, like other platforms, makes sense. Additionally, if there's a particular issue in a particular region, we might temporarily increase the percent in that region to gather more useful statistics for aiding in debugging or root cause analysis.

Getting rid of this access completely would be detrimental to us, but I think there are definitely ways we could make it more privacy-respecting. Would it be acceptable to somehow force telemetry to be crossorigin=anonymous, or to offer a different way (an HTTP header or similar) to opt in to receiving a sampled set of performance metrics in a privacy-respecting way? We don't care about other entries and how they perform, only our own.
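
As a rough illustration of the sampling approach described here, the sketch below decides whether a page load is in the sample and keeps only first-party entries. The names `inSample`, `ownEntries`, and `TimingEntry`, the region multiplier, and the example rates are all assumptions made for illustration, not the commenter's actual implementation. In a browser, the entries would typically come from `performance.getEntriesByType("resource")`.

```typescript
// Hypothetical sketch; names, rates, and the opt-in flag are illustrative assumptions.
interface TimingEntry {
  name: string;     // resource URL, as on a PerformanceResourceTiming entry
  duration: number; // milliseconds
}

// Decide whether this page load is in the telemetry sample.
// `roll` is a number in [0, 1), e.g. Math.random(); it is passed in so the
// decision is deterministic and testable.
function inSample(
  userOptedIn: boolean,
  baseRate: number,    // e.g. 0.01 for 1% of traffic
  regionBoost: number, // e.g. 10 to temporarily raise sampling in one region
  roll: number
): boolean {
  if (!userOptedIn) {
    return false; // never collect without the user's opt-in
  }
  const rate = Math.min(1, baseRate * regionBoost);
  return roll < rate;
}

// Keep only entries served from our own origin, dropping third-party resources.
function ownEntries(entries: TimingEntry[], ownOrigin: string): TimingEntry[] {
  const prefix = ownOrigin + "/";
  return entries.filter((e) => e.name.indexOf(prefix) === 0);
}
```

Passing the random roll as a parameter keeps the sampling decision reproducible, and filtering by origin prefix reflects the point that only the site's own entries are of interest.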

erik-anderson commented 2 years ago

@pes10k is this an area you think folks would like to come back to? There are various APIs that might interact with such an idea; are there any that are actively looking for such a mode?

pes10k commented 2 years ago

I still think this would be a good and useful feature on the Web, but I am not aware of any implementor interest. I can think of a number of features that could hook into it (Reporting API, Resource Timing, Performance API, etc.), but the groups authoring those specs do not seem interested either, for the most part (though I believe Yoav expressed they might be open to discussing more at one point; my memory could be wrong…).

Anyway, all that is to say, I still think this would be a good feature for PrivacyCG, but it would probably be more successful if thought of as coming from PrivacyCG/WG and applied to other specs, rather than originating from other specs. If other vendors or PrivacyCG members feel similarly, though, I'd be very happy to work on this.

privacycg / proposals

Analytics mode #9