w3ctag / privacy-principles

https://w3ctag.github.io/privacy-principles/

Rewrite the ancillary data section. #359

Closed: jyasskin closed this 11 months ago

jyasskin commented 11 months ago

Hopefully this satisfies both the WebPerf and Privacy goals...

Sorry to Amy for probably undoing their readability fixes. 🫣

Fixes #220.



npdoty commented 11 months ago

#220 is an issue about revising the examples in this section, and #221 is a PR that would address that issue.

Is this rewrite addressing some other issue? Or a series of issues, given how completely it changes the existing text?

jyasskin commented 11 months ago

Unfortunately, #221 created a conflict with the WebPerf WG. The old text appeared to say that if an API was primarily good for ancillary uses, then it was producing ancillary data, which meant most of the WebPerf APIs produced ancillary data. The example didn't help, since it didn't mention any concrete data. In the cases where WebPerf were just summarizing data produced by other web APIs, they didn't think it was reasonable to "aggressively minimize" the data without being able to use research or heuristics to decide some of it was ok.

@yoavweiss pointed out that it works better to focus on the source of the data, and less on how it's processed or summarized. He pointed out particular reporting APIs that expose new ancillary data, and others that merely summarize non-ancillary data that's available from other sources. That let me improve the example to name concrete telemetry APIs that do or don't return ancillary data. It also let me lay out a decision tree for API designers and reviewers to follow in order to figure out how to treat their new API. Listing the concrete options for API designers did take more space than generally describing what users might want, but I think it's more useful too.

npdoty commented 11 months ago

Could we open an issue that describes the concerns that WebPerf would have, with either the existing text or with the text in #221?

My understanding is that performance analysis was widely agreed to be an ancillary use of data. And that https://www.w3.org/TR/privacy-principles/#information was specifically written, directly after conversations with WebPerf folks, to address the distinctions between data available from other/existing APIs.

jyasskin commented 11 months ago

I've asked @yoavweiss to organize an issue or a comment on the existing issues from WebPerf's perspective, and he thinks he'll get to that tomorrow.

Performance analysis is an ancillary use, but that doesn't make all of its inputs into ancillary data.

The Ancillary uses section seems to be trying to add restrictions on top of https://www.w3.org/TR/privacy-principles/#information. If it weren't adding restrictions, we could delete the whole section. But I think a few additional restrictions are useful, and that the principles I've added in this PR will lead to concrete privacy improvements in some of WebPerf's APIs.

Your original text had to be somewhat vague about this because we hadn't yet figured out how the WebPerf folks were thinking about their designs. Now that we have the distinction that some of their APIs are streamlining existing uses of APIs, while others are providing new capabilities, we can make the text more concrete. We should be willing to make more than local tweaks in that direction.

yoavweiss commented 11 months ago

Thanks @jyasskin!! (and apologies for the delay)

From the WebPerfWG's perspective, the previous principles were vague in ways that made their application open to interpretation, which I felt could lead to future disagreements.

IMO, the issue stems from defining ancillary data based on its "primary" use, regardless of the potential reasons why the data is exposed for non-ancillary purposes. Jeffrey's modifications on that front are very much welcome, as they create a clear delineating line between data that is already exposed for functional reasons, and data that is only exposed for ancillary uses.

That kind of principle can enable us at the WebPerfWG to clearly mark the APIs we're working on and their different attributes as "ancillary" vs. "non-ancillary", and think of ways in which we can work on migrating the ancillary data to safer options (e.g. aggregated exposure) without harming the use cases it tackles.

jyasskin commented 11 months ago

Major discussion in https://github.com/w3ctag/privacy-principles/blob/main/meetings/2023-10-04-minutes.md, which I'll try to apply to a new PR.

pes10k commented 11 months ago

@jyasskin adding my three comments here, though happy to move them to the follow-up PR too if they're applicable.