w3c / reporting

Reporting API
https://w3c.github.io/reporting/

Privacy concerns regarding Reporting API #169

Closed: pes10k closed this issue 3 years ago

pes10k commented 5 years ago

I think this looks very promising, and I'm grateful to y'all for putting it together! I do see a couple of privacy-concerning issues, though, that I'd like to work through / address:

1. I have the same concerns as @annevk and @johnwilander in https://github.com/w3c/reporting/issues/158. Report lifetimes should be tied to the reporting document (e.g. if I panic and think a page is doing something wacky, I should have confidence that the page loses control when I close the tab, etc.)
2. Will reports be exposed to webExtension APIs, for extension-controlled blocking and filtering?
3. Many privacy-preserving tools block resources on the basis of 1p vs 3p communication. There should be some way of mirroring this information to other decision points (e.g. an extension should see both the destination of the report and the source of it, and be able to say yes / no accordingly)
4. Tying valid endpoints to Origin Policy seems a promising long-term option, but in the meantime, the Reporting API should be limited to eTLD+1 (or similar) endpoints, since some reports (e.g. bodies of CSP violations) can be used to share identifying tokens / track cross-origin.
5. What information travels with the report, as described in the standard? Most importantly, I can't tell if cookies should be transmitted (and for vendors that double-key storage, or otherwise constrain storage, what's the origin of the request)?
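To make point 4 concrete, the proposed restriction amounts to comparing registrable domains (eTLD+1) of the reporting document and the endpoint. A minimal sketch, as toy code only: a real implementation needs the full Public Suffix List, and the suffixes and hostnames below are purely illustrative:

```javascript
// Toy sketch of the eTLD+1 ("registrable domain") restriction in point 4.
// A real implementation would consult the full Public Suffix List;
// this hardcoded set exists only to illustrate the comparison.
const PUBLIC_SUFFIXES = new Set(['com', 'org', 'co.uk']);

function registrableDomain(host) {
  const labels = host.split('.');
  for (let i = 1; i < labels.length; i++) {
    if (PUBLIC_SUFFIXES.has(labels.slice(i).join('.'))) {
      // Registrable domain = public suffix plus one more label.
      return labels.slice(i - 1).join('.');
    }
  }
  return host;
}

// Would a report stay within the reporting document's site?
function endpointAllowed(documentHost, endpointHost) {
  return registrableDomain(documentHost) === registrableDomain(endpointHost);
}

console.log(endpointAllowed('www.example.com', 'reports.example.com')); // true
console.log(endpointAllowed('www.example.com', 'collector.org'));      // false
```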

igrigorik commented 5 years ago

Thanks for the feedback, inline..

  • I have the same concerns as @annevk and @johnwilander in #158. Report lifetimes should be tied to the reporting document (e.g. if I panic and think a page is doing something wacky, I should have confidence that the page loses control when I close the tab, etc.)

Since we already have a conversation on this in #158, let's continue this particular discussion there.

  • Will reports be exposed to webExtension APIs, for extension-controlled blocking and filtering?
  • Many privacy-preserving tools block resources on the basis of 1p vs 3p communication. There should be some way of mirroring this information to other decision points (e.g. an extension should see both the destination of the report and the source of it, and be able to say yes / no accordingly)

Good questions. We don't currently say anything about extension APIs, and the challenge here is that there is — afaik — no shared standard or spec for webExtension APIs across different browsers. That said, Reporting relies on Fetch for delivery, and my intuition is that if someone were to attempt defining what such an API should be able to see, it ought to be integrated and done at the Fetch layer. For reporting requests in particular we set destination=reporting, so one can detect them as such within Fetch.
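As a sketch of how a blocking/filtering layer sitting at Fetch could use that signal (plain objects stand in for requests here, the `'reporting'` destination string follows the comment above, and the blocklist hostname is a made-up example):

```javascript
// Sketch: a Fetch-layer filter singling out report uploads by their
// `destination`, as described above. Request objects are plain
// stand-ins; the blocklist is hypothetical.
const BLOCKED_COLLECTORS = new Set(['tracker.example']);

function shouldBlock(request) {
  if (request.destination !== 'reporting') return false; // not a report upload
  const host = new URL(request.url).hostname;
  return BLOCKED_COLLECTORS.has(host);
}

console.log(shouldBlock({ destination: 'reporting', url: 'https://tracker.example/r' })); // true
console.log(shouldBlock({ destination: 'image', url: 'https://tracker.example/px.gif' })); // false
```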

  • Tying valid endpoints to Origin Policy seems a promising long-term option, but in the meantime, the Reporting API should be limited to eTLD+1 (or similar) endpoints, since some reports (e.g. bodies of CSP violations) can be used to share identifying tokens / track cross-origin.

I don't follow: how do Origin Policy and limiting to eTLD+1 endpoints relate to each other?

  • What information travels with the report, as described in the standard? Most importantly, I can't tell if cookies should be transmitted (and for vendors that double-key storage, or otherwise constrain storage, what's the origin of the request)?

You can see all the details here: https://w3c.github.io/reporting/#try-delivery. Re, cookies: see https://github.com/w3c/reporting/issues/161.
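For reference, endpoint configuration is declared via the `Report-To` response header, roughly along these lines (illustrative values only; the group name, max_age, and collector URL are made up):

```http
Report-To: { "group": "default",
             "max_age": 86400,
             "endpoints": [{ "url": "https://collector.example/reports" }] }
```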

pes10k commented 5 years ago

Good questions. We don't currently say anything about extension APIs, and the challenge here is that there is — afaik — no shared standard or spec for webExtension APIs across different browsers. That said, Reporting relies on Fetch for delivery, and my intuition is that if someone were to attempt defining what such an API should be able to see, it ought to be integrated and done at the Fetch layer. For reporting requests in particular we set destination=reporting, so one can detect them as such within Fetch.

This is all very helpful, thank you @igrigorik. Another key difference here is that since messages are sent in POSTs, most blocking tools (especially post-Manifest V3) will lose the ability to reason about these. What about a destination=reporting:csp, destination=reporting:nel, etc. designation, and making sure that this info is available to blocking tools too?

Tying valid endpoints to Origin Policy seems a promising long-term option, but in the meantime, the Reporting API should be limited to eTLD+1 (or similar) endpoints, since some reports (e.g. bodies of CSP violations) can be used to share identifying tokens / track cross-origin.

I don't follow: how do Origin Policy and limiting to eTLD+1 endpoints relate to each other?

Apologies, that's me being a dummy. I didn't mean Origin Policy; I meant something like https://mikewest.github.io/first-party-sets/. No excuses on my end for the error, I don't know what I was thinking :)

yoavweiss commented 5 years ago

This is all very helpful, thank you @igrigorik. Another key difference here is that since messages are sent in POSTs, most blocking tools (especially post-Manifest V3) will lose the ability to reason about these. What about a destination=reporting:csp, destination=reporting:nel, etc. designation, and making sure that this info is available to blocking tools too?

I don't know if Fetch would be thrilled to introduce the concept of "sub-destinations". @annevk - thoughts?

Regarding availability of destination to extensions, see above :)

Apologies, that's me being a dummy. I didn't mean Origin Policy; I meant something like https://mikewest.github.io/first-party-sets/. No excuses on my end for the error, I don't know what I was thinking :)

Can you describe the information leak scenario you're concerned with in a bit more detail?

pes10k commented 5 years ago

Regarding availability of destination to extensions, see above :)

I see in the above that "it's not specified anywhere" currently, which isn't ideal, but at the very least it would be useful to know what the plans are for major implementors, to reason through the privacy implications of the Reporting API vs <img> or similar.

Can you describe the information leak scenario you're concerned with in a bit more detail?

Since the Reporting API would give privacy / blocking tools less info to work with (or at least that seems like a possibility), and potentially open up new types of privacy-sensitive info (e.g. NEL), it would be ideal to balance this loss by restricting / preventing cross-domain communication.

E.g. prevent (or at least slow) the creation of a 3p "track users through network information" service, or the translation of existing tracking services to Reporting API endpoints.

(I don't know if I've answered your question; happy to take another pass at it if not…)

igrigorik commented 5 years ago

I meant something like https://mikewest.github.io/first-party-sets/

Ah, hmm.. worth exploring. Off the top of my head, a few caveats:

clelland commented 5 years ago

That doc seems to have moved; it's at https://github.com/krgovind/first-party-sets now.

It would be difficult for a reporting service provider to use that, though. It doesn't seem to be possible for an origin to be in more than one set, for one. The provider could try to own a single set containing all of its clients, but there is also a limit suggested of 20-30 origins per set, and it would also tie each of the clients to the others as part of the same set.

CNAMEs would allow a provider to get around that (either at the provider or client DNS), but as @igrigorik says, I don't know if we want to encourage (or require) that as a way around this.

dcreager commented 5 years ago

Reporting uploads are subject to CORS checks (comparing the origin of the report and the collector), so the user agent will have to send out a preflight request for 3p report uploads before POSTing the actual report content. Does that give us what we need for this?
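Sketched on the wire, and assuming illustrative hostnames, a cross-origin upload would then involve an exchange along these lines before any report body leaves the browser:

```http
OPTIONS /reports HTTP/1.1
Host: collector.example
Origin: https://site.example
Access-Control-Request-Method: POST
Access-Control-Request-Headers: content-type

HTTP/1.1 204 No Content
Access-Control-Allow-Origin: https://site.example
Access-Control-Allow-Methods: POST
Access-Control-Allow-Headers: content-type

POST /reports HTTP/1.1
Host: collector.example
Origin: https://site.example
Content-Type: application/reports+json
```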

clelland commented 5 years ago

I think that CORS is sufficient for (3), and I hope mitigates (4) -- with CORS indicating cooperation between the site and the endpoint. Tying it to domain name will just force people into CNAME tricks, without improving anything substantially.

For (1), I raised a proposal at TPAC to tie reports and reporting configuration for Crash, CSP, Deprecation, Feature Policy and Intervention reports to document lifetime. Network Error Logging and similar out-of-document reports have different requirements, but the plan is to start a separate document to standardize their behaviour.

I think @igrigorik and @yoavweiss covered (2) and (5)

pes10k commented 5 years ago

Re (1), that sounds great. Tying to document lifetime would be a strong improvement.

Re (3) and (4), I don't think CORS addresses the concern, which is generally about restricting where this information can travel to a set of parties the user can reason about (since all this functionality involves the site riding on the user, to help the site owner achieve the site-owners' goals and responsibilities).

We discussed briefly at TPAC PING the idea that sites would need to explicitly request the user's permission to use reporting API (e.g. "would you like to submit diagnostics info to {sites,urls} X,Y and Z to help improve the site?"). That would of course mitigate this concern.

yoavweiss commented 5 years ago

We discussed briefly at TPAC PING the idea that sites would need to explicitly request the user's permission to use reporting API (e.g. "would you like to submit diagnostics info to {sites,urls} X,Y and Z to help improve the site?"). That would of course mitigate this concern.

Any particular reason why reporting requests are different from other cross-origin requests when it comes to user expectations? Are there implementations that are considering gating cross-origin requests behind a permission?

/cc @jyasskin @bslassey

annevk commented 5 years ago

I would also like to learn more about the threat model, in particular how it pertains to reporting that is scoped to document(s).

pes10k commented 5 years ago

1. Some of the types of information in the Reporting API spec do not have non-Reporting-API analogues (intervention, crash, etc.). Doubly true for the types of information used in examples in the Reporting API (e.g. NEL).
2. Re user expectations: I'm sure most users have no real sense of what's going on behind the scenes, and so no specific "expectations" at all. But they do expect their browser to be their agent, helping them do things they want to do, for their benefit. All (or nearly all, if you're willing to conflate website interests with user interests) of the use cases for the Reporting API do not help the user accomplish the user's first-order goals on the website.

So it's not a "threat model" question; it's whether it's appropriate for the site to treat the user as the site's debugging agent w/o the user's consent, especially when some of that debugging activity may be harmful to user interest / privacy (re: intervention, NEL, etc.)

annevk commented 5 years ago

Thanks, that's helpful.

mconca commented 5 years ago
  • Re user expectations: I'm sure most users have no real sense of what's going on behind the scenes, and so no specific "expectations" at all. But they do expect their browser to be their agent, helping them do things they want to do, for their benefit. All (or nearly all, if you're willing to conflate website interests with user interests) of the use cases for the Reporting API do not help the user accomplish the user's first-order goals on the website.

I like how this is stated. If I look at the deprecation report, that tells the site that it's using a feature that my browser will soon no longer support. That feels very much in the spirit of user agency, and something I'd want sites to know about.

Conversely, the intervention report does not feel like it advocates for anything on the user's behalf. The example given in the spec, reporting that a request to play audio was blocked due to a lack of user activation, feels like my browser just betrayed me to the site. Will we see sites abuse the Reporting API and put up a paywall (or other annoyance) if I don't let audio or video autoplay?

Does a crash report advocate for the user? Maybe? It seems more important for the browser vendor to know about this, but I suppose a site that sees a spike in crash reports after an update (for example) might be prompted to look into it and/or fix things more quickly.

pes10k commented 5 years ago

Re deprecation reports, since we're in a world where there are 3 popular browser runtimes, site authors could just as easily get this information themselves though, no? Or via linting or bots or etc etc etc. Reporting API / additional network behavior seems like an unnecessarily chatty way of fixing that problem.

yoavweiss commented 5 years ago

I think this discussion is conflating/merging a few different questions:

1. Do we think that non-persistent Reporting API requests should be treated differently than other resource requests?
2. Do we think that information regarding interventions, crashes, and deprecations should be gated behind a user permission?
3. (higher-scope question) Do we think that specifications should recommend specific gating mechanisms (e.g. permissions, user activation, PWA install, etc.)? Or is that something we can leave for implementations to decide based on their knowledge of their users, experimentation with different gating mechanisms, user preferences, etc.?

Regarding (1), I think the answer is an unequivocal "no". If (as we plan) we want to use the API for e.g. CSP violation reports that are also available from JS, placing any restrictions on Reporting-API-generated requests will just cause folks to gather that info in JS and use Fetch to send it to whatever report collector they may choose. That's not something we necessarily want to encourage, and above all, it makes no sense to go down that route.

Regarding (2), I think we can discuss whether access to each piece of information (through both Reporting API requests and ReportingObserver) is required to create an improved user experience. I'd argue that at least some of them are not something that application developers can achieve through lab tools.

Regarding (3), I think it would be interesting to have a broader discussion on that point, including PING, TAG, relevant WGs and browser vendors.

pes10k commented 5 years ago
  1. My concerns with the (now reduced) Reporting API have mostly to do with the new types of information involved. Without the new types of info, it's mostly a no-op (once the reports' lifetimes are document-scoped). E.g. if the standard removed all the report types (and discussion of this standard were divorced from other standards that'd like to use it in the future, e.g. CSP, NEL), would it still be useful? And under which scenarios?

  2. Sure, I expect this is where the main disagreement is :). It would be helpful to know which types of values in the Reporting API spec the authors believe require universal, non-opt-in availability to solve their respective problems. In other words, instead of a report-uri-like solution (where every browser reports to a single place), which problems couldn't be solved through bots, automation, or a small number of human debuggers?

  3. I think at least standardizing which APIs sites can expect to be available by default, and which they should not expect to be available, is something that needs to be in the standard. I also think standards need to describe when users need to be explicitly asked vs. when the platform can "infer" consent, to ensure a cross-platform, web-is-private-by-default situation. But I also agree this is a higher-scope issue, so I vote for removing it from this issue and finding a better home for it.

annevk commented 5 years ago

Perhaps "Report Types" is best split from this document after all as these are quite novel reporting use cases that each deserve their own scrutiny.

Of those, Mozilla is interested in crash reporting, but we would perhaps require user consent in the crashed screen similar to how we require that for reporting such information to Mozilla today.

yoavweiss commented 5 years ago
  1. My concerns with the (now reduced) Reporting API have mostly to do with the new types of information involved. Without the new types of info, it's mostly a no-op (once the reports' lifetimes are document-scoped). E.g. if the standard removed all the report types (and discussion of this standard were divorced from other standards that'd like to use it in the future, e.g. CSP, NEL), would it still be useful? And under which scenarios?

The immediate case for the API's usefulness is CSP reporting. Feature Policy reporting also seems highly valuable and will rely on the API (though it is to some extent "new information", if you consider Feature Policy to be new).

2. Sure, I expect this is where the main disagreement is :).

Probably. I think that in order to avoid confusion, it might be best to split those out to a separate proposal that relies on this one. Then we can discuss privacy implications vs. utility for each one of those data types, while making it clear that the Reporting API infrastructure is indeed something that we can move forward with.

@clelland, @igrigorik - thoughts?

It would be helpful to know which types of values in the Reporting API spec the authors believe require universal, non-opt-in availability to solve their respective problems. In other words, instead of a report-uri-like solution (where every browser reports to a single place), which problems couldn't be solved through bots, automation, or a small number of human debuggers?

Again, I think it makes sense to discuss these questions separately for crashes, deprecations and interventions, as I suspect the answer may vary widely between them.

3. I think at least standardizing which APIs sites can expect to be available by default, and which they should not expect to be available, is something that needs to be in the standard. I also think standards need to describe when users need to be explicitly asked vs. when the platform can "infer" consent, to ensure a cross-platform, web-is-private-by-default situation. But I also agree this is a higher-scope issue, so I vote for removing it from this issue and finding a better home for it.

Can I take that as you withdrawing your proposal for gating the Reporting infrastructure behind a permission until that higher level question is further discussed?

@cwilso what would be the best venue for such a discussion that spans the PING, TAG, WG chairs and WG members? We'd like to determine if it makes sense for specifications to include specific mitigations when it comes to security and privacy, and if so, how those mitigations are decided upon.

clelland commented 5 years ago

Yes, splitting those out might make sense, leaving Reporting to just define the infrastructure. What's the right place for deprecation/crash/intervention reports? HTML? Another document in this repo? Somewhere else entirely?

dcreager commented 5 years ago

Yes, splitting those out might make sense

There was some previous discussion about this in #80 and https://github.com/w3c/reporting/pull/60#issuecomment-362869590. At the time, we decided to include everything in a single spec to "keep overhead low", though tbh I've always been more of a fan of the modularity that we'd get from separating them out.

annevk commented 5 years ago

Three separate incubation repositories make the most sense to me at this point, given the levels of interest expressed and there not being much overlap in the topics. I could see crashing ending up being defined by HTML at some point (seeing how it manages agent cluster allocation).

vdjeric commented 5 years ago

Of those, Mozilla is interested in crash reporting, but we would perhaps require user consent in the crashed screen similar to how we require that for reporting such information to Mozilla today.

What would be the privacy rationale for requiring user consent for merely reporting that a tab crashed, and that the type of the crash was an "oom"? Mozilla's browser crash reporting gathers stacks and other low-level user information, which are very privacy-sensitive, very much unlike the 2 bits of Reporting API info. What is the threat model for a site being able to measure changes in crash rates reliably?
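For concreteness, the serialized crash report being debated carries roughly this much (the shape follows the Reporting API delivery format; the URL and age values are illustrative):

```json
[{
  "type": "crash",
  "age": 42,
  "url": "https://site.example/",
  "body": { "reason": "oom" }
}]
```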

pes10k commented 5 years ago

Setting aside the possible privacy harms from learning about hardware capabilities, and that this could be used to circumvent all sorts of FP-surface reduction (e.g. if the UA string gets frozen, stuff like this might be able to pull UA version info back out, etc.)… what frequent, user-directed goal does this information serve that couldn't be gathered in a more privacy-respecting manner? If sites are worried they're going to be causing OOM errors for their users, that seems like something site owners could easily detect with automated testing, asking users to opt in, or just dogfooding.

Mozilla's browser crash reporting gathers stacks and other low-level user information which are very privacy-sensitive

Sure, but Mozilla doesn't send it to arbitrary 3rd parties! There's no parallel between the trust a user puts in a piece of software they download, install, and use every day, and a website they visited where their information went who-knows-where (from the user's perspective).

vdjeric commented 5 years ago

What frequent, user-directed goal does this information serve that couldn't be gathered in more privacy respecting manner?

Detecting memory-use regressions on a site that has an effectively infinite number of user-data and code variations. We are running N simultaneous A/B experiments where N is very large, with newsfeeds and chat messages that are unique to each user. Lab testing cannot catch these regressions. This isn't theoretical; I've been dealing with hard-to-debug Facebook.com memory leaks for a year now. I also wrote the memory-use lab tests; they are not enough. In fact, this is why Mozilla and Google collect crash dumps from the field instead of just relying on lab testing.

And I still don't see how reporting that a tab OOMed is not privacy respecting.

Sure, but Mozilla doesn't send it to arbitrary 3rd parties!

Mozilla collects crash information from 3rd party sites (e.g. banking) and sends it to themselves.

pes10k commented 5 years ago

Detecting memory-use regressions…

If you'd like to recruit users to help you debug theoretical, rare cases (the vast, vast majority of websites don't have to worry about OOM, of course), then surely the polite, privacy-respecting thing to do is to ask them, i.e. permissions.

Mozilla collects crash information from 3rd party sites (e.g. banking) and sends it to themselves.

Again, users trust Mozilla; they don't trust arbitrary websites. Unless something is very different from my understanding, Mozilla isn't sending crash information to third parties (is this incorrect?)

vdjeric commented 5 years ago

I worked on Telemetry at Mozilla and occasionally touched crash reporting, and I can tell you OOMs in sites are common. In fact OOMs were the top reason for crashes for a long time.

vdjeric commented 5 years ago

I would also add that memory leaks are in fact extremely common in JavaScript and this is why major browsers ship some great developer tools for debugging them. None of this is theoretical.

vdjeric commented 5 years ago

Here is Mozilla requesting our help investigating a large amount of content OOMs on Facebook.com that they can't reproduce in the lab:

https://bugzilla.mozilla.org/show_bug.cgi?id=1584266

Literally can't reproduce in the lab: https://bugzilla.mozilla.org/show_bug.cgi?id=1584232#c8

These crashes hurt real users. If we had a means to correlate them to specific code releases or experiments, we would be able to fix them sooner which would benefit both the site and users. The 2 bits of information sent (crash: yes/no, oom: yes/no) have zero privacy tradeoff and help browser, site and user.

pes10k commented 5 years ago

I'll stop now, since it seems like there is rough agreement to split the types of reports from the Reporting API discussion in general, and this might all be moot anyway :)

yoavweiss commented 5 years ago

I think it might be better to take this discussion to the related crash/oom report repo, once one is created. I have many opinions on that front, but I'll save them for later :)

clelland commented 4 years ago

FYI -- the new repos are up now:

  • Crash reporting (spec)
  • Deprecation reporting (spec)
  • Intervention reporting (spec)

igrigorik commented 4 years ago

Awesome work Ian, thanks for splitting these up! On a quick pass..

yoavweiss commented 3 years ago

@clelland - Can we maybe open separate issues for the remaining work from https://github.com/w3c/reporting/issues/169#issuecomment-568517978 (if we haven't already) and close this one?

clelland commented 3 years ago

I'll open new issues in the three related specs about processing, and one here to ensure that the requirement to declare observability is a MUST for any specs integrating.

clelland commented 3 years ago

The other issues raised here can be covered by specific issues. If there are others, let me know, and we can open additional issues for discussion.

pes10k commented 3 years ago

Before this is closed: there are privacy aspects of the Reporting API that are independent of the types of reports. If these have been dealt with elsewhere, I'm happy to have this issue closed out, but I just want to make sure they're not lost:

annevk commented 3 years ago
clelland commented 3 years ago

Re: lifetime: with this spec, for reports generated as part of documents, the intention is that the report lifetime is tied to the document lifetime. Endpoint configuration data is tied strongly to the document; it is not persisted past that, or used for any other documents. The reports themselves should be sent reasonably quickly, and I believe that @annevk suggested that the timing should match that of fetch keepalive at the latest.

Reports can be sent anywhere that follows CORS protocols: same-origin has no issues; same-site or further requires CORS opt-in by the receiving party. This should match the behaviour of subresource requests, or any other way this data could be exfiltrated if reporting were unavailable.

clelland commented 3 years ago

(Thanks, @annevk -- that was way more concise than me :) )

pes10k commented 3 years ago

@annevk Re document lifetime, is this no longer correct, then? Or am I reading it incorrectly? It seems to say reports can be sent from service workers too, and so not document-scoped. Apologies if I'm misunderstanding here.

Re "anywhere", this is still very surprising to me. All the motivations above have been "it'd be useful for facebook to know if users are out of memory on facebook" or "it'd be useful for facebook to know that some feature facebook uses may not be supported in the future", etc. Bracketing whether it's good / bad / whatever to send the information to facebook, it seems even less compelling to say "it'd be useful for facebook to know if users are out of memory, so I'm sending the data to someone who's not facebook".

Anyway, all that is to say: I understand they're currently defined in terms of fetch(), but I'd be grateful to hear more about the motivation for why the Reporting API shouldn't be limited to the same site.

clelland commented 3 years ago

Allowing reports to be sent to third parties enables several things:

neilstuartcraig commented 3 years ago

Just to add to @clelland's point above: we host our reporting endpoint on a third-party domain since it runs on GCP Functions, so "third party" isn't always a clear-cut distinction. We use GCP Functions for several reasons, but one of them is that even if all of our DNS is broken, we can still receive reports (unless Google's DNS is concurrently broken).

annevk commented 3 years ago

@pes10k service worker lifetime is generally bound to the document. In Firefox that is not true for push notifications at the moment, but we are considering changing that. Letting scripts run in the background was a mistake and we don't plan to expand on that.

I don't really see what a same origin/site restriction would buy you here. The information that can be collected should generally be information that a site can already collect and share with whoever using fetch(), so it's mainly a developer ergonomics feature. Information that goes beyond that deserves a lot more scrutiny, but again that seems irrespective of where it is transmitted.

pes10k commented 3 years ago

@pes10k service worker lifetime is generally bound to the document. In Firefox that is not true for push notifications at the moment, but we are considering changing that. Letting scripts run in the background was a mistake and we don't plan to expand on that.

Thanks @annevk, I definitely agree with the above! If the intent is to tie these to the document lifetime, though (which seems good to me, since it allows users to "pull the plug" if they don't trust the site), would y'all be open to just changing things to make them explicitly tied to the document lifetime?

Re same-site vs. everywhere (cc @neilstuartcraig @annevk @clelland): thank you for the details here. I still think there is a valid reason to distinguish where this data can be sent, since it's primarily focused on helping the site and not the user. But I appreciate the points you've made, and of the concerns I've mentioned in this issue and others, it's the one I feel least strongly about, so I'm happy to drop it.

annevk commented 3 years ago

Well, the service worker can do its own reporting, as can other workers. It makes sense for that to be bound to their lifetime, as they can close sooner than the document. And in the case of shared workers, they could outlive a document as long as the documents they are bound to are same-origin. In the case of lifetime concerns, I think it's better to address the root cause; it seems unlikely implementations will get the infrastructure right otherwise. And it's unlikely to help, as those workers could just do their own fetches.