w3c / reporting

Reporting API
https://w3c.github.io/reporting/

Reporting API should be opt-in #168

Open pes10k opened 5 years ago

pes10k commented 5 years ago

The Reporting API is distinct from most core, existing browser functionality in that it principally benefits the site operator. It is asking web users to help the website identify errors and problems in the site owner's application (e.g. to have users serve as debugging and monitoring agents for the site owners).

This is useful to site owners, who can offload monitoring costs to users and be notified of conditions they might not anticipate. It may be useful for web users as a group (the indirect upside of bugs and attacks being identified sooner; the downsides of bearing the burden of monitoring the site on behalf of the site owner, and possible privacy concerns). It is very unlikely to be useful, at the margin, for any single web user.

The Reporting API should therefore be treated as a benefit the client provides to the website; it should require explicit opt-in on the part of the client, globally, and with per-origin exceptions, through a permissions-like system.

arturjanc commented 4 years ago

I'd like to offer a different perspective on this based on our work on deploying security mechanisms such as Content Security Policy.

In practice, it is extremely difficult for web application authors to enable these mechanisms due to significant potential for breaking existing functionality. Without reporting capabilities that allow developers to have some degree of certainty that their application will keep working when a security feature is enabled, rollouts become either very slow (developers need to enable the feature for a small subset of users, wait to see if it results in bug reports, increase rollout percentage, repeat), or -- in the worst case -- they stop being possible for fear of causing major, hard-to-diagnose breakages. We've seen product teams unhappy enough with issues encountered during non-monitored rollouts of CSP to the point of not wanting to continue deploying security features.
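
To make the rollout pattern above concrete, here is a minimal sketch (Node.js, with a hypothetical endpoint URL) of a monitored CSP deployment: a report-only policy sends violation reports without blocking anything, so breakage can be found before the policy is enforced.

```js
// Minimal sketch of a monitored CSP rollout (hypothetical endpoint URL).
const http = require('http');

http.createServer((req, res) => {
  // Name a reporting endpoint for the browser to deliver reports to.
  res.setHeader('Reporting-Endpoints',
    'csp-endpoint="https://reports.example.com/csp"');
  // Point a *report-only* policy at that endpoint: violations are reported
  // but nothing is blocked, so developers can find breakage first and only
  // then switch the header to Content-Security-Policy to enforce.
  res.setHeader('Content-Security-Policy-Report-Only',
    "script-src 'self'; report-to csp-endpoint");
  res.end('<!doctype html><p>hello</p>');
}).listen(8080);
```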

For a similar reason, it also generally isn't sufficient to receive reports from a subset of clients: if there is a group of users for whom the application is breaking, the developer generally needs to know about it; otherwise they can't trust their telemetry.

As a result, in the absence of reporting, many large web applications would likely not enable platform security features, leaving users exposed to XSS, XS-leaks, and other endemic web bugs. This would likely set back not just web security, but also privacy because many of the security features serve a dual purpose of preventing websites from revealing sensitive information about the user (e.g. Cross-Origin Opener Policy prevents leaking window.frames.length which can disclose if a user is logged into another site).
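
As an illustration of that last point, a rough sketch (hypothetical URLs) of the frame-counting leak that COOP closes off:

```js
// attacker.example opens a page on victim.example and counts its frames.
// frames.length is readable cross-origin, and a logged-in page often embeds
// a different number of frames than the logged-out one, so the count can
// reveal whether the user has a session on victim.example.
const w = window.open('https://victim.example/dashboard');
setTimeout(() => {
  console.log('frame count:', w.frames.length);
}, 3000);
// If victim.example sends Cross-Origin-Opener-Policy: same-origin, the popup
// is placed in a separate browsing context group and the opener's handle is
// severed, so the count no longer reflects the victim document.
```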

I feel like having reporting available on an opt-in basis would likely be a net loss for user privacy and security.

samuelweiler commented 3 years ago

@arturjanc Inspired by a discussion re: the WG's rechartering: can you imagine a way to report this through a privacy-preserving telemetry system like Prio, to isolate the data from potential user identification?

arturjanc commented 3 years ago

I'll add a few security-related comments pertinent to the charter review issue here, because this issue seems like a better fit for a technical discussion (the charter review also has some conversations around W3C process, which I'm not well placed to comment on).

  1. In general, the bulk of the information provided by the Reporting API is already available to the document responsible for generating the report. For example, the contents of a CSP, COEP or COOP violation report are something that the document can determine on its own via client-side checks (see the sketch after this list); so from a security point of view, there is little additional data that the site operator receives as a result of the browser sending the report.

    • There are some exceptions to this, such as Network Error Logging, which may report information that developers couldn't otherwise obtain, for example because the document in question doesn't load. Also in some cases there may be values in the report that don't directly map to explicit values available to the document (say, a timestamp denoting when the violation occurred); in these cases, we have to take the approach outlined below.
  2. If any of the reporting features expose more information about the user or about cross-origin data than is available via other means, we should treat this as a bug in the individual spec and figure out the right solution. This could mean removing or reducing the granularity of the data, or, possibly, evaluating its sensitivity and deciding that it doesn't pose a security/privacy risk to the user and that, on balance, it's worth providing this information to site operators. Practically, we have to do this on a case-by-case basis, weighing the sensitivity of the data included in the report and its value to developers.
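
A minimal sketch of the point in (1): a document can already observe its own CSP violations client-side, and the fields largely mirror what ends up in the report body.

```js
// The SecurityPolicyViolationEvent fires in the violating document itself,
// so the page can collect roughly the same data the Reporting API would send.
document.addEventListener('securitypolicyviolation', (e) => {
  console.log({
    blockedURI: e.blockedURI,
    violatedDirective: e.violatedDirective,
    effectiveDirective: e.effectiveDirective,
    documentURI: e.documentURI,
    sourceFile: e.sourceFile,
    lineNumber: e.lineNumber,
  });
});
```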

Re: the specific question of using something like Prio (or equivalent system like RAPPOR), I'm afraid that this isn't a great fit here, for two reasons. First, these systems are meant to report global data to a trusted party (the browser vendor), which is a different model than what the Reporting API operates under: giving each site information about a specific problematic pattern to allow developers to debug it. Second, the value of reports sent via the Reporting API is that it allows tracking down specific issues in the application; this generally doesn't require revealing more information than is already available to the developer, but it does depend on delivering non-aggregated reports with actionable information.

Because of this, I think our focus should be on (2) above, i.e. preventing reports from revealing information that the site couldn't otherwise get, and/or aligning this information with other security/privacy boundaries (e.g. removing some information from reports sent in third-party contexts if that's consistent with browser logic for 3p state). We need to do this anyway, to ensure reporting functionality doesn't result in cross-site information leaks; I'm somewhat doubtful that we can come up with a completely new alternative model here.

pes10k commented 3 years ago

I think there are two topics in the thread:

  1. What should the role of consent be for a feature that (first order) is primarily aimed at helping websites, not the person owning the machine the code / feature is running on? This seems especially to be the case because the given examples either reveal new information to websites that users in many cases have good reason not to give (NEL, intervention, deprecation reports), cover things sites could discover on their own (CSP errors), and / or arise at natural moments to ask for consent (crash reporting).

  2. If there is going to be reporting, what is the proper level of granularity or privacy protection applied to those reports (@samuelweiler's question)?

Before typing even more and wasting folks' time, I want to make sure I have a correct understanding. I'd be very interested in the proposers' thoughts on the following questions:

  1. Is there data / evidence you can share for the claim "…in the absence of reporting, many large web applications would likely not enable platform security features, leaving users exposed to XSS, XS-leaks, and other endemic web bugs."

  2. Am I summarizing correctly that the argument against consent is a combination of:

    • if we asked consent, we don't expect enough people would say yes for the feature to be useful, and
    • much (though not all) of this information is already available through other approaches that don't involve consent, so there is no "marginal consent loss" by not asking consent here
  3. Can the proposers explain in more detail why bots / automated crawling / static checking tools would be insufficient for the intervention and deprecation cases? To my mind, these seem like they could easily be achieved by static analysis, or other approaches that don't involve users' machines.

Thanks!

dcreager commented 3 years ago

seem designed to reveal new information to websites users have good reason to not give in many cases (NEL)

Can you elaborate on this? We've tried to design NEL very carefully so this is not the case, as called out in the spec's privacy section:

To prevent information leakage, NEL reports about a request do not contain any information that is not visible to the server when processing the request. For errors during DNS resolution, a NEL report only contains information available from DNS itself. This prevents servers from abusing NEL to collect more information about their users than they already have access to.

As an example, NEL reports specifically do not contain any information about which DNS resolver was used to resolve a request's domain name into an IP address.

If there's information in a NEL report that isn't already visible to the server while processing a successful request, then that's a bug in the NEL spec that we want to fix.
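
For reference, a minimal sketch (Node.js, hypothetical endpoint URL) of how a server opts into NEL today; only the origin that sets these headers receives reports, and only about requests made to it.

```js
// Minimal NEL opt-in: Report-To names an endpoint group, and the NEL header
// asks the browser to queue network-error reports for this origin into it.
const http = require('http');

http.createServer((req, res) => {
  res.setHeader('Report-To', JSON.stringify({
    group: 'nel',
    max_age: 86400,
    endpoints: [{ url: 'https://reports.example.com/nel' }],
  }));
  res.setHeader('NEL', JSON.stringify({
    report_to: 'nel',
    max_age: 86400,
  }));
  res.end('ok');
}).listen(8080);
```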

pes10k commented 3 years ago

@dcreager Sure! Here are some examples that come to mind where NEL leaks new privacy-relevant information to pages (if any of these reflect a misunderstanding on my part, apologies!):

Case 1:

  1. User does some normal browsing and visits, say, social.com.
  2. User wants to do some privacy stuff, so they enable some kind of privacy protection at the DNS level (or, alternatively, they are somewhere that does something at the DNS level that they don’t want the site to know about). Say the DNS policy blackholes tracker.social.com.
  3. User turns on their VPN, changes to the privacy-preserving DNS resolver, etc. The site's reporting endpoint now starts getting dns.name_not_resolved errors whenever social.com includes “tracker.social.com/pixel.jpg?user_id=X”.
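
A hypothetical report body for that failure, using the field names from the NEL spec (values are illustrative only):

```js
// Hypothetical NEL report for step 3 above.
const exampleReport = {
  type: 'network-error',
  url: 'https://tracker.social.com/pixel.jpg?user_id=X',
  body: {
    referrer: 'https://social.com/',
    phase: 'dns',
    type: 'dns.name_not_resolved',
    sampling_fraction: 1.0,
    elapsed_time: 42,
    status_code: 0,
  },
};
// The concern: social.com's collector now learns that this particular
// client's DNS setup no longer resolves tracker.social.com.
```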

Case 2: There might be a moment where I turn off a blocking extension (or in Brave, shields) to unbreak a website, so I pull in a policy on a resource then. When I later re-enable the extension, I don’t want to report my new blocking behavior for a page that was temporarily unblocked.

Case 3: More broadly, NEL reports basically become referrer headers for all failed requests (i.e. "I tried to fetch your thing from this other page"). Browsers are trying to reduce or remove referrer information from requests in other parts of the platform, so having it mirrored here is not ideal.

Happy to elaborate more, but I'll pause to make sure I'm not operating from some very wrong understanding!

arturjanc commented 3 years ago

Is there data / evidence you can share for the claim "…in the absence of reporting, many large web applications would likely not enable platform security features (...)

There is a fair amount of information to support this in the form of technical posts by large application developers discussing their process of adopting web security mechanisms such as Content Security Policy.

I chose CSP because it's the best-known feature which has reporting, but the same is true for other security mechanisms. See, for example, recent feedback from Facebook security folks on the Cross-Origin Opener Policy reporting API intent to ship:

We've been experimenting with this feature already on facebook.com and instagram.com and the reporting is an incredibly useful feature for us as it allows us to reliably roll out COOP at scale. Since other browsers haven't offered similar functionality yet, it's essentially the only way we can test the impact of COOP enforcement without breaking our sites.

Finally, I'm not sure if a data point from me counts for much, but all of the recent deployments of web security features at Google (specifically, the ones described in this post: CSP, Trusted Types and COOP) rely heavily on reporting data.

Am I summarizing correctly that the argument against consent is a combination of:

  • if we asked consent, we don't expect enough people would say yes for the feature to be useful, and
  • much (though not all) of this information is already available through other approaches that don't involve consent, so there is no "marginal consent loss" by not asking consent here

I have to say I take issue with framing this as an "argument against consent" because this implies that, by default, the use of run-of-the-mill web APIs should be subject to user consent, and there is something odd about shipping web features without requiring it. This doesn't seem to match the model under which the web operates.

Consider features such as the onload event handler, the Fetch API, or background-color in CSS. They all match the criteria you outlined: if we asked users to explicitly consent to their browser's use of these APIs, they would likely not understand what this means for the website they're visiting, and each of these mechanisms can be polyfilled by developers using less convenient features which existed before they were introduced (setTimeout, XMLHttpRequest/script#src, and bgcolor respectively). Would we consider the lack of consent to a website's use of these APIs problematic?
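
To make the polyfill point concrete, a rough sketch of the pre-fetch() equivalent: the same request, and the same information flow, just through a clunkier API.

```js
// Roughly what fetch(url).then(r => r.json()) looked like before fetch().
function getJSON(url) {
  return new Promise((resolve, reject) => {
    const xhr = new XMLHttpRequest();
    xhr.open('GET', url);
    xhr.onload = () => resolve(JSON.parse(xhr.responseText));
    xhr.onerror = () => reject(new Error('network error'));
    xhr.send();
  });
}
```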

Instead, I'd phrase it as an "argument for reporting" which is based on the following:

This is the bar we apply to most other features, so I think it's reasonable to also apply it to the Reporting API. I'll go even further and say that because of the security improvements that reporting enables, we should be open to providing developers more reporting data to help them secure their sites, as long we can do so safely.

Can the proposers explain in more detail why bots / automated crawling / static checking tools would be insufficient for the intervention and deprecation cases? To my mind, these seem like they could easily be achieved by static analysis, or other approaches that don't involve users' machines.

I don't think I can do justice to this question here because there is just an extremely large number of use cases for this kind of telemetry. The simplest way I can put this is that a web application is code written by the developer that runs in an environment controlled by the user, and to understand and debug issues which the user encounters the developer needs to get a glimpse into the real, non-simulated operation of their application as experienced by the user.

This includes information about the network conditions (perhaps some resource loads are failing or slow for a particular segment of users), local browser configuration (users have extensions that can interfere with the operation of a website and trigger security violations), the specific functionality the user was interacting with when the report was triggered (crawling modern web applications reliably is an unsolved problem), transient server failures that you can't test for, etc. These are crucial issues for complex web applications which you can't identify via static analysis or in a staging environment.

Or, maybe put another way, developers wouldn't be so vocal about the importance of browser-based reporting if they could just crawl their sites instead :)

arturjanc commented 3 years ago

One additional aspect that I wanted to (separately) comment on is the following:

What should the role of consent be, for a feature that (first order) is primarily aimed at helping websites, not the person owning the machine the code / feature is running on.

I see this as a bit of a false dichotomy because it assumes that websites gather telemetry for their own gain, which doesn't translate into a benefit for the user. In practice, pretty much all of the telemetry is used by websites in order to realize some clear benefits for the user: enable security features, improve reliability or performance, identify breakages, etc. Both the website and the user share the goal of having the website run reliably and quickly in the user's browser and handle their data securely, and reporting is a means to that end.

Continuing the analogy from above, the user doesn't directly benefit from the Fetch API being available in the web platform -- the user literally does not care :) But Fetch helps developers write applications more easily and to avoid security issues that they'd invariably run into if they built custom workarounds for the lack of this functionality in the platform. I think it's important to keep this in mind, because the Reporting API is just one in a long line of web platform features that work this way.

dcreager commented 3 years ago

Sure! Here are some examples that come to mind

Might be worth moving this part of the discussion to a new issue in w3c/network-error-logging so we don't clutter things, but some short responses here:

Case 1 / Case 2:

These both seem like examples of what we've tried to cover in this part of the (now renamed) Network Reporting spec. Chrome's current behavior follows the suggestion — policies and reports are both cleared whenever the user agent detects that the network configuration has changed. I would be on board with promoting that part of the spec text to a requirement.

A wrinkle, though, is that it depends on the user agent being able to detect the network configuration change. If it happens completely off to the side, or upstream, then it will be harder to mitigate.
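
This is not browser code, but a small sketch of the behavior being described (names are illustrative): a detected network configuration change wipes both the stored policies and any queued reports.

```js
// Illustrative only: "clear policies and reports on network change",
// assuming a store holding NEL policies and queued reports.
function onNetworkConfigurationChange(store) {
  // No further reports are generated under the old configuration...
  store.policies.clear();
  // ...and nothing observed before the change is delivered after it.
  store.pendingReports.length = 0;
}
```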

Case 3: ...I tried to fetch your thing from this other page...

If I'm following right, then you can only see the "from this other page" part in a NEL report's referrer field, just like how the server could only see this via the Referer header. And the NEL referrer field must be filled in according to the user agent's referrer policy. If the user agent tightens that policy to not include referrer information in the requests themselves, those restrictions are meant to automatically propagate to the NEL reports about those requests.

(This is a common confusion about NEL. If a page at foo.com includes resources from bar.com, the foo.com NEL policies do not collect any information about the requests to bar.com, and the bar.com NEL policies don't collect any information about the enclosing foo.com context — apart from the referrer field — in which those requests to bar.com were made. Just like how the foo.com server sees no evidence of the bar.com requests being made, and the bar.com server sees no information — apart from the Referer header — about how the bar.com requests were made because of some content received from foo.com.)

dcreager commented 3 years ago

policies and reports are both cleared whenever the user agent detects that the network configuration has changed

In particular, for Case 1, where the user turns on a VPN and starts using a DNS resolver that blackholes tracker.social.com:

pes10k commented 3 years ago

Thanks @dcreager this is very helpful!

Might be worth moving this part of the discussion to a new issue in w3c/network-error-logging

Happy to do so if you'd prefer :)

I would be on board with promoting that part of the spec text to a requirement.

That sounds terrific!

Depends on the user agent being able to detect the network configuration change

We may be agreeing, but I think this is a difficult, significant constraint. There are all sorts of things that are not quite full network config changes (the kind that would trigger an OS- or browser-level signal, like switching network adapters or hotspots) but that still increase or decrease network-level privacy defenses. Extensions are just one of these. Depending on how the browser is checking, adding or removing entries in /etc/hosts is another. Changing settings on privacy-preserving middleware (Pi-hole) is a third. There are many such cases.

This is a common confusion about NEL

I am very happy to accept responsibility for the confusion 😅

But I think I didn't quite make the claim well. Let me express the concern more specifically, and if I'm still off in the wrong direction, I'd appreciate the clarification!

  1. I pick up a NEL policy from Facebook
  2. I visit chicagotribune.com which includes a facebook tracking pixel that identifies chicagotribune.com
  3. Using whatever approach, I block the 3p request to facebook, issued on chicagotribune.com
  4. facebook.com now knows that I was visiting chicagotribune.com from the NEL report

In the absence of NEL, Facebook wouldn't have known I visited chicagotribune.com.
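
Spelled out as a hypothetical report body (field names follow the NEL spec; the URL and values are made up, and whether a report is generated at all depends on how the request was blocked, which is discussed below):

```js
// Hypothetical report delivered to facebook.com's NEL endpoint if the blocked
// request surfaces to NEL as an ordinary network failure.
const hypotheticalReport = {
  type: 'network-error',
  url: 'https://www.facebook.com/tr/pixel?site=chicagotribune.com',
  body: {
    referrer: 'https://www.chicagotribune.com/', // subject to referrer policy
    phase: 'connection',
    type: 'tcp.refused', // or however the block manifests, if at all
    status_code: 0,
  },
};
```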

Is this incorrect? I tried to re-read the updated / renamed version you pointed to, but maybe it's still under development; I didn't see the list of NEL / Network Reporting types I think I remember. Apologies if I'm missing them, or looking at the wrong version of the proposal.

clelland commented 3 years ago

That document is certainly still under development -- the idea is that it is an extension to the reporting spec that allows things like NEL to be built on top of it. It contains all of the things that we pulled out of the base Reporting API, like endpoint-groups with failover, out-of-band configuration, and the cache of reports which can outlive individual documents.

NEL is still at https://w3c.github.io/network-error-logging/ and hasn't been updated to use that as underlying infrastructure yet. It's on my list of tasks to take care of in the new year.

pes10k commented 3 years ago

@clelland Okie dokie, thanks for the update! It looks like NEL hasn't changed since I reviewed it last, so I think my example above would still apply, though if I am wrong, I'd be grateful for the correction.

clelland commented 3 years ago

@pes10k So yes, in that case, with the spec as written, it might be possible for Facebook to learn something, but it depends on other factors, I think.

The scenario you're presenting reads to me like this:

I'm sure that even without NEL, the conflict between the first two points sets up an arms race, but we should do what we can to make sure that NEL doesn't tip the scales there.

You're suggesting that if the browser sees the intention to load the tracking resource, but it doesn't happen, then NEL would cause the browser to report to facebook that something was wrong; that in that case, Facebook would be using NEL to circumvent the ad blocker/anti-tracking tech. That's not necessarily true, especially once we start talking about ad blockers, and other tech that exists outside of the realm of web platform standards.

There's a lot resting on the "whatever approach" in your step 3 — I think that the results are very different depending on how exactly the tracker is blocked; without talking about that, it's hard to say what the results could be, or what additional protections might be needed. I can imagine at least these different scenarios:

Obviously those are approaching the ridiculous by the end, but there are clearly a large number of points at which the requests can be blocked. In some cases, requests are legitimately failing due to infrastructure issues, and ought to be reported. In other cases, the browser can know that the request was never intended to be sent, and should have no reason to report a network error when it isn't. I would expect that any requests which the browser can determine are unwanted would not ever be seen by NEL, and that anything further out into infrastructure may be indistinguishable from network damage, from the point of view of either the browser or Facebook.

I don't see any reason why an extension which was working with the browser to block unwanted content wouldn't be able to make it so that the requests just "didn't happen" as far as NEL is concerned. (All of this may be out of scope for the spec, though, similar to #223)

Outside of extensions, a more general approach is to treat the NEL configuration and reports the same way that we do other third-party subresources, and isolate them appropriately, which could make the entire scenario much less likely. Chrome intends to use the network partition key from Fetch to isolate NEL configurations from each other, so that the NEL policy picked up in step 1 wouldn't apply to the request in step 3. (A different NEL policy could apply, but if the requests are always blocked in that scenario, then one would never be installed.) Also, I expect that blocking or similarly isolating third-party cookies also makes the tracker less effective, as no useful credentials would be sent in the NEL report in any case. (The cookie policy for the NEL report should be the same as for the resource itself, I believe.) I don't know what the standardization track is for those efforts, but if they're likely to be effective, we should at least mention them somewhere.

Seirdy commented 1 year ago

I see this as a bit of a false dichotomy because it assumes that websites gather telemetry for their own gain, which doesn't translate into a benefit for the user. In practice, pretty much all of the telemetry is used by websites in order to realize some clear benefits for the user

@arturjanc Whether or not increasing a user's fingerprint (potentially crossing the uniquely-identifiable threshold) is "worth it" is something for the user to decide, not a webmaster. Studies need the consent of all subjects involved, even if researchers believe that it's in the subjects' best interests. Users can give informed consent after being informed of the scope of telemetry, how it will be used, and how it will be shared.

A user (like me) who visits a website one time probably doesn't care if the website "improves their experience" if they don't intend to re-visit it. They probably wouldn't consider "collect and share information about your setup, in exchange for a better site in the future" a fair trade. From the perspective of a one-time user, the Reporting API serves only to fingerprint.

POSSE note from https://seirdy.one/notes/2022/09/04/reporting-api-and-informed-consent/

yoavweiss commented 1 year ago

@Seirdy - I think it's worthwhile distinguishing here between the information that is exposed to the web site (through the Reporting API, or through other means) and the delivery mechanism for this information.

I would claim that how a certain piece of information reaches a website doesn't change the way in which this information can be used and abused. So if we want to limit information about the user's setup, that's great, but we should do that at the exposure point. Adding extra friction to the delivery mechanism (that is, the Reporting API in this case) will do nothing but cause sites to choose other delivery mechanisms (e.g. get that info from a JS API and upload it with fetch()).
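
A sketch of the alternative delivery path being described (hypothetical collector URL): the same violation data, gathered in page script and shipped with fetch() rather than through the Reporting API.

```js
// If reporting required consent, a site could collect the same CSP violation
// data itself and upload it with fetch().
document.addEventListener('securitypolicyviolation', (e) => {
  fetch('https://collector.example.com/reports', {
    method: 'POST',
    keepalive: true, // let the request outlive the page, like queued reports
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      blockedURI: e.blockedURI,
      effectiveDirective: e.effectiveDirective,
      documentURI: e.documentURI,
    }),
  });
});
```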