w3c / beacon

Beacon
https://w3c.github.io/beacon/

Privacy concern caused by unspecified delay of requests #31

Closed Rob--W closed 8 years ago

Rob--W commented 8 years ago

From https://w3c.github.io/beacon/#sec-sendBeacon-method

The user agent may delay transmission of provided data to optimize network and energy efficiency - e.g. deliver immediately if the network is active, or wait until network interface is active. However, the user agent should not delay transmission indefinitely and ensure that pending transmissions are periodically flushed even if there is no other network activity.

From https://w3c.github.io/beacon/#h-privacy

Similarly, from the privacy perspective, the resulting requests are initiated immediately when the API is called, or upon a page visibility change, which restricts the exposed information (e.g. user's IP address) to existing lifecycle events accessible to the developers. However, user agents might consider alternative methods to surface such requests to provide transparency to users.

The first quote allows user agents to delay requests (as long as the request eventually arrives, "should not delay transmission indefinitely"). This is at odds with the claim in the privacy section, which states that the requests are immediately triggered and that therefore the window of activity is too narrow for information leakage to occur.

As currently written, the following scenario is possible:

  1. Mr. Tim Foilhat takes measures to avoid leakage of IP address.
  2. Mr. Tim Foilhat visits privacy-is-for-wimps.example.com
  3. Mr. Tim Foilhat disables the network interface after loading the page.
  4. Mr. Tim Foilhat gets the information that he needs from the page.
  5. Mr. Tim Foilhat closes the page (which calls sendBeacon) and ensures that there are no other active script contexts (e.g. Service workers).
  6. Mr. Tim Foilhat travels to a secret location.
  7. Mr. Tim Foilhat enables networking again.
  8. The beacon is sent to the server, which now knows the IP address of Mr. Tim Foilhat together with the beacon data.

This scenario is specifically relevant to sendBeacon because it is currently the only API that can send arbitrary data to a server even after the user has every reason to believe that they have completely left the website.

This class of problems can be solved by exploring possible information leaks and specifying that the pending request queue should be flushed when these scenarios occur. For instance, the above scenario can be resolved by emptying the queue of pending requests when the network interface changes and/or rejecting the sendBeacon call when the network interface is down.
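
The two mitigations can be sketched as a toy model of a user agent's pending-beacon queue. This is only an illustration: the `BeaconQueue` class, its methods, and the interface labels are hypothetical and come from neither the spec nor any browser, and "emptying the queue" is read here as discarding the pending requests.

```javascript
// Toy model of a user agent's pending-beacon queue (hypothetical, not a browser API).
class BeaconQueue {
  constructor(networkInterface) {
    this.networkInterface = networkInterface; // null means the network is down
    this.pending = []; // beacons accepted but not yet transmitted
    this.sent = [];    // beacons that actually left the machine
  }

  // Proposal, part 2: reject the call outright when the network is down.
  sendBeacon(url, data) {
    if (this.networkInterface === null) return false;
    this.pending.push({ url, data, queuedOn: this.networkInterface });
    return true;
  }

  // Proposal, part 1: discard anything still pending when the interface
  // changes, so a queued beacon can never leave from a different network.
  setNetworkInterface(next) {
    if (next !== this.networkInterface) this.pending = [];
    this.networkInterface = next;
  }

  // Transmit whatever is still queued over the current interface.
  flush() {
    if (this.networkInterface === null) return;
    for (const beacon of this.pending) {
      this.sent.push({ ...beacon, sentOn: this.networkInterface });
    }
    this.pending = [];
  }
}

// Replaying the scenario: the beacon is queued at home, the network goes
// down, and the user comes back online somewhere else.
const ua = new BeaconQueue('home');
ua.sendBeacon('https://privacy-is-for-wimps.example.com/', 'payload');
ua.setNetworkInterface(null);
ua.setNetworkInterface('secret-location');
ua.flush();
console.log(ua.sent.length); // 0
```

Under these assumptions nothing queued on one interface is ever transmitted on another, which is the property that step 8 of the scenario violates.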

igrigorik commented 8 years ago

The user agent must schedule immediate transmission of all beacon requests when the document visibilityState ([PAGE-VISIBILITY]) transitions to hidden, and must allow all such requests to run to completion without blocking other time-critical and high-priority work. ... Similarly, from the privacy perspective, the resulting requests are initiated immediately when the API is called, or upon a page visibility change, which restricts the exposed information (e.g. user's IP address) to existing lifecycle events accessible to the developers.

In your scenario, closing the page (step 5) triggers the page visibility transition, which (a) flushes any delayed requests, (b) all of which fail due to lack of network connectivity. So, steps 6-8 do not happen.
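
As a rough illustration of that reasoning, here is a toy model rather than a description of any real implementation. The `PageModel` class and its fields are hypothetical, and it assumes that a request which fails during the flush is never retried.

```javascript
// Toy model of the flush-on-hidden behaviour (hypothetical names throughout).
class PageModel {
  constructor() {
    this.online = true;
    this.pending = [];   // beacons the user agent has chosen to delay
    this.delivered = [];
    this.failed = [];
  }

  sendBeacon(url) {
    this.pending.push(url);
    return true;
  }

  // Transition to hidden: every delayed beacon is attempted right away.
  // Assumption: a beacon that fails here is dropped, not retried later.
  hide() {
    for (const url of this.pending) {
      (this.online ? this.delivered : this.failed).push(url);
    }
    this.pending = [];
  }
}

// Steps 3 to 5 of the scenario: the network is disabled, then the page closes.
const page = new PageModel();
page.sendBeacon('https://privacy-is-for-wimps.example.com/beacon');
page.online = false;
page.hide();
console.log(page.pending.length, page.delivered.length); // 0 0
```

With nothing left pending after the page is hidden, re-enabling the network in step 7 has nothing to send.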

This scenario is specifically relevant to sendBeacon because it is currently the only API that can send arbitrary data to a server even after the user has every reason to believe that they have completely left the website.

That's not true. sendBeacon does not queue any requests after the user has left the website -- the visibility transition explicitly handles this case.

For instance, the above scenario can be resolved by emptying the queue of pending requests when the network interface changes and/or rejecting the sendBeacon call when the network interface is down.

The latter case is what you'll get today already; sendBeacon does not provide any offline capabilities. For the former, note that the interface may change while you're on the webpage so any pair of requests can reveal this change. Further, because sendBeacon is limited to within the lifetime of the page, it doesn't expose anything new.

Rob--W commented 8 years ago

In your scenario, closing the page (step 5) triggers the page visibility transition,

Except that does not always happen in Chrome. It seems to happen in Firefox though, so perhaps this is an implementation flaw? Here is a test case that shows that the visibilitychange event is not fired when a tab is closed (Chrome 50.0.2661.94):

  1. Visit chrome://net-internals#events
  2. Visit data:text/html,<script>document.addEventListener('visibilitychange',function(e){navigator.sendBeacon('https://example.com/'+document.visibilityState);});</script>
  3. Close the tab from 2.
  4. Look at the chrome://net-internals output and observe that there is no log entry for the beacon request (here I'm just using beacon requests to make sure that the result appears in the log).
  5. If you unload the page by navigating to a different page, then a request does appear in the log, which indicates that the page visibility did change.

And does the page visibility change get recorded when a tab crashes? I.e. if I add a beforeunload listener and trigger a crash of the tab (the renderer), would sendBeacon requests be flushed (when a tab is closed)?

igrigorik commented 8 years ago

Yes, this is the PV bug for Chrome: https://bugs.chromium.org/p/chromium/issues/detail?id=554834.

In the meantime, note that as of today neither Chrome nor Firefox delays delivery of sendBeacon requests, so the above is not an issue; the scenario you've described is not possible with the current sendBeacon implementations in Firefox and Chrome.

Rob--W commented 8 years ago

@igrigorik I was running Chrome with the --enable-experimental-web-platform-features flag while running the above test, so that crbug is not applicable. Steps 1-2 + 5 show that the visibilitychange event is dispatched when I unload the page by navigating to a different page (as opposed to closing the tab). If I just use an unload event instead of visibilitychange, then the beacon arrives, so this is most likely a bug in the implementation of page visibility.
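
Until that is fixed, a page could listen for both events and send on whichever fires first. The following is a minimal sketch only: `installExitBeacon` and its parameters are hypothetical names, and `doc`, `win`, and `nav` are passed in so that the logic can also be exercised outside a browser.

```javascript
// Sends at most one beacon, on whichever fires first: a transition to
// hidden, or unload. `doc`, `win`, and `nav` stand in for document,
// window, and navigator.
function installExitBeacon(doc, win, nav, url) {
  let sent = false;
  const send = (reason) => {
    if (sent) return;
    // sendBeacon returns false if the data could not be queued; in that
    // case the next event gets another chance.
    sent = nav.sendBeacon(url + reason);
  };
  doc.addEventListener('visibilitychange', () => {
    if (doc.visibilityState === 'hidden') send('hidden');
  });
  win.addEventListener('unload', () => send('unload'));
}

// In a page:
//   installExitBeacon(document, window, navigator, 'https://example.com/');
```

The `sent` flag keeps a tab that fires both events from reporting twice.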

Closing the bug since the spec is fine. If you're able to confirm the implementation bug that I reported, please post it to crbug.com.

igrigorik commented 8 years ago

@Rob--W hmm, it may have something to do with data: URI... I do see the request when testing from a dedicated page. E.g. try this: http://output.jsbin.com/zubiyid/latest/quiet?beaconUrl=http://example.com/visibility

Rob--W commented 8 years ago

@igrigorik I can reproduce the problem even without data URLs. If closing the tab causes the renderer to go away, then the visibility event is not triggered.

  1. Visit https://example.com
  2. Open the console and paste:

    document.addEventListener('visibilitychange',function(e){navigator.sendBeacon('https://example.com/'+document.visibilityState);});
    document.body.innerHTML = '<a href="/" target=_blank>Same site link</a>';

  3. Open chrome://net-internals/#events in a new window (new window so that visibilitychange is not triggered unnecessarily).
  4. Switch back to the window containing example.com
  5. Click on the link (it opens a new tab in the same process).
  6. Go back to the original example.com tab.
  7. Now go to chrome://net-internals again and clear the events (top-right triangle, Reset) to remove the noise.
  8. Close the original example.com tab.
  9. Observe that the beacon event appears in chrome://net-internals.
  10. Repeat all these steps, but this time without clicking on the link (so that closing the tab causes the renderer process to terminate). You will then see the same behavior as with the data: URL that I reported before.

igrigorik commented 8 years ago

Thanks for tracking this down. Left a note on https://bugs.chromium.org/p/chromium/issues/detail?id=554834#c5 -- let's continue there.

kinu commented 8 years ago

@Rob--W Hi Rob, I was trying to reproduce the issue using the steps summarized above, but for me the 'https://example.com/hidden' beacon keeps appearing regardless of whether or not I click and open the same-site link. (I've tested it on Linux and Mac OSX with ToT and Chrome Canary 53.) Might I be missing some steps, or are you still seeing the issue with ToT/Canary? Thanks!

Rob--W commented 8 years ago

@kinu I can reproduce on ArchLinux (Chromium 51.0.2704.79 and 53.0.2771.0), provided that I close the devtools after running the snippet (i.e. after step 2). It seems that the presence of the devtools delays teardown for long enough that the beacon gets through.