Server Timing can be used as a persistent 3rd party identifier

JibberJim commented 5 years ago

The content of the Server-Timing header is available to any JavaScript on the page, by design of course. However, because the resource can be cached by the browser, the header can be used as a shared cross-site 3rd party identifier for fingerprinting a user across sites. (Like a 3rd party cookie, only without any of the controls/ad blockers that exist for these.)
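To make the mechanism concrete, here is a minimal sketch of how a first-party script might pull such an identifier out of resource timing entries. It assumes the third party stores the ID in the description of a metric hypothetically named `uid`. The `ResourceLike` shape only mirrors the `name` and `serverTiming` fields of `PerformanceResourceTiming`, and the helper name is illustrative.

```typescript
// Minimal shapes mirroring PerformanceServerTiming / PerformanceResourceTiming.
interface ServerTimingLike {
  name: string;
  description: string;
  duration: number;
}

interface ResourceLike {
  name: string; // resource URL
  serverTiming: readonly ServerTimingLike[];
}

// Hypothetical helper: return the description of the metric `metricName`
// from the first entry whose URL starts with `resourcePrefix`.
function readCachedIdentifier(
  entries: readonly ResourceLike[],
  resourcePrefix: string,
  metricName: string = "uid", // assumed metric name, not defined by any spec
): string | undefined {
  for (const entry of entries) {
    if (!entry.name.startsWith(resourcePrefix)) continue;
    const metric = entry.serverTiming.find((m) => m.name === metricName);
    if (metric !== undefined) return metric.description;
  }
  return undefined;
}
```

In a browser, the entries would come from `performance.getEntriesByType("resource")`. If the resource is answered from the HTTP cache with its original headers, every site embedding it would read the same value.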

Demonstration of problem

As for resolving this, I'm not sure what the use cases of the feature are on 3rd party resources; however, I cannot personally see any use case for caching Server-Timing. It should always be removed when the resource is served from a cache.

I also think Timing-Allow-Origin: * is unnecessary. Given that the header always needs to be generated to be useful, the server can echo the Origin header of the original request; this would tie any specific timing response to the individual site that made the request.
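A rough sketch of the echoing idea, under assumptions: the allowlist, the helper name, and the added `Vary: Origin` header (so that shared caches do not reuse one site's response for another) are illustrative choices, not something the specification mandates.

```typescript
// Hypothetical helper: compute timing-related response headers for one request.
// Echoes the request's Origin instead of answering with "*".
function timingHeadersFor(
  requestOrigin: string | undefined,
  allowedOrigins: ReadonlySet<string>, // assumed server-side allowlist
): Record<string, string> {
  // No Origin header, or an origin we do not recognise: expose nothing.
  if (requestOrigin === undefined || !allowedOrigins.has(requestOrigin)) {
    return {};
  }
  return {
    "Timing-Allow-Origin": requestOrigin,
    // The response now differs per requesting origin.
    "Vary": "Origin",
  };
}
```

With this shape, a response generated for https://example.com would not expose its timing data to a script running on a different site.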

bripkens commented 5 years ago

As for resolving this, I'm not sure what the use cases of the feature are on 3rd party resources; however, I cannot personally see any use case for caching Server-Timing

We (Instana) and our customers have a use case for this: Being able to correlate end-user experience / problems to the actual server-side execution that generated a response. Note that this doesn't necessarily have to involve a third-party. Businesses commonly share resources/assets and expose APIs via separate origins, e.g. https://example.com hosts the website and https://api.example.com the API.

To elaborate: This is not user fingerprinting. We generate an ID for the server-side activity and place this ID into the Server-Timing header. Each server-side activity has a unique ID.
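As an illustration only (this is not Instana's actual implementation), placing a per-activity ID into the header could look roughly like the sketch below. The metric name `trace` and the function name are assumptions; the `name;desc="..."` form follows the Server-Timing header syntax.

```typescript
// Hypothetical helper: build a Server-Timing header value carrying the ID
// of the server-side activity that produced this response.
function serverTimingForActivity(
  activityId: string,
  metricName: string = "trace", // assumed metric name
): string {
  // desc is emitted as a quoted-string, so escape backslashes and quotes.
  const escaped = activityId.replace(/\\/g, "\\\\").replace(/"/g, '\\"');
  return `${metricName};desc="${escaped}"`;
}
```

Because the ID identifies one server-side execution rather than a user, two requests from the same browser would normally carry different values, unless a cached response is replayed.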

yoavweiss commented 5 years ago

The exact same attack can be done by e.g. caching a JS file with a uid global variable in its contents. The solution for that is double-key caching.
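The idea behind double-key caching can be sketched as a cache whose key includes the top-level site as well as the resource URL. This is a toy model with hypothetical names, not how any particular browser implements its HTTP cache.

```typescript
// Toy model of a double-keyed (partitioned) cache.
class PartitionedCache<V> {
  private readonly store = new Map<string, V>();

  // The key combines the top-level site with the resource URL.
  private static key(topLevelSite: string, resourceUrl: string): string {
    return JSON.stringify([topLevelSite, resourceUrl]);
  }

  set(topLevelSite: string, resourceUrl: string, value: V): void {
    this.store.set(PartitionedCache.key(topLevelSite, resourceUrl), value);
  }

  get(topLevelSite: string, resourceUrl: string): V | undefined {
    return this.store.get(PartitionedCache.key(topLevelSite, resourceUrl));
  }
}
```

A response cached while visiting one site is a miss when the same URL is requested from another site, so an identifier baked into the cached response (in a header or in the body) is no longer shared across sites.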

Closing, as Server-Timing does not increase the attack surface here.

JibberJim commented 5 years ago

It increases the attack surface because the third party identifiers are now carried in a place that does not require JavaScript to be executed on the 3rd party domain. Only first party scripts need to be executed, while the tracking is carried on other non-script resources, so this attack would circumvent the protection that uBlock's 3rd party script blocker gives against your JS file approach.

In any case, the privacy and security section needs updating to note the risk.

JibberJim commented 5 years ago

We (Instana) and our customers have a use case for this: Being able to correlate end-user experience / problems to the actual server-side execution that generated a response.

Ah, yes, the resource could be cached while you're still interested in the failure; that does make sense as a use case. Wouldn't that be mitigated, though, by requiring Timing-Allow-Origin to carry the Origin header from the original request? Only the requesting domain could make the request, and your service could add the allow-origin as easily as the identifier. Or by requiring cache partitioning, of course.

bripkens commented 5 years ago

Wouldn't that be mitigated, though, by requiring Timing-Allow-Origin to carry the Origin header from the original request? Only the requesting domain could make the request, and your service could add the allow-origin as easily as the identifier. Or by requiring cache partitioning, of course.

This mechanism already works only when our customers set the Timing-Allow-Origin header for cross-origin requests. So whether Timing-Allow-Origin: * or Timing-Allow-Origin: https://origin.example.com is set doesn't make a difference to us (and we cannot influence what our customers do here).

yoavweiss commented 5 years ago

I'm not familiar with uBlock's 3P script blocker, but the same attack can be done using nothing but CSS: A 3P CSS is delivered, providing the UID as the dimensions of a certain class, and can be read by the first party when retrieved from cache.
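A hedged sketch of the CSS variant: the third party encodes a numeric UID into the dimensions of a class, and the first party decodes it from the measured size. The class name `uid-probe`, the six-digit UID range, and the width/height split are all assumptions made for illustration.

```typescript
// Hypothetical encoding: UID in [0, 999999] split across width and height.
function cssRuleForUid(uid: number): string {
  if (!Number.isInteger(uid) || uid < 0 || uid > 999_999) {
    throw new RangeError("uid must be an integer in [0, 999999]");
  }
  const width = Math.floor(uid / 1000); // upper three digits
  const height = uid % 1000; // lower three digits
  return `.uid-probe{width:${width}px;height:${height}px}`;
}

// First-party side: recover the UID from the measured dimensions.
function uidFromDimensions(widthPx: number, heightPx: number): number {
  return widthPx * 1000 + heightPx;
}
```

In a page, the dimensions would be measured on an element carrying the class, for example via `getComputedStyle`; no third-party script has to run.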

Also, I assumed content blockers would block the request from ever going out, which would prevent reading its Server-Timing entries.

yoavweiss commented 5 years ago

/cc @mikewest @arturjanc

mikewest commented 5 years ago

It does seem to me that double-keying the cache is a reasonable mitigation for this, in the same way that it mitigates the risks posed by etag and last-modified. There's a marginal increase in visibility, insofar as the server timing API exposes the data directly, but is it different in kind than what's exposed via headers legible via CORS (which the 3rd-party would be reasonably expected to opt-into)?

yoavweiss commented 5 years ago

Thanks @mikewest for confirming! Closing this issue as this should be resolved elsewhere

w3c / server-timing

Server Timing can be used as a persistent 3rd party identifier #67