Stats API should require additional permission / user opt-in

pes10k commented 4 years ago

The stats collected by this API enable two new privacy harms / risks. This spec should enable the main uses of WebRTC, without automatically exposing these additional risks.

a) Leaking communication / plain text

Prior work (e.g. http://www.cs.unc.edu/~fabian/papers/foniks-oak11.pdf) has shown that you can recreate the plain text content of an encrypted, DTLS encoded audio conversation, based on patterns in packet size, frequency, etc. The fine level network information exposed by this API seems to be sufficient to re-carry out this attack. If this information is needed for analysis, quality control, or similar uses, the API should limit it to those special cases (behind an additional permission, for example).

b) Hardware fingerprinting

decoderImplementation, the codec data point, etc. reveal information about the underlying hardware beyond what's identified by getUserMedia.

youennf commented 4 years ago

a) Leaking communication / plain text

This seems like useful consideration for isolated streams. For regular streams, the web page has access to the audio/text content so this should be fine.

b) Hardware fingerprinting

Depending on the actual implementation by the browser, this may or may not be an issue. Agreed guidelines would be useful here.

pes10k commented 4 years ago

Hi @youennf. Thank you for the reply. Again though, these privacy leaks need to be addressed in the functionality of the standard; it's not sufficient to list them in the concerns section.

You all are the experts on this functionality; if you can't figure out how to design and implement it in a privacy preserving way, no one can ;) (and just as seriously, punting to "implementors will fix" means that either there will be divergent implementations, or, for web compat reasons, everything will get pulled to the least private, most permissive implementation).

fippo commented 4 years ago

"The fine level network information exposed by this API seems to be sufficient to re-carry out this attack."

I think that is a bit too general. Let's ignore for a moment that CBR is the answer to this particular attack. Let's also ignore that you have the audio stream.

The key of foniks is this: "The size of the encrypted packet therefore reflects properties of the input signal." getStats provides packetsSent and bytesSent. With Opus we're talking about a typical frame size of 20 ms, or 50 packets per second. To carry out foniks you would need to call getStats with a resolution higher than that.

Let's actually try this. Go to one of the samples and paste the following:
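
As a rough illustration of that arithmetic (the helper name below is made up for this sketch and is not part of any API), the number of packets that land between two consecutive polls is simply the poll interval divided by the frame duration. An attacker would presumably need that value to be at most 1 to see individual packet sizes:

```javascript
// Illustrative helper, not part of any WebRTC API: how many packets are
// sent, on average, between two consecutive getStats() polls.
function packetsPerPoll(frameDurationMs, pollIntervalMs) {
  return pollIntervalMs / frameDurationMs;
}

console.log(packetsPerPoll(20, 10)); // 0.5 -> polling faster than the packet rate
console.log(packetsPerPoll(20, 50)); // 2.5 -> several packets merged per sample
```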

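// Poll the first sender's stats every 10 ms for 2 seconds and record the
// cumulative packet and byte counters from each outbound-rtp report.
// Assumes pc1 is the RTCPeerConnection defined by the sample page.
const bytes = [];
const iv = setInterval(async () => {
  const sender = pc1.getSenders()[0];
  const stats = await sender.getStats();
  stats.forEach(s => {
    if (s.type === 'outbound-rtp') {
      bytes.push([s.packetsSent, s.bytesSent]);
    }
  });
}, 10);
setTimeout(() => clearInterval(iv), 2000);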

If you do a bytes.map(x => x[0]) you can see that in Chrome you don't even have enough granularity to capture a single packet. In Firefox you do. @henbos can probably comment on getStats caching in Chrome.

I didn't see any discussion in the foniks paper about the frame size/duration but I assume that accuracy (recall/precision) drops if you increase the frame size. The mitigation here might be to limit the resolution of getStats.
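
To sketch what "limiting the resolution" could look like, here is a minimal caching wrapper. It is only an assumption about one possible shape of the mitigation, not how any browser implements it. withMinInterval and its parameters are hypothetical names, and the injectable clock exists only to make the behaviour easy to check:

```javascript
// Hypothetical sketch: return a cached result unless at least
// minIntervalMs has passed since the last fresh call.
function withMinInterval(getStats, minIntervalMs, now = () => Date.now()) {
  let cached;
  let cachedAt = -Infinity;
  return () => {
    const t = now();
    if (t - cachedAt >= minIntervalMs) {
      cached = getStats(); // may be a Promise; it is cached as-is
      cachedAt = t;
    }
    return cached;
  };
}

// Demo with a fake clock and a fake stats source.
let demoClock = 0;
let demoCalls = 0;
const demoGetStats = withMinInterval(() => ++demoCalls, 50, () => demoClock);
demoGetStats();          // fresh
demoClock = 10;
demoGetStats();          // served from cache
demoClock = 60;
demoGetStats();          // fresh again
console.log(demoCalls);  // 2
```

With a wrapper like this, polling every 10 ms would still only observe a new snapshot once per caching interval.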

Note that this concern probably also applies to getSynchronizationSources which exposes the RTP timestamp and the audioLevel (typically from the ssrc-audio-level extension) and is explicitly designed for high-frequency polling.

henbos commented 4 years ago

1)

What granularity is needed to be able to tell anything more useful than "there is or isn't audio being produced right now"? getStats() gives you aggregate counters, so the best you can do is to say that in an interval between two getStats() calls your average packet size was X bytes and the average audio energy was Y. In Chrome, the minimum interval you could achieve is 50 ms due to caching. Would mandating a caching time mitigate the problem?
Unless we're talking about isolated streams, the RTCPeerConnection can only process tracks you already have access to. You can use WebAudio, read the pixels of a canvas, or use other APIs, not to mention that you're sending the tracks somewhere, so the other endpoint can do whatever it wants (including communicating back to the JS to tell it the result of its analysis). So there's already, directly or indirectly, access on a byte level.
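
To illustrate the "aggregate counters" point, this is roughly all that two snapshots can yield. The sketch assumes two plain objects carrying the packetsSent and bytesSent counters from an outbound-rtp report; the function name is made up for illustration:

```javascript
// Sketch: the only per-interval size information available from two
// cumulative snapshots is an average over all packets in between.
function averagePacketSize(prev, curr) {
  const packets = curr.packetsSent - prev.packetsSent;
  const bytes = curr.bytesSent - prev.bytesSent;
  // No packets in the interval means there is nothing to average.
  return packets > 0 ? bytes / packets : null;
}

const before = { packetsSent: 100, bytesSent: 8000 };
const after = { packetsSent: 103, bytesSent: 8270 };
console.log(averagePacketSize(before, after)); // 90
```

The sizes of the three individual packets in that interval cannot be recovered from the average alone.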

Re: @fippo: getSynchronizationSources() can be polled much more frequently. It will tell you the audioLevel of packets (only received packets, but you could do a loopback to indirectly say something about sent packets too). But again, why not use WebAudio?
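
For reference, a minimal sketch of what such polling reads. It assumes receiver is an RTCRtpReceiver and uses only the standard getSynchronizationSources() method with its rtpTimestamp and audioLevel members; readAudioLevels is a name made up for this example:

```javascript
// Sketch: collect the RTP timestamp and audio level of the most recent
// packets per synchronization source on a receiver.
function readAudioLevels(receiver) {
  return receiver.getSynchronizationSources().map((src) => ({
    rtpTimestamp: src.rtpTimestamp,
    // audioLevel may be undefined when no level information is available.
    audioLevel: src.audioLevel,
  }));
}
```

In a page this would be called as readAudioLevels(pc.getReceivers()[0]), where pc is whatever RTCPeerConnection the page already holds.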

In either case, I don't mean to make the argument "there is another API that is even worse" as an excuse for us to do something bad. My question is: Is this really a problem with these APIs or is this an objection to having granted access to tracks in the first place, which is usable with a large number of APIs?

2)

Codec capabilities and encoder/decoder implementation strings are a valid fingerprinting concern.

What would be a way to mitigate these concerns? Adding a prompt on a per-API basis fails to address how confusing a "do you want to grant access to getStats?" would be to a normal user. Would hardware and media related privacy concerns be best addressed with a prompt of larger scope?

youennf commented 4 years ago

"these privacy leaks need to be addressed in the functionality of the standard"

We should first check whether, in our current model, these are leaks. For audio/video/data, WebRTC assumes pages have access to the content so I do not consider them as leaks. Isolated streams is a proposal that tries to change this model. With that proposal, we should indeed consider whether stats are leaking and I believe audioLevel does indeed leak information.

In general, stats do not seem absolutely necessary for what the user intends to do. As such, I would like them to be privacy neutral and we should probably require that. With regards to decoderImplementation, I think it can be implemented in such a way that it will not provide any more fingerprinting information than, say, the user agent string, but it might still provide an easier way to get that information. Should we add a requirement along those lines?

I am not a big fan of gating stats on getUserMedia. Some websites provide a button to report a problem. In that workflow, I could see how a prompt to gather more information might be feasible. It doesn't seem to meet the bar so far though.

pes10k commented 4 years ago

I just wanted to thank you all for tackling this seriously, even if it seems like solutions are still being worked out. I'm happy to phase out for a little bit while you all work out a solution for getting these issues addressed, to avoid adding noise, but would also be glad to be involved if there is anything I can do to help. Please just let me know how I can be most helpful.

alvestrand commented 4 years ago

For the hardware fingerprinting issue, it seems like this should be part of an overarching issue of "is the page permitted to know what hardware the user is running", and gated on a permission that isn't WebRTC-specific. This touches on UA strings, GPU API, performance API and probably many others.

henbos commented 4 years ago

Action item on me to split this up into two issues and follow up on a) and b) separately

pes10k commented 4 years ago

@alvestrand I would welcome some proposal / spec for that, and would be happy to help push it along, but (i) you probably don't want to gate the progress of this spec on that hypothetical permission / spec, and (ii) it's still important to be as narrow as possible in most cases. A global "fingerprinting end points on" switch forces users into a no-win situation; I expect a minimal capabilities model will be better in almost all cases.

youennf commented 3 years ago

@henbos, are you still working on these issues? It seems like decoderImplementation/encoderImplementation are the last remaining stats that have fingerprinting consequences and it would be good to have a resolution there.

henbos commented 3 years ago

I'm not working on this, sadly. Unassigning myself to reflect that.

henbos commented 2 years ago

We'd still like to expose power efficiency (#666) but we're blocked on this issue. I don't know how to move forward though; a user prompt seems too aggressive. What do we do in MediaCapabilities?

alvestrand commented 2 years ago

The fingerprinting mitigations outlined for MediaCapabilities are here: https://www.w3.org/TR/media-capabilities/#decoding-encoding-fingerprinting

Rate limiting isn't really possible; we can't tell a getStats that looks at this item from a getStats that doesn't.

youennf commented 2 years ago

@henbos proposed during the meeting the possibility to only expose this kind of fingerprinting past some user validation (for instance if getUserMedia/getDisplayMedia was called successfully on the document).

Maybe there is a way to phrase it in a generic way, something like: As this field is a fingerprinting vector, it MUST only be exposed to contexts that the user interacted with in a deep manner, for instance if https://w3c.github.io/mediacapture-main/#context-capturing-state returns true.
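
A minimal sketch of how that gating might look from the user agent's side. Everything here is hypothetical: redactStats and GATED_FIELDS are illustrative names, stats entries are modelled as plain objects, and isCapturing stands in for the result of the context capturing state check:

```javascript
// Hypothetical sketch: strip fingerprinting-sensitive fields from stats
// entries unless the context is currently capturing.
const GATED_FIELDS = ['decoderImplementation', 'encoderImplementation'];

function redactStats(entries, isCapturing) {
  if (isCapturing) return entries;
  return entries.map((entry) => {
    const copy = { ...entry }; // leave the original entry untouched
    for (const field of GATED_FIELDS) delete copy[field];
    return copy;
  });
}

const sampleEntries = [
  { type: 'inbound-rtp', packetsReceived: 10, decoderImplementation: 'ExampleDecoder' },
];
console.log(redactStats(sampleEntries, false));
// [ { type: 'inbound-rtp', packetsReceived: 10 } ]
```

The rest of the stats stay available to every page; only the gated fields depend on the capturing state.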

henbos commented 2 years ago

Because a) and b) (from issue description) are a little different and likely require different mitigations (e.g. a mitigation to "leaking communication / plain text" is related to granularity of packet counters etc whereas "hardware fingerprinting" is about which context we should be allowed to expose HW states) I split this issue up into different issues.

This issue can continue to be about "leaking communication / plain text". For HW fingerprinting I filed #675 and, because codec is exposed in multiple places, a separate issue for that so that we can sync with webrtc-pc: #674.

"As this field is a fingerprinting vector, it MUST only be exposed to contexts that the user interacted with in a deep manner, for instance if https://w3c.github.io/mediacapture-main/#context-capturing-state returns true."

I like this idea, let's follow up in #675

henbos commented 2 years ago

Since this issue was split up into a bunch of different sub-issues, I figured it would make more sense to replace this by a stand-alone issue #699 to make it more concise. Referenced this issue for context, but I'm closing it in favor of that one.

w3c / webrtc-stats

Stats API should require additional permission / user opt-in #550