w3c / webrtc-nv-use-cases

Use cases for WebRTC NV
https://w3c.github.io/webrtc-nv-use-cases/

Obtain user consent for one-way media and data use cases #58

Open lgrahl opened 6 years ago

lgrahl commented 6 years ago

This is a backport from https://github.com/w3c/webrtc-nv-use-cases/pull/14 that I would like to get added to WebRTC 1.0:

The application must be able to request user consent for one-way media and data-only use cases in a non-discriminating way.

Rationale

There are use cases that would love to request user consent but cannot use getUserMedia, namely unidirectional audio/video or data-only use cases. For example, security cameras, baby monitors, drones, MOOCs, remote device access, easy file transfer, multiplayer games, etc. could greatly benefit from mode 1 (the IP handling mode in which all local addresses are enumerated as host candidates). In some cases, these may be connected via a separate interface that is not the default interface. Furthermore, for data-only use cases an indirect connection can be much more restrictive than for real-time audio/video data because of the potential impact on throughput.

I suspect that the mDNS extension to hide host candidates is going to be accepted by the IETF WG and shipped in 1.0 stacks (in fact, Safari already does it). That extension is going to affect use cases that do not have user consent and thus automatically handicap all use cases mentioned above.

Implementation Suggestion

Now, this is something where I expect a lot of discussions. Implementations could add a new permission request. An example:

Will you allow filetransfer.example.org to establish a direct connection? [Learn more]


I would happily provide a PR for adding this.

youennf commented 6 years ago

Prompting has a cost. First, one needs the right message to convey all the implications to the user. When sharing camera and/or microphone, the user somehow understands that some privacy is being lost. The message you suggested might not enable the user to understand the implications of their choice. Second, the more you prompt, the more the prompts become meaningless, as users will start clicking without caring about the choice.

As for the use cases you mention above, I believe data channel use cases might be the most impacted by not exposing host candidates or exposing them through mDNS only. For audio/video cases, if one of the users is giving access to camera/mic/screen, the other user does not need to provide host candidates. If a user is a native app (remote device access, drones...), it could be its own STUN server and the native app will probably provide its host candidates. For use cases that are synthetic audio/video, it might be cheaper to send the parameters to regenerate audio/video on the other side, which then goes back to data channel only.
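
To make the difference between fully exposed host candidates and mDNS-only host candidates concrete, here is a minimal TypeScript sketch that classifies a single ICE candidate line. The names `parseCandidate` and `CandidateExposure` are hypothetical and not part of any API. The sketch assumes only the standard `candidate:` attribute layout and that an mDNS-obfuscated host candidate carries a name ending in `.local` where the IP address would otherwise be.

```typescript
// Hypothetical helper for illustration; not part of any WebRTC API.
type CandidateExposure = "host-ip" | "host-mdns" | "srflx" | "prflx" | "relay" | "unknown";

interface ParsedCandidate {
  address: string;
  port: number;
  exposure: CandidateExposure;
}

// Candidate types other than "host" map to themselves.
const NON_HOST_TYPES: Record<string, CandidateExposure> = {
  srflx: "srflx",
  prflx: "prflx",
  relay: "relay",
};

function parseCandidate(line: string): ParsedCandidate | null {
  // Accept both "candidate:..." and the SDP form "a=candidate:...".
  const body = line.trim().replace(/^a=/, "");
  if (!body.startsWith("candidate:")) return null;

  // Layout: foundation component transport priority address port "typ" type ...
  const fields = body.slice("candidate:".length).split(/\s+/);
  if (fields.length < 8 || fields[6] !== "typ") return null;

  const address = fields[4] ?? "";
  const port = Number(fields[5]);
  const typ = fields[7] ?? "";
  if (!Number.isInteger(port)) return null;

  let exposure: CandidateExposure;
  if (typ === "host") {
    // Assumption: an mDNS-obfuscated host candidate uses a ".local" name, not an IP.
    exposure = address.toLowerCase().endsWith(".local") ? "host-mdns" : "host-ip";
  } else {
    exposure = NON_HOST_TYPES[typ] ?? "unknown";
  }
  return { address, port, exposure };
}
```

In a browser, the input could be the `candidate` string of each `RTCIceCandidate` delivered by the `icecandidate` event. A page that never sees a `host-ip` result is running without direct access to local addresses.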

lgrahl commented 6 years ago

> First, one needs the right message to convey all the implications to the user. When sharing camera and/or microphone, the user somehow understands that some privacy is being lost. The message you suggested might not enable the user to understand the implications of their choice.

Fully agree, phrasing this as understandably as possible is crucial. The implementation suggestion comes from a non-native English speaker who doesn't have a great background in UX design: me. So, I'm absolutely sure it can and should be improved greatly. But the necessity remains.

> Second, the more you prompt, the more the prompts become meaningless, as users will start clicking without caring about the choice.

Granted, it's true that a user might do that (it's not going to be me, and I guess it's not going to be a privacy-affine person but there will be those people). However, we cannot play the bouncer and eventually dismiss new permission requests because the club is full of them. At least not without a viable alternative.

Edit: But just to be clear - I'm not suggesting that getUserMedia shouldn't provide access to mode 1 any more.

> For audio/video cases, if one of the users is giving access to camera/mic/screen, the other user does not need to provide host candidates. If a user is a native app (remote device access, drones...), it could be its own STUN server and the native app will probably provide its host candidates.

I believe the use case here is that both sides want to communicate directly (think of same-network IoT stuff, for example a baby monitor or a doorbell and a browser on the other side). Even if the native app provided its own STUN server (which, as a requirement, I find... weird), it would still not be reachable if browser and native app cannot communicate via the page origin's route. I'm sure @steely-glint can elaborate on the use case (and correct me if I'm wrong).

Edit: I tagged the wrong person. :man_facepalming:

youennf commented 6 years ago

Oh, mode 1... mDNS might be used to expose non-default-route candidates for some of these cases.

lgrahl commented 6 years ago

> mDNS might be used to expose non-default-route candidates for some of these cases.

When I initially filed WebKit bug 174500, I was really excited about the mDNS approach being used to resolve it. I was probably one of the very first to cheer this on and hoped that mDNS would be used for the exact purpose you are mentioning: hiding non-default-route candidates for applications that don't have consent. But so far, it has gone in the opposite direction.

And while using mDNS to hide non-default route candidates would likely improve the situation, I'm convinced that we will see consent vs. no consent diverge further. Thus, the only resolution I see is that we need to allow non-getUserMedia use cases to obtain user consent to prevent further discrimination and having this reproach bubble up every time new consent-bound things are being discussed.

youennf commented 6 years ago

I think the idea is for mDNS ICE candidates to prove their feasibility and usefulness in a smaller case. If it flies, I believe we will address the non-default routes as well.

anderspitman commented 5 years ago

I would like to add my support to this proposal. I think it makes sense to have a more specific permission for data channels. I'm currently working on a LAN game that depends on host candidates. The networking works great between Chrome and Firefox, but the only way on iOS would be to get audio/video permissions from the user (see here and here). Obviously that's a no-go. It doesn't make any sense in the context of my game. And this is only likely to get worse as Chrome and Firefox move to mDNS, until all of them are working, at which point I imagine it will get better. But who knows how long that will take. I imagine there could be at least some period of time where only one major browser is left having not implemented it, at which time I'll be stuck in the inverse of the problem I have now. If we had a more data-channel-specific permission, browser vendors could implement it as they add mDNS.

I agree the message to the user could be difficult to craft properly, but it would have to be pretty bad to be worse than asking them to take a selfie so they can play my game.

alvestrand commented 5 years ago

Not at all clear what the spec needs to say here. This might be a quality-of-implementation issue, at least until the time we have sufficient experience to enshrine something in a specification.

lgrahl commented 5 years ago

@alvestrand I don't think I understand what you're saying.

Since I've been assigned, I'll see if I can come up with a suggestion.

heyheyjc commented 5 years ago

I appreciate all the work here. I'd like to add my voice in support of much more sensible permission requests. The current irony is that in the name of privacy the user is being forced to give vastly broader permissions than data-channel dependent sites like mine, or broadcast-only sites, either need or want. Everyone I've supported so far has been happy to switch to Chrome or Firefox... less security/privacy-minded browsers. Hardly the ideal outcome.

youennf commented 5 years ago

@heyheyjc, can you describe your data-channel dependent site and what is precisely needed to make it work?

steely-glint commented 5 years ago

So here is a practical example I'm hitting: We allow a user to 'lend' access to a resource. This is done by the 'lender' establishing Data Channels with the device and the borrower. The borrower then sends credentials to the resource via the lender - the resource accepts them because of their provenance.

We initiate (and validate) this transaction with a QR code shown on the lender's browser. The borrower scans it and uses the info to open a Data Channel to the lender.

What we are seeing is that the Data Channel path is different if the borrower uses the native camera app as a QR code reader vs. when they use a JavaScript one built into our web page, which uses getUserMedia (GUM) and asks for permissions.

This means we have to encourage users to avoid the much better native QR code reader - which makes me sad.

heyheyjc commented 5 years ago

> @heyheyjc, can you describe your data-channel dependent site and what is precisely needed to make it work?

Sure. It's a collaborative screenwriting SPA with about 25,000 weekly users. When editing a script in e.g. Google Drive, users form WebRTC p2p edit groups (WebSocket signalling, Operational Transform based editing, and a Raft-based leader system for conflict control). It works great, just not for same-building Safari users (or Edge users, of course, but I'll worry about that once Edge gets above 0% on caniuse.com).

As for explaining to a bunch of paranoid and non-computer-savvy writers with overly active imaginations that that little red camera icon doesn't mean I'm recording them in their underpants? Yeah, wish me luck.

The real problem with the getUserMedia hack is that the permission request doesn't remotely match the permissions required, nor does it explain the actual risks. Example: someone doing an 'adult' stream might want to share their camera but would never want to give away their location. Since Safari doesn't inform them of that risk, what good is the restriction and permission request really? The current situation seems the worst of both worlds, not providing any genuine security or privacy benefit, while holding back legitimate uses.

So I like the idea, and only for HOST if that's possible, of a per-domain "Permission to establish direct ('peer-to-peer') connections?" plus a "What are the risks?": "This may allow the other browser to see your IP address, which potentially could be used to learn your location".

Hope this is some help.

dbrgn commented 5 years ago

> And while using mDNS to hide non-default route candidates would likely improve the situation, I'm convinced that we will see consent vs. no consent diverge further.

Chrome will apply mDNS only to host candidates of applications that have not requested a getUserMedia permission: https://bugs.chromium.org/p/chromium/issues/detail?id=930339

This is great for media based applications, but confirms your point. There is no equivalent permission request for data channel based applications. Enterprise users of DC-only applications now have the choice between accepting the privacy implications of explicitly giving microphone/camera permissions to an application that does not require microphone or camera, versus accepting the privacy implications of data traffic going through a third party TURN server even though both clients are in the same network.

I'm quite convinced that an "allow this application to establish a direct connection" permission request could resolve most of these problems.
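
Whether a given connection ended up on the direct path or behind a TURN relay can be checked from the connection statistics. The following is a minimal TypeScript sketch, not a definitive implementation. `selectedPathKind`, `StatsRecord` and `StatsReportLike` are hypothetical names, and the sketch assumes the browser reports a `transport` entry with `selectedCandidatePairId` plus `candidateType` on candidate entries, as described in the WebRTC statistics specification (not every browser does).

```typescript
// Minimal local shapes so the sketch runs outside a browser.
// Field names follow the WebRTC statistics identifiers; the type names are hypothetical.
interface StatsRecord {
  id: string;
  type: string;
  selectedCandidatePairId?: string; // on "transport" entries
  localCandidateId?: string;        // on "candidate-pair" entries
  remoteCandidateId?: string;       // on "candidate-pair" entries
  candidateType?: string;           // on "local-candidate" / "remote-candidate" entries
}

interface StatsReportLike {
  get(id: string): StatsRecord | undefined;
  forEach(callback: (stat: StatsRecord) => void): void;
}

type PathKind = "direct" | "relayed" | "unknown";

function selectedPathKind(report: StatsReportLike): PathKind {
  // Collect transport entries that name a selected candidate pair.
  const transports: StatsRecord[] = [];
  report.forEach((stat) => {
    if (stat.type === "transport" && stat.selectedCandidatePairId) {
      transports.push(stat);
    }
  });

  const pairId = transports[0]?.selectedCandidatePairId;
  const pair = pairId ? report.get(pairId) : undefined;
  if (!pair) return "unknown";

  const local = pair.localCandidateId ? report.get(pair.localCandidateId) : undefined;
  const remote = pair.remoteCandidateId ? report.get(pair.remoteCandidateId) : undefined;
  if (!local || !remote) return "unknown";

  // If either end of the selected pair is a relay candidate, traffic crosses a TURN server.
  const relayed = local.candidateType === "relay" || remote.candidateType === "relay";
  return relayed ? "relayed" : "direct";
}
```

In a page, the report would come from `await peerConnection.getStats()`. For two clients on the same network, a `relayed` result is exactly the situation described above, where traffic detours through a third-party server.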

youennf commented 5 years ago

> Sure. It's a collaborative screenwriting SPA with about 25,000 weekly users. When editing a script in e.g. Google Drive, users form WebRTC p2p edit groups (WebSocket signalling, Operational Transform based editing, and a Raft-based leader system for conflict control). It works great, just not for same-building Safari users (or Edge users, of course, but I'll worry about that once Edge gets above 0% on caniuse.com).

Does it mean that Safari is not supported at all? Do you fall back on WebSocket? What about TURN: too slow, too costly? It would be great if you are in a position to share connection success stats.

Safari Tech Preview has support for mDNS ICE candidates which should improve Safari-to-Safari connection rate.

> someone doing an 'adult' stream might want to share their camera but would never want to give away their location

Chances are their location is already given fairly accurately by their public IP address, which is already exposed by regular web browsing. I tend to think that host candidates are more used for tracking/fingerprinting users.

lgrahl commented 5 years ago

> Chances are their location is already given fairly accurately by their public IP address, which is already exposed by regular web browsing.

Let's assume that person uses a VPN, in which case camera permissions have the unintended and very surprising side effect of being much more than just a fingerprinting surface. However, I consider the misuse of getUserMedia to be a separate issue unless people generally agree that we should have only one use-case-neutral permission request to get access to host candidates (mode 1/2). In that case, the use-case-neutral one should prevail.

steely-glint commented 5 years ago

> This is great for media based applications,

Just a reminder: not all media applications are bi-directional. Receive-only media is also impacted (baby monitors, security cameras, drones, town-hall sessions, etc.). All those apps potentially suffer higher costs, higher latency and poorer privacy unless they force the user to take a selfie first (or some other gratuitous GUM call).

Let's not assume everything is a SIP phone (again).
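
For readers unfamiliar with the workaround criticized here, a "gratuitous GUM call" looks roughly like the TypeScript sketch below. It is shown only to illustrate the mismatch between the permission being requested and what the application actually needs. `gratuitousGetUserMedia` and the `*Like` interfaces are hypothetical names, and whether a granted capture permission changes which candidates are exposed, and for how long, depends on the browser.

```typescript
// Minimal local shapes mirroring the parts of the Media Capture API used here,
// so the sketch can run outside a browser. The interface names are hypothetical.
interface TrackLike {
  stop(): void;
}

interface StreamLike {
  getTracks(): TrackLike[];
}

interface MediaDevicesLike {
  getUserMedia(constraints: { audio?: boolean; video?: boolean }): Promise<StreamLike>;
}

// Requests microphone access purely for its side effect on the permission state.
// Resolves to true if the user granted access, false if the request was rejected.
async function gratuitousGetUserMedia(devices: MediaDevicesLike): Promise<boolean> {
  try {
    const stream = await devices.getUserMedia({ audio: true });
    // The media itself is not wanted, so release the device immediately.
    stream.getTracks().forEach((track) => track.stop());
    return true;
  } catch {
    // Typically a permission denial or a missing capture device.
    return false;
  }
}
```

In a page this would be called as `gratuitousGetUserMedia(navigator.mediaDevices)` before creating the peer connection. The user is asked for microphone access that a receive-only or data-only application never uses.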

heyheyjc commented 5 years ago

> Does it mean that Safari is not supported at all? Do you fall back on WebSocket? What about TURN: too slow, too costly?

No, AFAIK Safari works fine as long as they're not in the same building, though I know there are many more issues happening than I'm hearing about. My problems with TURN or WebSocket relay are:

I may well have to offer it as an option, but I really don't want to, and as I said before everyone I've dealt with so far emails back saying "Working great now with Chrome, so no worries."

Currently for me, considering the savvy-ness of my users, the permission+scary-red-camera-icon is a bad enough situation I'm not even going to try to cope with it until it's better, it's just not worth it. My app not working right for some people in some situations is a smaller problem than ever raising people's suspicions about the trustworthiness of the whole app.

> It would be great if you are in a position to share connection success stats.

I'll be gathering stats, but I was in a mad panic getting the system working in time for Google's withdrawal of the Realtime API on which it was running before, so at the moment the only thing I see is how many scripts are being actively edited. It's likely 90% are not being edited collaboratively at any given moment - in those cases the only load is a Web Socket connection for presence etc.

dontcallmedom commented 5 years ago

I'm not seeing much progress on this, and given our timeline for WebRTC 1.0, it feels unlikely it can be integrated that late in the process.

Also, it's not clear to me that this can't as easily be done via an extension spec.

lgrahl commented 5 years ago

Thanks for the reminder. I'll create a PR next week. Spec-wise, I don't expect this to be challenging.

dontcallmedom commented 5 years ago

To be clear (but not to discourage you from creating the PR), the timeline to WebRTC 1.0 needs to take into account implementation support, not just inclusion in the spec.