Production ready negotiation tools

jmc-crash commented 5 years ago

I have successfully run through the Unity Integration Tutorial and I just have a question about a comment in the documentation.

There is a caution stating that the node-dss solution used to connect two remote instances is not safe and not to be used in production. This makes sense! My question is, do you have any toolchain suggestions for a replacement?

Also, is the video and audio data sent over TCP or UDP? I understand the signaling messages are sent to the server via HTTP.

djee-ms commented 5 years ago

We do not currently have any solution to offer that works out of the box and offers production-level guarantees. There is some work scheduled for v1.1 to improve things around scheduling, but this will likely again be non-production developer tooling, albeit easier to use and more robust. We would eventually like to offer some local and remote service, but these things take time to build.

In the meantime, devs generally either have their own existing stack, or build one for their specific application. For example, you can use a simple TCP connection that you manage, or a WebSocket server if you want to talk to a browser. It all depends on your application and your requirements.
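As a rough illustration of the "simple TCP connection that you manage" option, here is a minimal Python sketch of how signaling messages could be framed on a TCP stream so that the receiver can split them apart again. The message shape (`type`, `data`), the 4-byte length prefix, and the helper names `encode_message` and `decode_messages` are assumptions made for this example; they are not part of MixedReality-WebRTC or node-dss.

```python
import json
import struct

# Assumed framing: each message is a 4-byte big-endian length followed by UTF-8 JSON.
HEADER = struct.Struct("!I")


def encode_message(msg_type: str, data: str) -> bytes:
    """Serialize one signaling message (e.g. an SDP offer or an ICE candidate)."""
    payload = json.dumps({"type": msg_type, "data": data}).encode("utf-8")
    return HEADER.pack(len(payload)) + payload


def decode_messages(buffer: bytearray) -> list:
    """Pop every complete message from the buffer; leave a partial frame in place."""
    messages = []
    while len(buffer) >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer, 0)
        end = HEADER.size + length
        if len(buffer) < end:
            break  # the rest of this frame has not arrived yet
        messages.append(json.loads(buffer[HEADER.size:end].decode("utf-8")))
        del buffer[:end]
    return messages
```

A receiver would append whatever `recv()` returns to a `bytearray` and call `decode_messages` after each read. TCP is a byte stream, so a single read may contain half a message or several messages at once, which is why some framing is needed.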

jmc-crash commented 5 years ago

I see, thanks for your answer.

Is the problem more about easy negotiation of the connection (actually establishing two peers), or the subsequent transmission (encrypting and sending via UDP)?

My specific application needs to connect HoloLens users to remote users on a desktop (who could be using a standalone app or a web browser). That requires a server to manage messages and establish connections, with the streaming then happening separately between the connected peers, doesn't it?

djee-ms commented 5 years ago

Sorry, I forgot the second part: audio and video are sent via SRTP (Secure RTP), which is a protocol layered on top of the transport rather than a replacement for UDP or TCP; in WebRTC it normally runs over UDP. Data channels use SCTP, which WebRTC tunnels over DTLS, again on top of UDP. You never need to care about those; everything is automatic.

The signaling server is only there to establish the connection, and then to renegotiate the session when tracks are added or removed. WebRTC uses it to negotiate the audio/video/data transport automatically. But it needs to do so securely because, as I understand it, some of the encryption settings for the audio/video/data transit through the signaling server, so a man-in-the-middle attack on the signaling solution would compromise your entire stack, including the security of the SRTP and SCTP pipes. So you need, for example, a TCP+TLS connection, unless you do not care about encryption because both peers are already on a secure network, in which case a simple TCP connection is enough. WebSocket is also a popular option for communicating with a peer in a browser.
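To make the TCP+TLS suggestion a little more concrete, here is a minimal Python sketch of the client side of such a connection, using only the standard `ssl` and `socket` modules. The function names `make_client_tls_context` and `open_signaling_connection`, the TLS 1.2 floor, and the 10-second timeout are assumptions for illustration, not something MixedReality-WebRTC provides or requires.

```python
import socket
import ssl


def make_client_tls_context() -> ssl.SSLContext:
    """Build a TLS context that verifies the server certificate and hostname."""
    # create_default_context() enables certificate and hostname verification
    # and loads the system's trusted CA certificates.
    ctx = ssl.create_default_context()
    # Assumed policy for this sketch: refuse anything older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def open_signaling_connection(host: str, port: int) -> ssl.SSLSocket:
    """Open a TCP connection to a (hypothetical) signaling server and wrap it in TLS."""
    ctx = make_client_tls_context()
    raw = socket.create_connection((host, port), timeout=10)
    try:
        return ctx.wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()  # do not leak the TCP socket if the TLS handshake fails
        raise
```

The point of keeping verification on is the man-in-the-middle concern described above: if the client accepts any certificate, an attacker sitting on the signaling path can still read or alter the negotiated settings even though the link is nominally encrypted.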

xwipeoutx commented 5 years ago

Do you have any guidance on production level signalers, then? Something that plays nice with standard setups (OIDC/OAuth2, deployable to Azure etc.)? Are there existing .NET libraries or anything that can add this functionality to an existing App Service?

Ideally I'd like to be able to just stand up a signalling service on Azure, much like I stand up a spatial anchor service, but it sounds like that's a ways off. :)

djee-ms commented 5 years ago

I would like nothing more than to point you to an existing production solution on Azure, but unfortunately we do not have one at the moment.

You can have a look at https://github.com/vladkol/SignalNow, which was developed by a Microsoft engineer (@vladkol). As I understand it, it is not officially endorsed by Microsoft, and I do not think it offers any production-level guarantee. But at least it should be simple and deployable on Azure. @vladkol can probably help if you have questions or issues (I haven't tried it myself yet).

microsoft / MixedReality-WebRTC

Production ready negotiation tools #115