raiden-network / raiden

Raiden Network
https://developer.raiden.network

Decide on WebRTC Implementation for direct communication #6011

Closed fredo closed 4 years ago

fredo commented 4 years ago

Abstract

As discussed in the Deep Dive, a short-term solution to increase transaction speed and scalability could be a new layer on top of Matrix. Since the light client has already implemented the WebRTC solution, the Python client aims to do so as well.

Specification

The WebRTC direct channel is intended as a separate communication layer between the clients. As a fallback, a Matrix room already exists and can be used in case of connection errors.

How it works

1) Create a room with the partner
2a) Initiate negotiation about WebRTC channels with Matrix events
2b) Start signalling to the STUN server to receive candidates for WebRTC
3) Exchange candidates
4) Open the channel and start communicating

Implementation

Signalling

Signalling is the process of negotiating how and where the connection should be established. By querying a STUN server, each client discovers candidates (network addresses under which it may be reachable), which can then be exchanged with the corresponding partner. Fortunately, Matrix already provides a handy way to do this negotiation for VoIP. The signalling negotiation between the participants is already reflected in specific Matrix message types provided by the API. To read up on this, click here.
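As a rough illustration of what that negotiation could look like on the wire, here is a minimal sketch that builds the content of the Matrix VoIP events `m.call.invite`, `m.call.candidates` and `m.call.answer`. The event types and field names follow the Matrix VoIP module; the helper function names, the `call_id` value and the SDP strings are hypothetical, and whether Raiden would reuse these exact event types for a data channel is an assumption.

```python
# Sketch only: builds event *content* dicts, it does not talk to a homeserver.
CALL_VERSION = 0  # version field of the original Matrix VoIP events


def call_invite(call_id: str, sdp_offer: str, lifetime_ms: int = 60_000) -> dict:
    """Content of an m.call.invite event carrying the caller's SDP offer."""
    return {
        "call_id": call_id,
        "version": CALL_VERSION,
        "lifetime": lifetime_ms,  # how long the invite stays valid, in ms
        "offer": {"type": "offer", "sdp": sdp_offer},
    }


def call_candidates(call_id: str, candidates: list) -> dict:
    """Content of an m.call.candidates event.

    `candidates` is a list of (sdp_mid, sdp_mline_index, candidate_line) tuples.
    """
    return {
        "call_id": call_id,
        "version": CALL_VERSION,
        "candidates": [
            {"sdpMid": mid, "sdpMLineIndex": index, "candidate": line}
            for mid, index, line in candidates
        ],
    }


def call_answer(call_id: str, sdp_answer: str) -> dict:
    """Content of an m.call.answer event carrying the callee's SDP answer."""
    return {
        "call_id": call_id,
        "version": CALL_VERSION,
        "answer": {"type": "answer", "sdp": sdp_answer},
    }
```

Each dict would be sent as the content of a room event of the matching type in the room shared with the partner; the same room can then serve as the fallback transport if the direct channel fails.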

STUN Server

For signalling, a STUN server is needed. Public STUN servers are available and could be used, or an RSB provider could set up its own STUN server.
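To make the STUN step less abstract, here is a minimal sketch of a STUN Binding request and of parsing the `XOR-MAPPED-ADDRESS` attribute from the response, following the message layout of RFC 5389. It handles IPv4 only, the function names are hypothetical, and in practice the ICE implementation inside the WebRTC library would do this for us.

```python
import os
import socket
import struct
from typing import Optional, Tuple

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id: Optional[bytes] = None) -> bytes:
    """20-byte STUN header: type, length (no attributes), cookie, transaction id."""
    if transaction_id is None:
        transaction_id = os.urandom(12)
    if len(transaction_id) != 12:
        raise ValueError("transaction id must be 12 bytes")
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + transaction_id


def parse_xor_mapped_address(response: bytes) -> Tuple[str, int]:
    """Return (ip, port) from a Binding success response (IPv4 only)."""
    if len(response) < 20:
        raise ValueError("response shorter than a STUN header")
    msg_type, msg_len, cookie = struct.unpack("!HHI", response[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        raise ValueError("not a STUN Binding success response")
    offset, end = 20, 20 + msg_len
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", response[offset:offset + 4])
        value = response[offset + 4:offset + 4 + attr_len]
        if attr_type == XOR_MAPPED_ADDRESS:
            _reserved, family, xport = struct.unpack("!BBH", value[:4])
            if family != 0x01:
                raise ValueError("only IPv4 is handled in this sketch")
            # Port and address are XORed with the magic cookie.
            port = xport ^ (MAGIC_COOKIE >> 16)
            (xaddr,) = struct.unpack("!I", value[4:8])
            ip = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
            return ip, port
        offset += 4 + attr_len + (-attr_len % 4)  # attributes are padded to 4 bytes
    raise ValueError("no XOR-MAPPED-ADDRESS attribute found")


def query_stun(server: Tuple[str, int], timeout: float = 2.0) -> Tuple[str, int]:
    """Ask the STUN server at (host, port) for our public address over UDP."""
    request = build_binding_request()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(request, server)
        response, _ = sock.recvfrom(2048)
    if response[8:20] != request[8:20]:
        raise ValueError("transaction id mismatch")
    return parse_xor_mapped_address(response)
```

Calling `query_stun((host, port))` against a reachable STUN server would return the address and port under which the client is visible from outside its NAT, which is the kind of candidate that gets exchanged with the partner.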

RTC Protocol and communication

This is definitely the trickiest part. RTC uses a variety of transmission protocols on different layers (ICE, DTLS, SCTP).

[Image: diagram_2_en]

There seems to be only one useful framework for Python: aiortc. Unfortunately, aiortc uses asyncio. Apart from that, the repo is very well maintained and straightforward to understand.

Our Options

So what are our options on this?

Outsource aiortc

Since aiortc relies on an asyncio event loop, one possible solution is to outsource aiortc into its own process, which uses asyncio.
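A minimal sketch of what such an outsourced process could look like, using only the standard library: the child process runs its own asyncio loop and answers JSON lines over stdin/stdout. The names `CHILD_SOURCE` and `AsyncioWorker` are hypothetical, the child just echoes messages where aiortc would actually live, and the parent side uses plain blocking pipes, so a gevent-based client would need cooperative I/O here instead.

```python
import json
import subprocess
import sys

# Source of the child program. It owns the asyncio event loop; in a real
# setup the aiortc peer connection would be driven from here (assumption).
CHILD_SOURCE = r"""
import asyncio, json, sys

async def main():
    loop = asyncio.get_running_loop()
    while True:
        # Read stdin in a worker thread so the event loop stays responsive.
        line = await loop.run_in_executor(None, sys.stdin.readline)
        if not line:  # parent closed the pipe
            break
        request = json.loads(line)
        reply = {"id": request["id"], "echo": request["payload"]}
        sys.stdout.write(json.dumps(reply) + "\n")
        sys.stdout.flush()

asyncio.run(main())
"""


class AsyncioWorker:
    """Parent-side handle: one JSON request per line, one JSON reply per line."""

    def __init__(self):
        self._proc = subprocess.Popen(
            [sys.executable, "-c", CHILD_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )
        self._next_id = 0

    def request(self, payload):
        self._next_id += 1
        message = {"id": self._next_id, "payload": payload}
        self._proc.stdin.write(json.dumps(message) + "\n")
        self._proc.stdin.flush()
        return json.loads(self._proc.stdout.readline())

    def close(self):
        self._proc.stdin.close()  # EOF makes the child leave its loop
        self._proc.wait(timeout=5)
        self._proc.stdout.close()
```

The messages in and out stay simple serialized values, so the rest of the client never has to touch asyncio directly; the cost is an extra process to supervise.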

Rewrite aiortc

A brief analysis of the code base led me to conclude that asyncio is used not only in the top layer but also in its building blocks. On the other hand, the code is easy to read and understand. It has about 6k lines of code. Another solution could be to "unasyncio" it and make it usable with gevent. The effort needed for this must be estimated to see if it's feasible.

Use building blocks and build own top layer

I researched the underlying building blocks, and it might be a feasible solution to take the existing building blocks that RTC uses and wrap them together on our own. Some libraries for that:

As of now, I think our own solution built from existing building blocks could result in the "least" amount of effort, but it also adds maintenance effort in the future. To be honest, outsourcing aiortc could be even less effort, but it is not the "nicest" solution. We should come together and discuss how to move forward on this.

Additional reads

andrevmatos commented 4 years ago

Outsourcing may be done in a separate thread in the same process, as messages in and out are simple and could be pickled in a socket transparently by the threading queues as usual. Another option is to support asyncio event loops inside our gevent core, which shouldn't be complicated: I've read that aiogevent is very simple and maintainable and can still successfully run some big asyncio test suites, so it could be a good and simpler bridge suited to our needs. With either solution, I think getting a standardized and comprehensive way to integrate gevent and asyncio is an important step ahead, as more and more modern libraries are adopting the asyncio API, and it also opens the door for matrix-nio, which could be a huge improvement to the Matrix codebase.

ulope commented 4 years ago

Another option to investigate: whether it is possible to run an asyncio event loop in a separate thread, and whether this is viable from a performance perspective given the GIL.

As for aiogevent, I don't think that's in any way usable at the moment. The last commit is 5+ years old. Back then gevent wasn't Python 3 compatible and Python didn't have the async / await syntax yet. On top of that, "simply" switching out the entire underlying event loop seems like a good way to break everything in novel and interesting ways ;)
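The separate-thread option mentioned above can be sketched with the standard library alone. `AsyncioThread` and `fake_send` are hypothetical names; `fake_send` stands in for an aiortc call. This assumes real OS threads: with gevent monkey patching, `threading.Thread` becomes a greenlet and the blocking `future.result()` would stall the hub, so an actual integration would need an unpatched thread and a cooperative way to wait.

```python
import asyncio
import threading


class AsyncioThread:
    """Runs an asyncio event loop in a background thread."""

    def __init__(self):
        self.loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        asyncio.set_event_loop(self.loop)
        self.loop.run_forever()

    def call(self, coro, timeout=5.0):
        """Schedule a coroutine on the loop thread and wait for its result."""
        future = asyncio.run_coroutine_threadsafe(coro, self.loop)
        return future.result(timeout)

    def stop(self):
        # stop() must be invoked from the loop's own thread.
        self.loop.call_soon_threadsafe(self.loop.stop)
        self._thread.join()
        self.loop.close()


async def fake_send(message):
    """Placeholder for an asyncio-based send over the data channel."""
    await asyncio.sleep(0.01)
    return f"sent:{message}"
```

Usage would be `bridge = AsyncioThread()`, then `bridge.call(fake_send("ping"))` from the non-asyncio side, and `bridge.stop()` on shutdown. The performance question raised above still applies, since both threads share the GIL.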

hackaugusto commented 4 years ago

There are not that many library options; we decided to go with aiortc and integrate gevent's event loop with asyncio's event loop.

https://github.com/raiden-network/raiden/issues/6198#issuecomment-634738080