OpenMined / PyGrid-deprecated---see-PySyft-

A Peer-to-peer Platform for Secure, Privacy-preserving, Decentralized Data Science
Apache License 2.0
617 stars 217 forks source link

Possible approaches to implement Websocket communication #332

Closed IonesioJunior closed 4 years ago

IonesioJunior commented 5 years ago

The purpose of this issue is to discuss about possible trade-offs of Flask-SocketIO / aiortc / websockets adoption.

IonesioJunior commented 5 years ago

Flask-SocketIO:

IonesioJunior commented 5 years ago

Websockets:

IonesioJunior commented 5 years ago

@cereallarceny could you add some information about aiortc/webrtc here?

cereallarceny commented 5 years ago

I don't know anything about Aiortc, I just merely found that link a while back. I've have not investigated anything about it.

@IonesioJunior Let me say right off the bat that my vote is going to be for PyGrid to use Websockets directly (without AsyncIO or SocketIO). Here are the following reasons behind that opinion:

  1. Control, control, control... You get a lot of control with doing native Websocket solution. Since less things are handled for you inside the library, you'll gain specificity which means less code, which means things will almost certainly run faster. And, it makes it a ton easier to debug code where you control the solution from start to end. No magic = more control.

  2. Tying yourself to Flask is a bad idea. Why? Because you simply don't need a web framework in PyGrid or grid.js. It's overkill. The grid.js project hasn't ever used one because we took a "sockets first, sockets only" approach. Perhaps there's a reason we might want to allow for HTTP endpoints to exist, but I haven't found one, and they're also quite easy to implement natively without a web framework. Even if you stick with Flask internally, there's no reason to tie yourself to that specific framework. Talk about a walled garden...

  3. Socket.io is basically dead. The primary author has admitted to having only 1 day a week at most for development and a new version was promised years ago but never provided. Seems like that kinda makes the decision for itself. If you want to go with something like socket.io, you're relying on that library contuining to exist and be supported in the future. Websockets is a specification... why not just use a library that implements that specification exactly?

Now for a few grid.js specific reasons...

  1. This is exactly what grid.js uses. It's as "close to the metal" as sockets get in Python and Javascript. The primary reason we didn't use something like socket.io in grid.js and syft.js is that syft.js is a library currently only used in the browser and therefore file size is a huge issue (and socket.io is huge). For a client-side Javascript library: the bigger your bundle size, the less likely it is that other people will use your library. It will weigh down the load time on their website.

  2. It will make parity with grid.js super easy. I have no idea what kind of magic libraries like socket.io are doing underneath and therefore it's hard to predict how they'll play with libraries that aren't using it. How will syft.js, a native Websocket client, connect to PyGrid, a socket.io server? How does PySyft, a socket.io client, connect to grid.js, a Websocket server? Beats me - I'd hate to have to figure that one out.

  3. Any algorithmic problems in one library will allow solutions to be shared between them. If there's an algorithmic problem in either library, it should be easy to patch. We'll be able to more easily keep our standards similar.

As the main author of grid.js, I'm 100% biased. But I can go ahead and say with near certainty that if you're considering a switch to Websockets that it couldn't be a better time. It's going to make API parity a ton easier.

IonesioJunior commented 5 years ago

Sounds good, I like the "close to metal" approach. But I need to study about webrtc. Currently, I can't measure the amount of effort required to implement these features (authentication/persistence/...) using webrtc / aiortc.

cereallarceny commented 5 years ago

The good news is that with WebRTC you don't actually have to write any logic. You literally just need to forward WebRTC messages to the appropriate clients. In Javascript this is about 100 lines total, so you actually don't need to know how it works at all. If the messages go to the right clients, you don't need to do anything else from a Grid perspective.

If you want we can pair program this part together since I've already written it in Javascript. :)

tallalj commented 5 years ago

my concerns:

Using webrtc within browsers makes sense, since they have it baked right inside. But with respect to python I believe web sockets is the way to go and seem more logical to me.

So I guess we should get rid of flask and move towards websockets completely, since they are "excellent implementation of RFC 6455: The WebSocket Protocol and RFC 7692"

cereallarceny commented 5 years ago

@tallalj I believe you might misunderstand what I'm asking for with respect to PyGrid. I'm not asking for PyGrid to become a WebRTC client. I'm only asking for PyGrid to add the socket server methods that syft.js needs to perform WebRTC. PyGrid will not ever need to perform WebRTC, AioRTC, or any RTC variant. It only needs to relay messages. . I've documented the exact messages it needs to implement here: https://github.com/OpenMined/grid.js/blob/master/SOCKETS.md

As a socket server, you don't need to do anything special as it pertains to WebRTC. And you certainly don't need (nor would I suggest using) Aiortc. Either way, acting as a WebRTC socket server will be a requirement for PyGrid to implement if they want to reach parity with grid.js. Assuming that's still the end goal, it's not too hard to implement the socket server portion of things. You don't need to worry about what's in the messages or how to create them. You simply need to send them to the appropriate client(s).

This implementation will need to be done in either Websockets or socket.io. Hopefully, I outlined the reasons above why we don't want to use socket.io. :)

I'm also happy to clear this up on Slack, it seems I've caused a world of confusion! Sorry!

tallalj commented 5 years ago

Okay with respect to webRTC thanks to @cereallarceny for explaining in lengthy detail and typing 1k+ words:

We don't need to implement anything else with respect to webrtc in pygrid.

guess we can create a new branch and get to work with implementing signaling server as a starter.

IonesioJunior commented 4 years ago

Solved!