crossbario / autobahn-python

WebSocket and WAMP in Python for Twisted and asyncio
https://crossbar.io/autobahn
MIT License

Add the ability to multiplex multiple ApplicationSessions over a single transport #856

Open rrueth opened 7 years ago

rrueth commented 7 years ago

First of all, thank you for creating autobahn-python (and Crossbar). Autobahn-python is a great library and has been very helpful to our company. As we've been using autobahn-python, we've been investigating how to set ourselves up to scale with increased usage. We believe we have good options for adding servers, which should take care of most problems. Additionally, we want to make sure that we are able to fully utilize any servers that we are running.

We currently have the following setup:

Given this model, we expect the number of connections from each python web server to Crossbar to grow to the tens of thousands. Since these are all essentially connections that come from the same place and go to the same place, it feels excessive to need tens of thousands of open TCP connections to maintain the ApplicationSessions.

Thus, this issue is a request to consider adding support for a single transport to be able to support multiple ApplicationSessions.

Digging through the autobahn-python codebase, it looks like ApplicationSession is fairly well isolated from the underlying transport. It also seems feasible to write a transport implementation that allows multiple ApplicationSessions to join a single transport.

Are there plans to add support for allowing multiple sessions over a single transport? Are there any issues or design decisions that would prohibit this request?

OOPMan commented 7 years ago

@rrueth Are your components written in Python using the Twisted framework? If so you can run multiple ApplicationSession instances in a single Container worker and they will share the connection to the router.

If you are working with asyncio components: Container workers do not currently support asyncio, but please see https://github.com/crossbario/crossbar/issues/1107 for progress on implementing asyncio Container workers.
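For reference, a container worker hosting several components in one native worker process is declared in the Crossbar node configuration. A minimal sketch (the component class names, realm, and transport endpoint here are illustrative, not taken from this thread):

```json
{
   "workers": [
      {
         "type": "container",
         "components": [
            {
               "type": "class",
               "classname": "myapp.ComponentA",
               "realm": "realm1",
               "transport": {
                  "type": "websocket",
                  "endpoint": {"type": "tcp", "host": "127.0.0.1", "port": 8080},
                  "url": "ws://127.0.0.1:8080/ws"
               }
            },
            {
               "type": "class",
               "classname": "myapp.ComponentB",
               "realm": "realm1",
               "transport": {
                  "type": "websocket",
                  "endpoint": {"type": "tcp", "host": "127.0.0.1", "port": 8080},
                  "url": "ws://127.0.0.1:8080/ws"
               }
            }
         ]
      }
   ]
}
```

Note that, as discussed further down in this thread, each component in a container still opens its own connection to the router; the container shares a worker process, not a transport.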

rrueth commented 7 years ago

@OOPMan thanks for the reference to #1107! We are using asyncio with uvloop, but I'll look more into containers to understand how they work.

OOPMan commented 7 years ago

@rrueth Well, Containers currently only support components written using Twisted but as per the link support for asyncio Containers should be coming in the future.

oberstet commented 7 years ago

Containers currently only support Twisted components, and the components of a container each run over a separate TCP or Unix domain socket connection - they do not use some magic multiplexed transport.

So actually, I think what you guys want are two things:

Right?

meejah commented 7 years ago

While I mostly agree that it "feels" wrong to have 10k connections between the same two endpoints, there are also going to be issues when trying to multiplex over a single (TCP) connection for this. So, I'm curious if you've done any testing and if so what actual problems arise?

When multiplexing, you will lose the existing effort that has gone into TCP fair-queuing, backpressure and the like. For example, if one of your clients is really slow, what are the effects on the Web server and/or crossbar and the throughput and latency of the other clients?
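The head-of-line-blocking concern can be illustrated with a toy simulation (this is not autobahn code, just an abstract model): on a shared multiplexed pipe, one slow logical stream delays every other stream queued behind it, whereas separate connections isolate the slowdown.

```python
# Toy model: messages for several clients drain through either one shared
# FIFO (multiplexed transport) or per-client FIFOs (one TCP stream each).
# Costs are abstract time units; the "slow" client costs 10x per message.

def drain_shared(messages):
    """All messages share one pipe: each waits for everything queued before it."""
    clock, latency = 0, {}
    for client, cost in messages:
        clock += cost                      # head-of-line blocking: cumulative
        latency[client] = clock
    return latency

def drain_separate(messages):
    """Each client has its own pipe: only its own messages queue up."""
    clocks, latency = {}, {}
    for client, cost in messages:
        clocks[client] = clocks.get(client, 0) + cost
        latency[client] = clocks[client]
    return latency

# One slow client ("A", cost 10/msg) interleaved with a fast one ("B", cost 1/msg).
msgs = [("A", 10), ("B", 1)] * 3

shared = drain_shared(msgs)
separate = drain_separate(msgs)
print(shared)    # B's last message waits behind all of A's traffic
print(separate)  # B's latency depends only on B's own traffic
```

In this model B's final-message latency is 33 units on the shared pipe but only 3 units on its own pipe, while A is barely affected either way.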

It's not at all obvious to me that running, say, 10k connections over 10k TCP streams is definitely worse than 10k "multiplexed logical streams" over one TCP stream. Certainly I'd expect the separate-stream case to use more memory, but any savings from multiplexing might be quickly eaten up by buffering, or paid for in bad experiences for other users. Dividing, say, 1GiB of RAM up between 10k connections gives each one a ton of memory to "waste" (around 100KiB).
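A quick back-of-envelope check of that per-connection budget, using the 1 GiB / 10k figures from the comment:

```python
# Per-connection memory budget when splitting 1 GiB across 10k TCP connections.
ram_bytes = 1 * 1024**3        # 1 GiB
connections = 10_000

per_conn_kib = ram_bytes / connections / 1024
print(f"{per_conn_kib:.1f} KiB per connection")   # ~104.9 KiB, i.e. "around 100 KiB"
```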

OOPMan commented 7 years ago

@meejah makes some very good points and it seems that I also misunderstood the question somewhat. I'm guessing that Crossbar clustering would prove useful for alleviating concerns in this regard as well?

meejah commented 7 years ago

The most-satisfying answer would be "don't worry, crossbar can support 100k's of connections no problem". That is, implementing better overall horizontal scaling of single realms (as @oopman hints at above).

Currently, you can get some amount of this type of scaling by placing users on different realms -- each realm is operated by its own (sub-) process inside crossbar.
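One way to spread users over realms in application code is to shard deterministically on a user identifier; the resulting realm name would then be passed as the `realm` argument of autobahn's ApplicationRunner. A minimal sketch (the realm naming scheme and shard count are made up for illustration):

```python
import hashlib

def realm_for_user(user_id: str, num_realms: int = 4) -> str:
    """Deterministically map a user to one of `num_realms` realms.

    Uses a stable hash (not Python's randomized hash()) so every
    web server picks the same realm for the same user.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % num_realms
    return f"realm{shard}"

# Each realm is routed by its own (sub-)process inside Crossbar, so
# spreading users over realms spreads routing work over cores.
print(realm_for_user("alice"))
```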

rrueth commented 7 years ago

Thanks for the clarification about containers and connections @oberstet. @meejah, I agree with your assessment that multiplexing sessions over a single connection does not necessarily result in a better experience than using an individual TCP connection for each session. I also believe that we can horizontally scale both our servers running Crossbar and our Python servers that connect to Crossbar.

We have not run many tests yet to find our limits with Crossbar (i.e. how much load we can support per Python server and per Crossbar server), and we probably won't get to that for another month or so. I raised this request after chatting with you, @meejah, in the IRC channel the other day, to start a discussion and understand whether this is a feasible request, whether there is an existing solution or something in the works, and what the issues might be with tackling it. The request primarily comes from two areas:

1. Ensuring that we can maximize resource utilization (i.e. maximize the number of devices that can contact a single one of our Python servers while minimizing the number of Python and Crossbar servers that we need to run). @meejah, the answer may be that maintaining a 1:1 relationship between sessions and TCP connections is currently the best approach.
2. We've run into issues before (not with Crossbar or autobahn-python) caused by having too many open connections between our Python servers and other endpoints (e.g. ZooKeeper, MySQL, Redis, etc.). Crossbar likely would not have exactly the same problem, but something related.

Mostly, I wanted to understand our options and where autobahn-python is headed to make sure that we're best suited for handling increasing loads.

meejah commented 7 years ago

@rrueth we are absolutely interested in having good stories around different types of scaling. Having a single logical "crossbar router" that spans many cores and machines is definitely on the roadmap (http://crossbario.com/products/roadmap/).

When you do load-testing we'd also of course be keen to see the results :)

If you are interested in commercial support (e.g. seeing particular features implemented faster) please make contact via http://crossbario.com/company/contact/

oberstet commented 7 years ago

Mostly, I wanted to understand our options and where autobahn-python is headed to make sure that we're best suited for handling increasing loads.

Here is the short story:

  1. scaling up (making use of multiple cores) and out (making use of multiple machines) of app components: we consider this mostly solved in CB already: https://github.com/crossbario/crossbar-examples/tree/master/scaling-microservices
  2. scaling up/out the CB routing core itself: this will come (as part of Crossbar.io Fabric) in two steps: a) introducing a new proxy worker type and b) adding router-to-router messaging
  3. the first line of load-balancing (towards the proxy workers) is either coming from one or more L4 LBs or directly from DNS (round-robin/HA).

Now, the communication between proxy workers and router workers, and between different router workers will need to transport many sessions.

And this, transporting many sessions over a single underlying TCP connection, is where I currently think this feature (many sessions over one transport) makes the most sense and can be of real benefit.

However, due to the specific additional requirements for communication between proxy-router and router-router, this feature could end up not being part of AB, but CB Fabric only.

where we ran into issues with having too many open connections

Systems that spawn one OS process per incoming client connection will obviously run into issues with large connection counts. E.g. this applies to PostgreSQL. To counter that, there are things like PgBouncer and PL/Proxy.

Crossbar.io doesn't have that issue. I have tested CB on my notebook with 200k connections. No problem. Of course, if those 200k connections do a lot of messaging, then that will be the limit (CB on a single Xeon core can route roughly 50k calls per second).

The messaging rate is a much more interesting limit for CB than the connection count.
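Taking the ~50k routed calls/s per core figure above at face value, a rough sizing sketch shows how capacity scales with message rate rather than connection count (the helper function and example rates are illustrative):

```python
# Rough capacity planning: cores needed scale with aggregate call rate,
# not with how many connections are open.
import math

CALLS_PER_SEC_PER_CORE = 50_000   # figure quoted in the comment above

def cores_needed(connections: int, calls_per_sec_per_conn: float) -> int:
    total = connections * calls_per_sec_per_conn
    return max(1, math.ceil(total / CALLS_PER_SEC_PER_CORE))

# 200k mostly idle connections, each averaging 1 call every 10 seconds:
print(cores_needed(200_000, 0.1))   # 200k * 0.1 = 20k calls/s -> fits on 1 core
```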

Given enough RAM and a decent NIC and kernel, you should be able to keep 1M mostly idle connections on a single box.

However, with 1M connections, doing WebSocket ping/pong heartbeating just once every 10s (e.g. http://crossbar.io/docs/WebSocket-Options/#production-settings) already means 100k pings sent and 100k pongs received per second.
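That heartbeat arithmetic can be checked directly (numbers from the comment: 1M connections, one ping/pong every 10s):

```python
# Idle-connection overhead from WebSocket heartbeating alone.
connections = 1_000_000
heartbeat_interval_s = 10

pings_per_sec = connections / heartbeat_interval_s
print(f"{pings_per_sec:,.0f} pings/s sent, and the same rate of pongs received")
# 1M connections at one heartbeat per 10s = 100,000 frames/s in each direction
```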

Anyways, hope those lines shed some light on the topic ..