zaphoyd / websocketpp

C++ websocket client/server library
http://www.zaphoyd.com/websocketpp

Is there any benchmark? #345

Open tinganho opened 10 years ago

tinganho commented 10 years ago

Are there any benchmarks for websocketpp, for example how many KB each connection consumes? Here is an example benchmark: http://urbanairship.com/blog/2010/08/24/c500k-in-action-at-urban-airship/.

zaphoyd commented 10 years ago

I haven't written up any detailed benchmark articles, as the numbers are still shifting as I optimize things. I do have some rough numbers if there are specific stats you are interested in.

Regarding memory usage per connection: the core connection uses ~300-500 bytes, depending on whether the machine is 32 or 64 bit and how many HTTP headers were sent in the handshake. The connection read buffer defaults to 16KiB, but can be configured as low as 512 bytes at a cost of reduced CPU performance on longer messages (this is usually a good tradeoff for servers with lots of idle connections). Altogether, the defaults run about 17KB/connection, and a configuration tuned for low memory usage can get as low as ~1KB/connection.

The permessage-deflate extension (available only in its separate experimental branch) drastically changes per connection memory, needing at least 6KB and as much as 96KB per connection depending on settings.

The above numbers refer to the core library only. They do not include the size of incoming or outgoing message buffers or any application data. The default message policy allocates exactly enough memory to store each incoming and outgoing message. This provides optimal memory usage at the cost of increased memory allocation/free rates. There may be policies in the future that keep message buffers around that use more memory but less CPU/system call time.
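As a back-of-the-envelope illustration of the core figures above (without permessage-deflate), the arithmetic can be sketched as follows. This is only a rough model: the function names `per_connection_bytes` and `connections_in_budget` are hypothetical and not part of WebSocket++, and the result ignores message buffers, application data, and OS socket buffers.

```cpp
#include <cstddef>

// Rough per-connection footprint of the core library, in bytes.
// core_bytes:        connection object (~300-500 bytes per the figures above)
// read_buffer_bytes: read buffer size (16384 by default, as low as 512)
// Hypothetical helper for illustration; not a WebSocket++ API.
constexpr std::size_t per_connection_bytes(std::size_t core_bytes,
                                           std::size_t read_buffer_bytes) {
    return core_bytes + read_buffer_bytes;
}

// Number of idle connections that fit in a memory budget, counting only
// the core library footprint. Returns 0 for a zero-sized connection to
// avoid dividing by zero.
constexpr std::size_t connections_in_budget(std::size_t budget_bytes,
                                            std::size_t bytes_per_connection) {
    return bytes_per_connection == 0 ? 0 : budget_bytes / bytes_per_connection;
}
```

Assuming the upper estimate of 500 bytes for the core connection, `per_connection_bytes(500, 16384)` gives 16884 bytes (about 17KB) and `per_connection_bytes(500, 512)` gives 1012 bytes (about 1KB). Under the same assumption, a 1GiB budget holds roughly 63,000 idle connections with the default buffer and a little over a million with the 512-byte buffer.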

If you have any further questions about performance or resource usage the mailing list at websocketpp@googlegroups.com would be a great place to discuss.

tinganho commented 10 years ago

I got a warning response from that mailing list saying that it might not exist or that I lacked permission.

Anyway,

I guess that when using multiple threads there is no epoll blocking, as mentioned here: http://stackoverflow.com/a/1238315/449132

zaphoyd commented 10 years ago

With respect to multiple threads when using the asio-based transport: if you mean raw I/O performance alone (like an echo server sending tons of ~0 byte messages), asio does not scale linearly with more cores due to lock contention. A second core usually makes a big difference (50-100% faster); beyond that, the scaling drops off pretty quickly. Keep in mind that at this point we are talking about tens to hundreds of thousands of messages per second, and you often run up against OS and network tuning limits.

In practice, message handlers end up doing a lot more than just immediately reflecting the input bytes, so the additional cores allow a program to maintain the same message rate with larger or more complicated messages.
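To make the lock contention point concrete, here is a toy model using only the standard library. It is not how asio or WebSocket++ is implemented; the function `drain_shared_queue` is a hypothetical name, and the single mutex-protected queue simply stands in for any shared state that every worker thread has to lock before it can pick up work.

```cpp
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Drains num_messages dummy messages using num_threads workers that all
// share one mutex-protected queue. Returns the number of messages handled.
// Hypothetical illustration only; not an asio or WebSocket++ API.
std::size_t drain_shared_queue(std::size_t num_threads, std::size_t num_messages) {
    std::queue<int> pending;
    for (std::size_t i = 0; i < num_messages; ++i) {
        pending.push(static_cast<int>(i));
    }

    std::mutex queue_lock;
    std::vector<std::size_t> handled(num_threads, 0);  // one counter per worker

    auto worker = [&](std::size_t id) {
        for (;;) {
            {
                // Every worker serializes on this lock to take a message.
                std::lock_guard<std::mutex> guard(queue_lock);
                if (pending.empty()) {
                    return;
                }
                pending.pop();
            }
            // The "handler" runs outside the lock. With trivial handlers the
            // lock dominates; with heavier handlers extra threads help more.
            ++handled[id];
        }
    };

    std::vector<std::thread> threads;
    for (std::size_t t = 0; t < num_threads; ++t) {
        threads.emplace_back(worker, t);
    }
    for (auto& th : threads) {
        th.join();
    }

    std::size_t total = 0;
    for (std::size_t count : handled) {
        total += count;
    }
    return total;
}
```

When the work done outside the lock is close to nothing, as with the near-empty echo messages described above, adding threads mostly adds waiting on `queue_lock`. As the handler does more work per message, a larger share of the time is spent outside the lock and additional cores become more useful.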

The multithreaded scaling question is an active research area. I do a fair amount of profiling, and when I find major bottlenecks in common workflows I try to address them. If you do any profiling of your application and identify any bottlenecks tracing back to WebSocket++, please share.