GuacheSuede opened 5 years ago
@boazsegev Thanks for the detailed response. I think libh2o / the H2O server is very similar to facil.io in performance too.
@boazsegev would you be able to pinpoint the code segment that throttles?
@GuacheSuede , the throttling code is here.
Basically, once the `pipeline_limit` is reached (8 requests), facil.io stops handling further requests and forces the connection to the end of the event queue, which means that IO reactor polling will occur before any further request handling.
P.S. (edit)
Also, the connection itself is excluded from the IO reactor during this cycle.
@boazsegev Assuming overhead was of no concern within reasonable limits, what would you improve in facil.io to increase speed?
@GuacheSuede , that's a great question.
Performance isn't just speed. Security, memory use, energy consumption and development ease are also considerations (IMHO).
Honestly, I don't know where I would improve.
Testing with Valgrind shows that a lot of time is spent on string data management and hash map rehashing (especially when adding headers to the output and formatting the HTTP output).
This is a direct result of copying data from the HTTP parser and delaying data "writes" to the HTTP output stream...
This was a compromise between speed on one hand and memory consumption and security on the other (as well as some features that resulted from the approach). I debated about this choice for a long time, since I'm saving memory and adding flexibility at the expense of CPU.
On the other hand, I provide example code that avoids data copying (while sacrificing the resulting features and flexibility)... So people who prefer speed can easily use facil.io for speed.
My current performance wishes include:
- Lower energy consumption by finding a better solution for thread signaling. Currently `nanosleep` replaces signaling, since benchmarks show it's faster. However, facil.io can't use signals (since the user code might use signals) and the energy consumption for `nanosleep` is slightly higher than pipe polling (which can be enabled on the development branch by compiling with `-DFIO_DEFER_THROTTLE_POLL=1` and currently only works on Linux). So it seems the current approach might stay a while.
- Optimized String management (see the inline String library in the `fio.h` header).
Valgrind profiling shows that both String management and HashMap rehashing are used often.
@GuacheSuede ,
P.S.
I would point out that, on my list of things to do at the moment, SSL/TLS, HTTP/2 and PostgreSQL integration have a higher priority than most performance concerns.
Thanks for the detailed reply @boazsegev. Have you considered using green threads to speed it up?
HTTP/2: yup, I agree that is more important. May I ask what is the current status?
@GuacheSuede
> may I ask what is the current status?
I'm a single person working on this project in my spare time. Things move along at intervals and I make no promises as to when, or if, any future features will be released.
In practice, HTTP/2 requires SSL/TLS. So this needs to be done before I start working on the HTTP/2 layer.
At the moment, I finished re-writing and testing the Read/Write hooks and I finished designing an initial SSL/TLS API draft.
These two layers are important, since they will allow SSL/TLS connections to use the same API as non-SSL/TLS connections, making security easier for developers.
Currently I hope to support both OpenSSL and BearSSL using buffered IO (BIO). However, I'm a little busy with other projects and probably won't get to it very soon.
Once I finish with the SSL/TLS layer I'll be able to move on to the HTTP/2 part.
> Have you considered using green threads to speed it up?
Green thread implementations might be too architecture specific.
I'm trying to avoid CPU assumptions when possible.
Also, green threads add a number of concerns that facil.io won't be able to handle. For example, what if a database library uses a blocking IO function call? In such a case, the green thread would block the whole framework, while today's multi-threading approach only blocks a single worker thread.
Instead, facil.io offers an evented API. If the evented API is properly utilized, a single thread should perform similarly to green threads (with the added benefit that more than one thread can be utilized).
@GuacheSuede ,
Thank you for your question.
At the moment, I don't know.
I found the old lwan TechEmpower benchmarking code, but it seems that the framework isn't benchmarked anymore. There's a recent discussion about this, so it might be added soon.
This means that the only way to answer is to benchmark the framework independently, which I didn't do yet.
However, here are a few things to consider:
lwan uses a no-copy approach to headers, which will provide more speed for simple applications.
facil.io offers a no-copy parser (see the raw-http.c example), but the normal HTTP library copies the data to the HTTP handle. This is mainly designed for ease of use but it also helps with security and future HTTP/2 support (abstracting away protocol differences).
In this regard, facil.io prefers development ease over performance, while lwan appears to prefer performance.
lwan supports pipelining, and seems to be giving pipelined connections a performance benefit.
facil.io supports pipelining but throttles pipelined connections to prevent DoS attacks (they still perform faster than regular requests, but not much faster).
In this regard, facil.io prefers security, while lwan appears to prefer performance. Note that this greatly affects the TechEmpower "plaintext" benchmarks (which utilize pipelining).
lwan appears to have a CGI approach - the HTTP callback's return value is used to determine the HTTP response.
facil.io allows for an evented approach, pausing and resuming the HTTP handler. This allows facil.io to continue handling tasks while the HTTP request waits for background tasks to complete (such as database queries, etc.).
In this regard, facil.io prefers performance for complex requests over simple / "hello world" requests, probably resulting in lwan appearing faster where simple requests are concerned.
The libraries also appear to have a number of different features:
facil.io offers a multi-thread + multi-process hybrid approach. I think lwan is only multi-threaded (I'm not sure).
facil.io offers a pub/sub API and a Redis connector for horizontal scaling. I don't think lwan has these features.
lwan offers caching (facil.io doesn't provide a caching mechanism).
lwan appears to offer dynamic HTTP compression. facil.io only offers support for pre-compressed data / files.
Depending on how you use the library, this might be bad or good. For example, dynamic compression for HTTP might expose security issues in SSL/TLS based connections (see details in the HTTP/2 specification).
This is pretty much what I could understand after reading through some of the lwan website and documentation. I haven't tested it yet and didn't read the source code, so I'm not sure.
According to TechEmpower's benchmarks, facil.io currently works best on small systems (Docker images, etc.) and might not utilize multi-core systems as effectively as other solutions (this might be a process/thread ratio issue in the physical server benchmarks).
Please note that the latest TechEmpower benchmarks use facil.io 0.6.x. The 0.7.0.beta2 should be faster.
EDIT:
I received a comment about facil.io being ranked very low on the TechEmpower plain text benchmarks.
This is true and is caused by two main facts:
facil.io throttles pipelined requests to protect against DoS while the benchmarks use pipelined requests (as discussed here); and
facil.io's TechEmpower application is optimized for the cloud benchmarking environment rather than the physical one.
Also, facil.io ranks very high (94.8% out of a maximum of 100% performance speed) on the JSON serialization benchmark in the cloud environment for round 17.
I would also note that facil.io was one of the highest ranking frameworks on round 17 where overhead was concerned (speed isn't the only measure for performance).