overvenus opened 7 years ago
It seems that we only use one event loop.
@overvenus Can we use multiple threads (one thread running one event loop)?
@siddontang multithreaded event loop with futures-mio-tokio is an area I haven't started exploring yet.
However, at this moment grpc-rust lacks a lot of simpler performance fixes, the most important are excessive memory allocations and memory copying.
As for multithreading, mio does not support multiple threads accessing the same event loop concurrently.
However, there is a simple workaround: SO_REUSEPORT. We can set up multiple event loops on the same port; in fact, that is what tokio-proto does.
See more:
@overvenus If it is not hard to support SO_REUSEPORT, maybe we can send a PR.
/cc @stepancheg
SO_REUSEPORT only works in Linux kernel 3.9+.
If we can't use SO_REUSEPORT, we can have every event loop use the same listening FD and use a lock, as nginx does, to avoid the thundering herd problem.
@stepancheg, with tokio-core deprecated and the latest tokio supporting multithreaded event loops, are there plans to rewrite with the newest tokio?
Looks like grpc-rust is still essentially single-threaded, processing one message/event at a time regardless of how many threads one creates with server.http.set_cpu_pool_threads. Is that still the case, or am I missing something?
Any plans here, or anything we can help out with, @stepancheg? We have some pretty heavy computation in our requests that absolutely needs to be processed on all cores in parallel.
@repi no, grpc-rust is fully concurrent.
If server.http.set_cpu_pool_threads is specified, the server callback is executed in the thread pool, which is useful for synchronous processing (e.g. synchronous I/O). However, it should be fully concurrent even without the thread pool, if the input and output streams are used correctly.
I don't understand what the issue is; maybe there's a bug. I'd like to know more.
@Kane-Sendgrid I didn't look at the latest tokio; AFAIU it's unreleased, right?
Thanks for confirming @stepancheg . I did some more investigation and the server event processing is indeed running in parallel, great!
Did some more testing in our (early) scenario, and it looks like the lack of parallelism is on the client side of grpc-rust. We create a single client with Client::new_plain, make parallel calls with it to initiate RPCs (of the simple unary type), and after issuing all the calls we wait on them. In this scenario the RPCs appear to be sent to the server (also grpc-rust) serially and blocking, so the server only receives them one by one and can't process them in parallel from this client.
If we instead create a Client for each thread that issues RPCs in our app, then we don't have this serial bottleneck.
How is a client object intended to handle parallel calls?
@repi again, I'm not sure I understand.
The client is also fully concurrent. The client can execute multiple concurrent requests. The client can be shared by multiple threads.
However, the client doesn't use a thread pool. Practically, this means that if you supply a stream to a request, that request is polled from the event loop, and if the request data supplier blocks, the whole client blocks. But this is not an issue for unary calls.
Hi @stepancheg, thank you for the excellent work.
I've found that current performance is not very good (about 10x slower than Go). I know it is on the TODO list, but I wonder if there are any plans for optimizing performance.
Benchmark
Git HEAD @ 7e2ec8811440711aa38642055be4e8f01d32dad6
Based on long-tests.
Rust server and client were built under release.
Machine:
Streaming benchmarks are skipped because streaming support is incomplete.
Unary request: Echo
I tweaked the Go client a little bit; my version runs 40 goroutines.
Client sends 100000 echo requests.
FlameGraph
I also recorded a flame graph, hope it helps.
https://gist.github.com/overvenus/018e19ccc23555a7768e15774819f3af#file-kernel-7e2ec88-svg
Thank you! :)