ReactiveX / RxJava

RxJava – Reactive Extensions for the JVM – a library for composing asynchronous and event-based programs using observable sequences for the Java VM.
Apache License 2.0

3.x: parallel performs poorly with 10+ parallelism #6931

Open akarnokd opened 4 years ago

akarnokd commented 4 years ago

For some reason, the parallel Scrabble benchmark performs poorly when the parallelism level is 10+, for example, on my i7 8700 CPU (6 cores/12 threads):

image

However, my older i7 4770K processor (4 cores/8 threads) shows no such performance degradation. Neither does the reactive-streams-commons implementation (the parent of RxJava's parallel implementation) with parallelism=12. Correction: The Rsc benchmark was pinned to 8 threads and actually shows a similar inefficiency with 10+.

akarnokd commented 4 years ago

I did a different implementation but the degradation isn't gone, just reduced:

image

With the new code organization, the performance is slightly worse at P=1 and P=6 and somewhat better at higher Ps. The others are likely within the noise limit.

image

I'm starting to think the underlying issue is that one thread simply can't drive that many rails that fast, thus the round-robin dispatching will result in a high volume of scheduling activity (also hinted by Java Flight Recorder).
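To make the round-robin concern concrete, here is a minimal stdlib-only sketch. It is not RxJava's actual dispatch code, and the class and method names (`RoundRobinSketch`, `dispatch`, `railSwitches`) are hypothetical. It only models the assignment pattern: with more than one rail, every consecutive item lands on a different rail, so the single dispatching thread has to hand off (and potentially wake a worker) once per item.

```java
import java.util.ArrayList;
import java.util.List;

public class RoundRobinSketch {

    // Hypothetical model: distribute items 0..n-1 over `rails` queues, one item per rail in turn.
    static List<List<Integer>> dispatch(int n, int rails) {
        List<List<Integer>> queues = new ArrayList<>();
        for (int r = 0; r < rails; r++) {
            queues.add(new ArrayList<>());
        }
        for (int i = 0; i < n; i++) {
            queues.get(i % rails).add(i);
        }
        return queues;
    }

    // Counts how often the dispatcher moves to a different rail between consecutive items.
    static int railSwitches(int n, int rails) {
        int switches = 0;
        for (int i = 1; i < n; i++) {
            if (i % rails != (i - 1) % rails) {
                switches++;
            }
        }
        return switches;
    }

    public static void main(String[] args) {
        System.out.println(dispatch(6, 3));          // [[0, 3], [1, 4], [2, 5]]
        System.out.println(railSwitches(1200, 12));  // 1199
    }
}
```

In this model, the number of rail switches grows with the item count regardless of the parallelism level, which is consistent with the idea that the dispatching thread, not the rails, becomes the bottleneck.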

akarnokd commented 4 years ago

If I implement batch-dispatching, the scheduling overhead appears to be mostly eliminated:

image
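As a rough illustration of why batching could help, the sketch below models batch-dispatching as handing a run of consecutive items to one rail before moving to the next. This is an assumption about the general idea, not the implementation referred to above; the class name, method names, and the batch size of 100 are all hypothetical.

```java
public class BatchDispatchSketch {

    // Hypothetical model: items are grouped into chunks of `batch`, and chunks go to rails in turn.
    static int railFor(int item, int rails, int batch) {
        return (item / batch) % rails;
    }

    // Counts how often the dispatcher moves to a different rail between consecutive items.
    static int handoffs(int n, int rails, int batch) {
        int count = 0;
        for (int i = 1; i < n; i++) {
            if (railFor(i, rails, batch) != railFor(i - 1, rails, batch)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // batch = 1 degenerates to plain round-robin: one handoff per item.
        System.out.println(handoffs(1200, 12, 1));    // 1199
        // With an assumed batch size of 100, the dispatcher switches rails only between chunks.
        System.out.println(handoffs(1200, 12, 100));  // 11
    }
}
```

The trade-off in this model is that larger batches reduce handoffs but make the distribution coarser, so a short source may leave some rails idle.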

muralik09 commented 4 years ago

You have to consider a lot of aspects when making parallel calls.

If one request wants to make 10 parallel calls and your server supports only 12 threads, what about the second request? It will have to wait for the first request to release its threads.

You also have to check whether all 12 threads are actually allocated to your program.

etc...