The problem: using a separate thread for every subscription is resource-heavy, reduces performance due to excessive context switching, and limits the number of subscriptions that can be created (the limit depends on the OS, but in practice it is reportedly around a few tens of thousands of threads).
The solution: use a thread pool to distribute message processing from all subscriptions across a reasonable number of threads, and use semaphores to ensure that each subscription's messages are processed sequentially.
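To illustrate the idea, here is a minimal sketch using only the Ruby standard library. It is not the code from this pull request: `WorkerPool`, `PooledSubscription`, and `deliver_to_all` are hypothetical names, and a per-subscription `Mutex` stands in for the semaphore. It assumes each subscription keeps its own pending queue, so whichever worker takes the lock first always picks up that subscription's oldest message.

```ruby
# Hypothetical sketch, not the pull request's implementation.

# A fixed set of worker threads pulling tasks from one shared queue.
class WorkerPool
  def initialize(size)
    @tasks = Queue.new
    @workers = Array.new(size) do
      Thread.new do
        # Queue#pop returns nil once the queue is closed and drained.
        while (task = @tasks.pop)
          task.call
        end
      end
    end
  end

  def post(&task)
    @tasks << task
  end

  # Stop accepting tasks, let the workers finish what is queued, then join.
  def shutdown
    @tasks.close
    @workers.each(&:join)
  end
end

# One subscription: messages wait in arrival order and are handled one at a time.
class PooledSubscription
  def initialize(&handler)
    @handler = handler
    @pending = Queue.new # this subscription's messages, oldest first
    @lock = Mutex.new    # acts as a binary semaphore (assumption)
  end

  def push(message)
    @pending << message
  end

  # Run by a pool worker; handles exactly one pending message.
  def process_one
    @lock.synchronize { @handler.call(@pending.pop) }
  end
end

# Sends message_count messages to every subscription through a shared pool
# and returns what each subscription received, in order of processing.
def deliver_to_all(subscription_count:, message_count:, pool_size:)
  pool = WorkerPool.new(pool_size)
  logs = Array.new(subscription_count) { [] }
  subs = logs.map { |log| PooledSubscription.new { |msg| log << msg } }

  message_count.times do |i|
    subs.each do |sub|
      sub.push(i)
      pool.post { sub.process_one }
    end
  end

  pool.shutdown
  logs
end
```

Calling `deliver_to_all(subscription_count: 1000, message_count: 10, pool_size: 4)` uses four threads instead of a thousand, and every subscription still sees its messages in the order they were sent.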
[x] Implementation
[x] Benchmarks
Benchmarks
I ran `benchmarks/pub_sub_perf.rb` on this pull request and on current `main`, sending 10 messages and receiving them in a varying number of subscriptions (from 1 to 30 000).
As can be seen, the two versions perform comparably up to 5 000 subscriptions (the thread pool is slightly slower at low counts); beyond that, the performance of the thread-per-subscription version starts to degrade quickly, and at 30 000 subscriptions it takes about 3.5 times as long as the thread pool version. The practical limit on the number of threads on my machine is somewhere above 32 000, so I didn't benchmark beyond 30 000 subscriptions. The thread pool version, however, works fine with 100 000 subscriptions, for example.
Raw data
```csv
Subscriptions,Thread pool,Individual threads
1,0.22561474199756049,0.21783994200086454
5,0.22992107199752354,0.22141029100021115
10,0.23651741900175693,0.22182863399939379
50,0.24648931400224683,0.23073389499768382
100,0.26153432600040105,0.22641186799955904
500,0.35329902800003765,0.26661403700200026
1000,0.36710690700056148,0.31920890200126451
5000,0.90028256000005058,0.96799352699963492
10000,1.4961059820016089,2.2726283129995863
20000,2.0715542360012478,6.1912920100003248
30000,3.1422558439990098,11.099830663002649
```
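To compare the two columns row by row, a small helper like the following can compute the ratio of the individual-threads time to the thread pool time (values above 1 mean the thread pool was faster). `speedups` is a hypothetical helper written for this description, not part of the benchmark script; the sample rows are copied from the raw data above.

```ruby
# Returns [[subscriptions, ratio], ...] where
# ratio = individual-threads time / thread-pool time.
def speedups(csv_text)
  rows = csv_text.lines.drop(1).reject { |line| line.strip.empty? }
  rows.map do |line|
    count, pool, threads = line.strip.split(",")
    [Integer(count), Float(threads) / Float(pool)]
  end
end

sample = <<~CSV
  Subscriptions,Thread pool,Individual threads
  1000,0.36710690700056148,0.31920890200126451
  30000,3.1422558439990098,11.099830663002649
CSV

speedups(sample).each do |count, ratio|
  puts format("%6d subscriptions: %.2fx", count, ratio)
end
```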