Gsantomaggio closed this pull request 11 months ago.
Attention: 34 lines in your changes are missing coverage. Please review.
Comparison is base (90a6752) 92.60% compared to head (c7c66a5) 92.94%. Report is 2 commits behind head on main.
The issue was raised by @jonnepmyra with a specific use case.
With the stable implementation, we have:
| Method | Mean | Error | StdDev | Ratio | RatioSD |
|----------------------- |----------:|----------:|----------:|------:|--------:|
| AmqpStreamConsumer | 10.39 ms | 1.454 ms | 4.287 ms | 1.00 | 0.00 |
| StreamProtocolConsumer | 102.34 ms | 18.856 ms | 55.597 ms | 10.41 | 5.89 |
107 ms per `StreamProtocolConsumer` vs 10.39 ms per `AmqpStreamConsumer`.
With this PR:
| Method | Mean | Error | StdDev | Median | Ratio | RatioSD |
|----------------------- |---------:|---------:|----------:|---------:|------:|--------:|
| AmqpStreamConsumer | 12.95 ms | 0.993 ms | 2.928 ms | 12.28 ms | 1.00 | 0.00 |
| StreamProtocolConsumer | 28.29 ms | 5.076 ms | 14.967 ms | 24.05 ms | 2.31 | 1.38 |
28.29 ms per `StreamProtocolConsumer` (down from ~107 ms) vs 12.95 ms per `AmqpStreamConsumer`.
PoolConfiguration:
```csharp
ConnectionPoolConfig = new ConnectionPoolConfig()
{
    ConsumersPerConnection = 20,
    ProducersPerConnection = 1,
}
```
@jonnepmyra FYI: by reducing `InitialCredits` to 1 you can improve it further:
| Method | Mean | Error | StdDev | Ratio | RatioSD |
|----------------------- |---------:|---------:|---------:|------:|--------:|
| AmqpStreamConsumer | 10.70 ms | 1.334 ms | 3.932 ms | 1.00 | 0.00 |
| StreamProtocolConsumer | 19.60 ms | 3.127 ms | 9.219 ms | 1.95 | 1.07 |
PoolConfiguration:
```csharp
ConnectionPoolConfig = new ConnectionPoolConfig()
{
    ConsumersPerConnection = 20,
    ProducersPerConnection = 1,
}
```
Consumer:
```csharp
var config = new RawConsumerConfig(_streamName)
{
    ClientProvidedName = "STREAM-CONSUMER",
    OffsetSpec = new OffsetTypeOffset(_offset),
    InitialCredits = 1,
    // Benchmark handler: signal completion as soon as a message arrives.
    MessageHandler = async (consumer, context, message) => { tcs.TrySetResult(); }
};
```
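As far as I understand, `InitialCredits` controls how many chunks the broker may deliver before the consumer grants more credits, so a low value keeps the initial delivery small when you only need a few messages, at the cost of more credit requests for high-throughput consumers.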
Great! I'll aim to test it tomorrow when I have some time. The improvements seem even more significant in real-world scenarios than in these isolated benchmarks.
This modification should significantly enhance performance for cases involving time-traveling streams, especially when dealing with a small number of messages. In such scenarios, a substantial portion of time is typically consumed in establishing the connection.
Appreciate your contribution, @Gsantomaggio!
I've tested this PR in both our integration test suites and real-workload scenarios, and it's performing really well, especially since 15fe9dc1098953696f4974e532f089a0f715b909.
Notably, the creation of consumers has shown a significant improvement, approximately 4-5 times faster than before. This enhancement is observed as long as the connection pool is not drained of active consumers.
A note, even though it falls outside the scope of this PR: currently, the underlying connection of the "pool" is closed when there are no active consumers, which slows down the creation of new consumers. It would be beneficial to keep the underlying connection open even in the absence of consumers, similar to an AMQP connection, which can exist without any active channels or consumers; that would make consumer creation considerably faster.
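Purely as an illustration of what I mean (nothing like this exists in the client today; `IdleConnectionTimeout` is a made-up name):
```csharp
// Made-up option for illustration only: let an empty pooled connection
// survive for a grace period instead of being closed as soon as the last
// consumer goes away.
ConnectionPoolConfig = new ConnectionPoolConfig()
{
    ConsumersPerConnection = 20,
    ProducersPerConnection = 1,
    // Hypothetical setting, not part of this PR:
    // IdleConnectionTimeout = TimeSpan.FromSeconds(30),
};
```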
Thanks for fixing this issue!
I am going to merge the PR. It requires more work to handle the metadata update, but I prefer to open another PR for that.
Preface
The RabbitMQ stream protocol supports multiple producers and multiple consumers per TCP connection. We designed the .NET stream client to use one TCP connection per producer/consumer to keep it simple.
Pull request
This PR adds the feature without impacting the code too much, even though some changes are required.
We added a new class: `ConnectionsPool`. The pool is a Dictionary keyed by `client_id`, with the `connection_info` as value. The `client_id` is an internal field used to identify the connection uniquely. The `connection_info` contains info like the `ActiveIds` used for that connection. The values of the IDs don't matter; only the count does. For example, if a connection has producer ids 10, 11, 45, the `ActiveIds` count is 3.
How does it work?
Suppose I want to have at most two ids per connection:
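For example, something along these lines; this is only a sketch, where the stream name is made up and `CreateRawProducer`/`RawProducerConfig` are assumed from the existing client API, while `ConnectionPoolConfig`/`ProducersPerConnection` come from this PR:
```csharp
// Sketch only: two producers created against the same StreamSystem share one
// TCP connection because the pool allows up to two producer ids per connection.
var system = await StreamSystem.Create(new StreamSystemConfig()
{
    ConnectionPoolConfig = new ConnectionPoolConfig()
    {
        ProducersPerConnection = 2, // at most two producer ids per connection
    }
});

var p1 = await system.CreateRawProducer(new RawProducerConfig("my-stream"));
var p2 = await system.CreateRawProducer(new RawProducerConfig("my-stream"));
// The pool now holds one connection whose ActiveIds count is 2.
```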
In this case, there is a single connection with two ids, see the image:
Calling `p1.close()` reduces the reference count on the pool for that connection from two to one. Calling `p2.close()` reduces it from one to zero; at this point, the connection is closed. Inside `ConnectionsPoolTests`, you can find the tests with comments for each use case.
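A minimal sketch of the bookkeeping described above; this is an illustration of the idea, not the real `ConnectionsPool` class, and all names in it are made up:
```csharp
using System.Collections.Generic;

// Illustration only: a dictionary keyed by client_id whose values track the
// ids active on each connection; only the count matters for closing.
class ConnectionInfoSketch
{
    public HashSet<byte> ActiveIds { get; } = new();
}

class ConnectionsPoolSketch
{
    private readonly Dictionary<string, ConnectionInfoSketch> _connections = new();

    public void Register(string clientId, byte id)
    {
        if (!_connections.TryGetValue(clientId, out var info))
            _connections[clientId] = info = new ConnectionInfoSketch();
        info.ActiveIds.Add(id);
    }

    // Returns true when the last id is released, i.e. the connection can be closed.
    public bool Release(string clientId, byte id)
    {
        var info = _connections[clientId];
        info.ActiveIds.Remove(id);
        if (info.ActiveIds.Count > 0)
            return false;              // connection still used by other ids
        _connections.Remove(clientId); // ActiveIds reached zero: close the connection
        return true;
    }
}
```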
How to test
I am using this script where I can tune the parameters:
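The actual script isn't included here; purely as an illustration, a loop along these lines (stream name, counts and pool sizes are made-up parameters) lets you play with the pool settings:
```csharp
// Illustration only: open many consumers and observe how many TCP connections
// the pool creates (ConsumersPerConnection and consumerCount are the knobs).
var system = await StreamSystem.Create(new StreamSystemConfig()
{
    ConnectionPoolConfig = new ConnectionPoolConfig()
    {
        ConsumersPerConnection = 20,
        ProducersPerConnection = 1,
    }
});

const int consumerCount = 100;
for (var i = 0; i < consumerCount; i++)
{
    await system.CreateRawConsumer(new RawConsumerConfig("my-stream")
    {
        OffsetSpec = new OffsetTypeFirst(),
        MessageHandler = (_, _, _) => Task.CompletedTask,
    });
}
// With ConsumersPerConnection = 20 and 100 consumers, the pool should open
// about 5 TCP connections instead of 100.
```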
I used `make rabbitmq-ha-proxy` from the golang repo to get a cluster up and running.
Note:
We turned off the test `TheProducerPoolShouldBeConsistentWhenAStreamIsDeleted` because it requires changing `Action<MetaDataUpdate> MetadataHandler` to an `event` to support multiple handlers. I would prefer to have it in another PR.
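To illustrate the motivation (generic C#, not the client's code): with a plain `Action` property, each plain assignment replaces the previous handler, so the pool and the user cannot both hook metadata updates safely, while an `event` lets multiple subscribers attach independently.
```csharp
using System;

// Generic illustration, not the client's code.
var a = new WithAction();
a.MetadataHandler = s => Console.WriteLine($"pool handler for {s}");
a.MetadataHandler = s => Console.WriteLine($"user handler for {s}");
// Only the second handler survives: plain assignment replaces the first one.
a.MetadataHandler?.Invoke("my-stream");

var e = new WithEvent();
e.MetadataUpdated += s => Console.WriteLine($"pool handler for {s}");
e.MetadataUpdated += s => Console.WriteLine($"user handler for {s}");
e.Raise("my-stream"); // both handlers run

class WithAction
{
    public Action<string> MetadataHandler { get; set; }
}

class WithEvent
{
    public event Action<string> MetadataUpdated;
    public void Raise(string stream) => MetadataUpdated?.Invoke(stream);
}
```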