Closed Jmgr closed 3 years ago
There seem to be a CI failure (a runtime panic I can't reproduce locally), will have a look.
Did a few tests this morning that have shown an issue with publishing/subscribing timeouts on high load (probably the unavailable
issue mentioned earlier). Will have to fix that before I can evaluate any performance benefits.
There seem to be a CI failure (a runtime panic I can't reproduce locally), will have a look.
Fixed.
This PR refactors the client to use and maintain only one gRPC connection per broker.
Publications are sent on brokers depending on the stream and partition using consistent hashing. This should allow multiple publications towards to the same stream to be sent to the same broker, which should help with keeping message ordering. API calls (except subscriptions and cursors) are randomly distributed on the brokers. This doesn't use the
Picker
orResolver
gRPC interfaces yet.Following this change the connection pool has been removed as it was no longer needed. The
MaxConnsPerBroker
andKeepAliveTime
client options are not used anymore either.Missing parts include retrying publication in case one of the brokers is not responding anymore or returns an non-critical error (like
unavailable
), and some comments. The naming of some functions could probably be improved, and I'm not completely satisfied by theackReceived
callback.I am creating this PR as a draft for now. While the tests are passing I have not been able to do performance tests yet. I will update this PR when that has been done.
This fixes https://github.com/liftbridge-io/go-liftbridge/issues/99.