liftbridge-io / go-liftbridge

Go client for Liftbridge. https://github.com/liftbridge-io/liftbridge
Apache License 2.0
66 stars 18 forks source link

Distribute API calls among brokers #107

Closed Jmgr closed 3 years ago

Jmgr commented 3 years ago

This PR refactors the client to use and maintain only one gRPC connection per broker.

Publications are sent on brokers depending on the stream and partition using consistent hashing. This should allow multiple publications towards to the same stream to be sent to the same broker, which should help with keeping message ordering. API calls (except subscriptions and cursors) are randomly distributed on the brokers. This doesn't use the Picker or Resolver gRPC interfaces yet.

Following this change the connection pool has been removed as it was no longer needed. The MaxConnsPerBroker and KeepAliveTime client options are not used anymore either.

Missing parts include retrying publication in case one of the brokers is not responding anymore or returns an non-critical error (like unavailable), and some comments. The naming of some functions could probably be improved, and I'm not completely satisfied by the ackReceived callback.

I am creating this PR as a draft for now. While the tests are passing I have not been able to do performance tests yet. I will update this PR when that has been done.

This fixes https://github.com/liftbridge-io/go-liftbridge/issues/99.

Jmgr commented 3 years ago

There seem to be a CI failure (a runtime panic I can't reproduce locally), will have a look.

Did a few tests this morning that have shown an issue with publishing/subscribing timeouts on high load (probably the unavailable issue mentioned earlier). Will have to fix that before I can evaluate any performance benefits.

Jmgr commented 3 years ago

There seem to be a CI failure (a runtime panic I can't reproduce locally), will have a look.

Fixed.