ing-bank / scruid

Scala + Druid: Scruid. A library that allows you to compose queries in Scala, and parse the result back into typesafe classes.
Apache License 2.0
115 stars 29 forks source link

Proposal for client improvements #61

Closed anskarl closed 5 years ago

anskarl commented 5 years ago

The current implementation of DruidClient (Druid v2.1.1) uses the connection-level client-side API (outgoingConnectionHttp and outgoingConnectionHttps). There are several limitations with the current implementation, most important ones are described below:

I would like to propose an internal API change and an additional implementation of client with advanced features:

  1. The API changes affect only the internals of the library and will not change the API usage of projects using Scruid. The only changes that are visible to other projects are related with the configuration file (application.conf) and can be migrated easily.
  2. Implementation of an additional client, named as DruidAdvancedHttpClient, to address the aforementioned issues.
  3. The original implementation of the Druid client will remain in the project and can be selected from the configuration (perhaps remain as the default client option).

Main features of the Advanced Http client

  1. Uses Akka HTTP to manage cached connection-pool to Druid host, therefore we can control the number of active connections to Druid, as well as to reuse the connections.
  2. Uses a queue to issue requests to the connection-pool and receive responses when they are available (based on Akka HTTP - Using the host-level API with a queue). By having a queue in front of a cached connection-pool, we can decide what to do when the queue overflows (Backpressure, Fail, DropHead, etc., for details see OverflowStrategy in Akka documentation)
  3. Can connect to more than one Druid Broker nodes and load balance the requests. In that case, the client creates a separate cached connection-pool per Broker and adds a Balancer in front of them.

Proposed changes

Tests

To demonstrate and test the new implementation I also added the following:

The testing environment can be setup as below:

docker-compose run --rm wait_for_druid && docker-compose up -d chaos_proxies

It is a work in progress, but all the proposed features are complete (I would like to write docs and do some code cleanups). The complete implementation of can be found in https://github.com/anskarl/scruid/commit/6685179dbc9b197cb329b593ef8663a13604b15d

I am looking forward to your comments :)

krisgeus commented 5 years ago

Thanks for the efforts. Great to have support for HA Druid added. Awaiting the PR so we can review the code changes

Fokko commented 5 years ago

Thanks @anskarl

Looks very good and thank you for the extensive proposal. I'm also fine with merging DruidHttpClient and DruidAdvancedHttpClient. This will make maintenance easier, the only issue here will be to come up with some general purpose configuration.

anskarl commented 5 years ago

Compared to the initial POC, I have performed the following changes:

I will make a PR to review the changes.

I am not sure if it is ideal to keep only one client (at least for the next release). Regarding the configuration, I think that it may need some additional changes. For example hosts parameter may move into client configuration and therefore only the DruidAdvancedHttpClient will accept multiple hosts, while the current one will accept only a single host parameter.

anskarl commented 5 years ago

I am closing the issue, since the corresponding PR has been merged.