A refactoring of DruidQuery. Instead of specifying a type parameter and decoding the JSON right away, the new approach always returns a DruidResponse: one simple case class that holds the raw data and works for every Druid response. It takes care of where to find that data, since groupBy results are available under event whereas TopN and Timeseries results are available under result.
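To make the event/result distinction concrete, here is a minimal sketch of how such a uniform response could be built. It is not the code in this MR: the tiny Json model, DruidResult, and fromRows are hypothetical names chosen for illustration, and the real DruidResponse may be shaped differently.

```scala
object DruidResponseSketch {
  // Minimal JSON model; a hypothetical stand-in for a real JSON library's AST.
  sealed trait Json
  final case class JStr(value: String) extends Json
  final case class JNum(value: Double) extends Json
  final case class JArr(items: List[Json]) extends Json
  final case class JObj(fields: Map[String, Json]) extends Json

  // One row of a Druid reply: its timestamp (if present) plus the raw payload.
  final case class DruidResult(timestamp: Option[String], data: JObj)

  // Uniform container returned for every query type (assumed shape).
  final case class DruidResponse(results: List[DruidResult])

  // groupBy rows keep their payload under "event"; TopN and Timeseries rows
  // keep theirs under "result". An array of objects is flattened into rows.
  private def payloads(row: JObj): List[JObj] =
    row.fields.get("event").orElse(row.fields.get("result")) match {
      case Some(obj: JObj)   => List(obj)
      case Some(JArr(items)) => items.collect { case obj: JObj => obj }
      case _                 => Nil
    }

  // Normalise raw rows of any query type into one DruidResponse.
  def fromRows(rows: List[JObj]): DruidResponse =
    DruidResponse(
      for {
        row  <- rows
        data <- payloads(row)
      } yield DruidResult(
        row.fields.get("timestamp").collect { case JStr(ts) => ts },
        data
      )
    )
}
```

Callers then only deal with DruidResponse and can decode the data into their own types whenever they need to.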
The reason for this is that it makes it easier to run Druid queries in parallel. We can now do something like:
val future1 = TopNQuery(...).execute
val future2 = TopNQuery(...).execute
val future3 = TimeSeriesQuery(...).execute
val total: Future[List[DruidResponse]] = Future.sequence(List(future1, future2, future3))
This way we're not bitten by type erasure, which would otherwise leave us with a Future[Any] or something similarly obscure.
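The same pattern as a self-contained sketch that can be run without a Druid cluster. The execute function below is a hypothetical stub standing in for the real query execution, and the DruidResponse shown here is a simplified placeholder, not the class from this MR.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ParallelQueriesSketch {
  // Simplified placeholder for the uniform response type.
  final case class DruidResponse(rows: List[Map[String, String]])

  // Hypothetical stub: pretends to run a query and returns its name as data.
  def execute(queryName: String): Future[DruidResponse] =
    Future(DruidResponse(List(Map("query" -> queryName))))

  def runAll(): List[DruidResponse] = {
    // All three futures are started before any of them is awaited.
    val future1 = execute("topN-1")
    val future2 = execute("topN-2")
    val future3 = execute("timeseries")

    // Every element has the same type, so the combined type stays precise.
    val total: Future[List[DruidResponse]] =
      Future.sequence(List(future1, future2, future3))

    // Blocking is only for the sake of the example.
    Await.result(total, 5.seconds)
  }
}
```

Future.sequence keeps the results in the order of the input list, so each response can still be matched to the query that produced it.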
Introducing Circe
Next, we introduce Circe for JSON decoding and encoding. Circe is supposed to be faster (according to the benchmarks at https://github.com/circe/circe-benchmarks). Also, I noticed that Json4s, out of the box, marshalled all public properties of an object, not just the constructor parameters. In the logs I encountered quite a few fields in the JSON that do not belong in Druid query input.
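To illustrate the difference between constructor parameters and other public members, here is a small standard-library sketch. It shows neither Circe's nor Json4s's internals; ExampleQuery and constructorFields are hypothetical names, and productElementNames assumes Scala 2.13 or later.

```scala
object ConstructorFieldsSketch {
  // Hypothetical query type with two constructor parameters.
  final case class ExampleQuery(dimension: String, threshold: Int) {
    // Public member that is NOT a constructor parameter. A serializer that
    // walks all public properties would emit it; Druid does not expect it.
    val description: String = s"top $threshold by $dimension"
  }

  // Collects only the constructor parameters, via the Product view that
  // every case class provides (productElementNames needs Scala 2.13+).
  def constructorFields(p: Product): List[(String, Any)] =
    p.productElementNames.zip(p.productIterator).toList
}
```

An encoder that works from the constructor parameters alone would emit dimension and threshold but never description.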
Low-level HTTP client
This MR now uses an Akka Streams connectionFlow, which gives better performance. The old Http.singleRequest method regularly gave pool overflow errors (https://doc.akka.io/docs/akka-http/current/client-side/pool-overflow.html). Local test runs suggest the issue now occurs far less often.
In short, this MR contains three changes: removing the type parameter of DruidQuery, introducing Circe for JSON encoding and decoding, and switching to a low-level HTTP client.