apache / druid

Apache Druid: a high performance real-time analytics database.
https://druid.apache.org/
Apache License 2.0

Druid 0.11.0 release notes #4876

Closed jon-wei closed 6 years ago

jon-wei commented 7 years ago

DRAFT

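Druid 0.11.0 contains over a hundred performance improvements, stability improvements, and bug fixes from almost 40 contributors. This release adds two major security features: TLS support and extension points for authentication and authorization.

Other major new features include Double column support, the cachingCost balancer strategy, jq expression support in the JSON parser, a Redis cache extension, GroupBy performance improvements, and Druid SQL improvements.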

The full list of changes is here: https://github.com/druid-io/druid/pulls?utf8=%E2%9C%93&q=is%3Apr%20is%3Aclosed%20milestone%3A0.11.0

Documentation for this release is at: http://druid.io/docs/0.11.0/

Highlights

TLS support

Druid now supports TLS, enabling encrypted client and inter-node communications. Please see http://druid.io/docs/0.11.0-rc2/operations/tls-support.html for details on configuration and related extensions.

Added by @pjain1 in https://github.com/druid-io/druid/pull/4270.

Authentication/authorization extension points

Extension points for authenticating and authorizing requests have been added to Druid. Please see http://druid.io/docs/0.11.0-rc2/configuration/auth.html for information on configuration and extension implementation.

The existing Kerberos authentication extension has been updated to implement the new Authenticator interface. If you are using the Kerberos extension, please see the "Kerberos configuration changes" section under "Updating from 0.10.1 and earlier" for more information.

Added by @jon-wei in https://github.com/druid-io/druid/pull/4271

Double columns support

Druid now supports Double type aggregator columns. Please see http://druid.io/docs/0.11.0-rc1/querying/aggregations.html for documentation on the new Double aggregators.

Added by @b-slim in https://github.com/druid-io/druid/pull/4491.

cachingCost Balancer Strategy

Users upgrading to 0.11.0 are encouraged to try the new cachingCost segment balancing strategy on their coordinators. This strategy offers large performance improvements over the existing cost balancer strategy, and it is planned to become the default strategy in the release following 0.11.0.

This strategy can be selected by setting the following property on coordinators:

druid.coordinator.balancer.strategy=cachingCost

Added by @dgolitsyn in https://github.com/druid-io/druid/pull/4731

jq expression support in JSON parser

Druid's JSON input parser now supports jq expressions using jackson-jq, enabling more input transforms before ingestion. Please see http://druid.io/docs/0.11.0-rc2/ingestion/flatten-json.html for more details.

Added by @knoguchi in https://github.com/druid-io/druid/pull/4171.

Redis cache extension

A new cache implementation backed by Redis is now available as an extension. Added by @QiuMM in https://github.com/druid-io/druid/pull/4615. Please refer to that pull request for more details.

GroupBy performance improvements

Several new performance optimizations have been added to the GroupBy query by @jihoonson in the following PRs:

https://github.com/druid-io/druid/pull/4660 Parallel sort for ConcurrentGrouper
https://github.com/druid-io/druid/pull/4576 Array-based aggregation for groupBy query
https://github.com/druid-io/druid/pull/4668 Add IntGrouper to avoid unnecessary boxing/unboxing in array-based aggregation

PR #4660 offers a general improvement by parallelizing partial result sorting, while PRs #4576 and #4668 offer significant improvements when grouping on a single String column.
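To illustrate why array-based aggregation can help when grouping on a single String column, here is a minimal conceptual sketch in Python. It assumes the column is dictionary-encoded, so each row carries a small integer id that can index directly into a preallocated array instead of going through a hash table. The function names and sample data are hypothetical, and this is not Druid's actual implementation.

```python
from collections import defaultdict


def hash_group_sum(keys, values):
    """Baseline: group by key through a hash table."""
    totals = defaultdict(float)
    for key, value in zip(keys, values):
        totals[key] += value
    return dict(totals)


def array_group_sum(dictionary, ids, values):
    """Array-based grouping: each dictionary id is used as an array index."""
    totals = [0.0] * len(dictionary)
    seen = [False] * len(dictionary)
    for dict_id, value in zip(ids, values):
        totals[dict_id] += value
        seen[dict_id] = True
    # Only report groups that actually appeared in the input rows.
    return {dictionary[i]: totals[i] for i in range(len(dictionary)) if seen[i]}


# Hypothetical dictionary-encoded column and metric values.
dictionary = ["US", "FR", "JP"]
ids = [0, 2, 0, 1, 2, 0]
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

result = array_group_sum(dictionary, ids, values)
```

Both functions produce the same groups; the array-based version avoids hashing and key comparison on every row, which is the kind of saving the PRs above target.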

SQL improvements

Various improvements and features have been added to Druid SQL, by @gianm in the following PRs:

https://github.com/druid-io/druid/pull/4750 - TRIM support
https://github.com/druid-io/druid/pull/4720 - Rounding for count distinct
https://github.com/druid-io/druid/pull/4561 - Metrics for SQL queries
https://github.com/druid-io/druid/pull/4360 - SQL expressions support

And much more!

The full list of changes is here: https://github.com/druid-io/druid/pulls?utf8=%E2%9C%93&q=is%3Apr%20is%3Aclosed%20milestone%3A0.11.0

Updating from 0.10.1 and earlier

Please see below for changes between 0.10.1 and 0.11.0 that you should be aware of before upgrading. If you're updating from an earlier version than 0.10.1, please see release notes of the relevant intermediate versions for additional notes.

Upgrading coordinators and overlords

The following patch changes the way coordinator->overlord redirects are handled: https://github.com/druid-io/druid/pull/5037

The overlord leader election algorithm has changed in 0.11.0: https://github.com/druid-io/druid/pull/4699.

As a result of the two patches above, special care is needed when upgrading Coordinator or Overlord to 0.11.0. All coordinators and overlords must be shut down and upgraded together.

For example, to upgrade Coordinators, you would shut down all coordinators, upgrade them to 0.11.0, and then start them. Overlords should be upgraded in a similar way.

During the upgrade process, there must not be any time period where a non-0.11.0 coordinator or overlord is running simultaneously with a 0.11.0 coordinator or overlord.

Note that at least one overlord should be brought up as quickly as possible after shutting them all down, so that peons, Tranquility, etc. continue to work after some retries.

Also note that the druid.zk.paths.indexer.leaderLatchPath property is no longer used.

Service name changes

In earlier versions of Druid, / characters in service names defined by druid.service would be replaced by : characters because these service names were used in Zookeeper paths. Druid 0.11.0 no longer performs these character replacements.

Example 1: If the old configuration had a broker with the service name test/broker:

druid.service=test/broker

and a Router was configured assuming that / would be replaced with : in the broker service name:

druid.router.tierToBrokerMap={"hot":"test:broker","_default_tier":"test:broker"}

then the Router configuration should be updated to remove that assumption:

druid.router.tierToBrokerMap={"hot":"test/broker","_default_tier":"test/broker"}

Example 2: If the old configuration had an overlord with the service name test/overlord, then the value of druid.coordinator.asOverlord.overlordService or druid.selectors.indexing.serviceName should be test/overlord and not test:overlord.

Example 3: If the old configuration had an overlord with the service name test:overlord, then the value of druid.coordinator.asOverlord.overlordService or druid.selectors.indexing.serviceName should be test:overlord and not test/overlord.

The following service name-related configurations are also affected and should be updated to exactly match the value of the druid.service property on the node being discovered:

druid.coordinator.asOverlord.overlordService
druid.selectors.coordinator.serviceName
druid.selectors.indexing.serviceName
druid.router.defaultBrokerServiceName
druid.router.coordinatorServiceName
druid.router.tierToBrokerMap

Please see https://github.com/druid-io/druid/issues/4992 for more details.
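As a rough aid for auditing a configuration against the service name change, the following Python sketch models the old replacement rule and flags Router tier mappings that no longer match any broker's druid.service value exactly. The helper names are hypothetical and the logic is only an illustration of the rule described in this section, not a Druid tool.

```python
def legacy_service_name(name):
    """Pre-0.11.0 behavior described above: '/' was replaced with ':'."""
    return name.replace("/", ":")


def find_mismatches(broker_services, tier_to_broker_map):
    """Return the tiers whose broker name is not an exact druid.service value."""
    return {
        tier: broker
        for tier, broker in tier_to_broker_map.items()
        if broker not in broker_services
    }


# Values taken from Example 1 in this section.
broker_services = {"test/broker"}
old_map = {"hot": "test:broker", "_default_tier": "test:broker"}
new_map = {"hot": "test/broker", "_default_tier": "test/broker"}

stale = find_mismatches(broker_services, old_map)
clean = find_mismatches(broker_services, new_map)
```

Here `stale` contains both tiers from the old map, while `clean` is empty, which is the state to aim for before starting 0.11.0 nodes.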

Kerberos configuration changes

The Kerberos authentication configuration format has changed as a result of the new interfaces introduced by #4271. Please refer to http://druid.io/docs/0.11.0-rc2/development/extensions-core/druid-kerberos.html for the new configuration properties.

Users can point the Kerberos authenticator's authorizerName to an instance of an "allowAll" authorizer to replicate the pre-0.11.0 behavior of a cluster using Kerberos authentication with no authorization.

Lookups API path changes

The paths for the lookups configuration API have changed due to #5058.

Configuration paths that had the form /druid/coordinator/v1/lookups now have the form /druid/coordinator/v1/lookups/config.

Please see http://druid.io/docs/0.11.0-rc2/querying/lookups.html for the current API.
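For scripts that call the lookups configuration API, the path change can be handled with a small rewrite step. The sketch below is a minimal Python illustration; the function name is hypothetical, the tier and lookup names in the usage are made up, and it assumes no pre-0.11.0 tier was literally named "config" (such a path would be indistinguishable from an already-migrated one).

```python
OLD_PREFIX = "/druid/coordinator/v1/lookups"
NEW_PREFIX = "/druid/coordinator/v1/lookups/config"


def migrate_lookup_config_path(path):
    """Rewrite a pre-0.11.0 lookups configuration path to the 0.11.0 form."""
    # Already in the new form: leave untouched so the function is idempotent.
    if path == NEW_PREFIX or path.startswith(NEW_PREFIX + "/"):
        return path
    # Old form: insert the "/config" segment after the old prefix.
    if path == OLD_PREFIX or path.startswith(OLD_PREFIX + "/"):
        return NEW_PREFIX + path[len(OLD_PREFIX):]
    # Anything else is not a lookups configuration path.
    return path


# Hypothetical tier and lookup names, for illustration only.
migrated = migrate_lookup_config_path("/druid/coordinator/v1/lookups/myTier/myLookup")
```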

Migrating to Double columns

Prior to 0.11.0, the Double* aggregators would store column values on disk as Float while performing aggregations using Double representations.

PR #4491 allows the Double aggregators to store column values on disk as Doubles. Due to concerns related to rolling updates and version downgrades, this behavior is disabled by default and Druid will continue to store Double aggregators on disk as floats.

To enable Double column storage, set the following property in the common runtime properties:

druid.indexing.doubleStorage=double

Users should not set this property during an initial rolling upgrade to 0.11.0, as any nodes running pre-0.11.0 Druid will not be able to handle Double columns created during the upgrade period. Users will also need to reindex any segments with Double columns if downgrading from 0.11.0 to an older version. Please see #4944 and #4605 for more information.
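The practical difference between Float and Double storage is precision. The following Python sketch, which uses only the standard struct module and is independent of Druid, round-trips a value through 32-bit and 64-bit IEEE 754 encodings to show what is lost when a double is stored as a float. The sample value is chosen for illustration.

```python
import struct


def round_trip_float32(x):
    """Encode x as a 32-bit float and decode it again."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def round_trip_float64(x):
    """Encode x as a 64-bit double and decode it again."""
    return struct.unpack("<d", struct.pack("<d", x))[0]


# 2**24 + 1 is the smallest positive integer a 32-bit float cannot represent.
value = 16777217.0
as_float = round_trip_float32(value)    # 16777216.0: the low bit is lost
as_double = round_trip_float64(value)   # 16777217.0: preserved exactly
```

With Float storage, a Double aggregator can still combine values at double precision during a query, but each stored value has already been rounded in this way.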

Scan query changes

The Scan query has been moved from extensions-contrib to core Druid. As part of this migration (https://github.com/druid-io/druid/pull/4751), the Scan query's handling of the time column has changed.

The time column is now returned as "__time" rather than "timestamp", it is no longer included unless you specifically ask for it in your "columns", and it is returned as a long rather than a string.

Users can revert the Scan query's time handling to the legacy extension behavior by setting "legacy" : true in their queries, or setting the property druid.query.scan.legacy = true. This is meant to provide a migration path for users that were formerly using the contrib extension.
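As an illustration of the two modes, the Python sketch below builds a Scan query body as a dictionary. It assumes "legacy" is a top-level field of the query, as the wording above suggests; the helper name, the data source "wikipedia", the interval, and the "page" column are hypothetical placeholders.

```python
import json


def build_scan_query(data_source, intervals, columns, legacy=False):
    """Build a Scan query body; set legacy=True for the old time handling."""
    query = {
        "queryType": "scan",
        "dataSource": data_source,
        "intervals": intervals,
        "columns": columns,
    }
    if legacy:
        # Opt back into the contrib extension's time column behavior.
        query["legacy"] = True
    return query


# New behavior: the time column must be requested explicitly as "__time".
new_style = build_scan_query(
    "wikipedia", ["2017-01-01/2017-01-02"], ["__time", "page"]
)

# Legacy behavior for clients written against the contrib extension.
legacy_style = build_scan_query(
    "wikipedia", ["2017-01-01/2017-01-02"], ["page"], legacy=True
)

payload = json.dumps(new_style)
```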

Extension Interface Changes

Aggregator double column support

The Aggregator interface has gained a getDouble() method, which defaults to casting the result of getFloat(). The getDouble() method should be re-implemented for any custom aggregators that can support doubles.

See https://github.com/druid-io/druid/pull/4595 for more details.

QueryRunner interface change

The QueryRunner interface has changed and the old run() method has been removed, replaced by a new method that accepts a QueryPlus object.

Custom query extensions will need to implement the new interface.

Please see https://github.com/druid-io/druid/pull/4184 and https://github.com/druid-io/druid/pull/4482 for more details.

Filter interface change

The Filter.getBitmapResult() method no longer has a default implementation: https://github.com/druid-io/druid/pull/4481

Custom filter extensions must now provide their own implementation of getBitmapResult().

Other Notes

jvm/gc/time metric

The jvm/gc/time metric is no longer emitted. It has been replaced by a new metric named jvm/gc/cpu, for the reasons described here: https://github.com/druid-io/druid/pull/4480

Default worker select strategy

Please note that the default worker select strategy has changed from fillCapacity to equalDistribution. This change was introduced in 0.10.1, the previous release, but was not mentioned in the 0.10.1 release notes, so it is called out again here.
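To give a feel for how the two strategies can differ, here is a simplified Python model. It assumes fillCapacity prefers the busiest worker that still has a free slot, and equalDistribution prefers the worker with the most free slots; these semantics, the function names, and the worker data are assumptions made for illustration, not Druid's actual code.

```python
def fill_capacity(workers):
    """Pick the busiest worker that still has a free slot (assumed semantics)."""
    candidates = [w for w in workers if w["used"] < w["capacity"]]
    if not candidates:
        return None
    return max(candidates, key=lambda w: w["used"])["host"]


def equal_distribution(workers):
    """Pick the worker with the most free slots (assumed semantics)."""
    candidates = [w for w in workers if w["used"] < w["capacity"]]
    if not candidates:
        return None
    return max(candidates, key=lambda w: w["capacity"] - w["used"])["host"]


# Hypothetical cluster state.
workers = [
    {"host": "a", "capacity": 4, "used": 3},
    {"host": "b", "capacity": 4, "used": 1},
    {"host": "c", "capacity": 4, "used": 4},
]

packed_choice = fill_capacity(workers)
spread_choice = equal_distribution(workers)
```

In this model the next task goes to worker "a" under fillCapacity and to worker "b" under equalDistribution, so clusters that relied on tasks being packed onto few workers may see a different task placement after the default changed.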

V8 segment creation removed

Druid will now always build V9 segments. Creating V8 segments is no longer supported, and the buildV9Directly property for ingestion tasks has been removed.

Please see https://github.com/druid-io/druid/pull/4420 for more details.

LogLevelAdjuster removed

Please note that the LogLevelAdjuster has been removed: https://github.com/druid-io/druid/pull/4236

Users who configure log levels through MBeans should configure log4j2 using JMX instead.

Credits

Thanks to everyone who contributed to this release!

@a2l007 @akashdw @andy256 @asifmansoora @b-slim @benvogan @blugowski @chrisgavin @dclim @dgolitsyn @drcrallen @egor-ryashin @erikdubbelboer @Fokko @fuji-151a @gaodayue @gianm @ginoledesma @himanshug @hzy001 @jihoonson @jon-wei @kevinconaway @knoguchi @leiwangx @leventov @michalmisiewicz @niketh @pjain1 @praveev @QiuMM @scan-the-automator @solimant @SpotXPeterCunningham @tkyaw @wywlds @xanec @yuusaku-t @zhangxinyu1

gianm commented 7 years ago

Some suggestions:

leventov commented 7 years ago

Suggested to mention the new balancer strategy as druid.coordinator.balancer.strategy=cachingCost, because otherwise it's not very easy to find in docs how to configure it

gianm commented 7 years ago

Btw, for "Authentication/authorization extension points" I think it'd be good to add language making it clear that this version does include one implementation (Kerberos) with a link to instructions on how to use that one.

Igosuki commented 7 years ago

Hey guys, for us one of the biggest niceties from this version is the upgraded core extensions such as this one https://github.com/druid-io/druid/pull/4832 I think it's worth mentioning since many people actually have nested schemas.

SpotXPeterCunningham commented 7 years ago

I second @Igosuki. The nested Avro support is excellent for us and performs really well, we would be stuck without it. Will it make it into 11.0?

gianm commented 7 years ago

I think #4832 didn't make it to 0.11.0, but will be in 0.11.1 (or whatever the version after 0.11.0 is).

l15k4 commented 7 years ago

If anyone is interested in Prometheus integration, it will be possible with Graphite Emitter https://github.com/druid-io/druid/pull/4265 plain text protocol support.

Then changing port from graphite 2004 to 9109 of https://github.com/prometheus/graphite_exporter which makes it available for prometheus...

himanshug commented 7 years ago

@jon-wei @leventov

I see mention of removal of druid.segmentCache.numLoadingThreads config. It was removed because it wasn't ever utilized and batch load/drops were not supported. However this config is brought back in 0.11.1 with https://github.com/druid-io/druid/pull/4966 that introduces batch support for http based segment management.

I think we can omit any information related to this config in 0.11.0 release notes as users aren't required to take any action. We can put some notes about this config in next release and its intended behavior as documented in https://github.com/druid-io/druid/pull/4997.

jon-wei commented 7 years ago

@himanshug thanks, I removed that section