apache / druid

Apache Druid: a high performance real-time analytics database.
https://druid.apache.org/
Apache License 2.0

0.15.0-incubating release notes #7854

Closed jihoonson closed 5 years ago

jihoonson commented 5 years ago

Apache Druid 0.15.0-incubating contains over 250 new features, performance/stability/documentation improvements, and bug fixes from 39 contributors. Major new features and improvements are described in the Highlights section below.

The full list of changes is here: https://github.com/apache/incubator-druid/pulls?q=is%3Apr+is%3Aclosed+milestone%3A0.15.0

Documentation for this release is at: http://druid.apache.org/docs/0.15.0-incubating/

Highlights

New Data Loader UI (Batch indexing part)

Druid has a new Data Loader UI, integrated with the Druid Console. It shows sampled data so that you can easily verify the ingestion spec, and it generates the final ingestion spec automatically. The goal is to let users issue batch index tasks without writing a JSON spec by hand.

Added by @vogievetsky and @dclim in https://github.com/apache/incubator-druid/pull/7572 and https://github.com/apache/incubator-druid/pull/7531, respectively.

Support Kafka Transactional Topics

The Kafka indexing service now supports Kafka Transactional Topics.

Please note that after this change, only Kafka 0.11.0 or later is supported.

Added by @surekhasaharan in https://github.com/apache/incubator-druid/pull/6496.

New Moving Average Query

A new query type was introduced to compute moving averages.

Please see http://druid.apache.org/docs/0.15.0-incubating/development/extensions-contrib/moving-average-query.html for more details.

Added by @yurmix in https://github.com/apache/incubator-druid/pull/6430.

Time Ordering for Scan Query

The Scan query type now supports time ordering. Please see http://druid.apache.org/docs/0.15.0-incubating/querying/scan-query.html#time-ordering for more details.

Added by @justinborromeo in https://github.com/apache/incubator-druid/pull/7133.
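As a rough illustration of the time-ordering option, the sketch below builds a native Scan query with an `order` field. The datasource name, interval, and column names are hypothetical, and the helper `build_scan_query` is not part of Druid. Please check the linked scan-query documentation for the exact fields and for any limits on when ordering is allowed.

```python
import json


def build_scan_query(datasource, interval, columns, order="ascending", limit=100):
    """Build a native Scan query dict that asks for time-ordered rows.

    Assumption: "order" accepts "ascending", "descending", or "none",
    as described in the scan-query documentation linked above.
    """
    if order not in ("ascending", "descending", "none"):
        raise ValueError(f"unsupported order: {order}")
    return {
        "queryType": "scan",
        "dataSource": datasource,
        "intervals": [interval],
        "columns": list(columns),
        "order": order,
        "limit": limit,
    }


# Hypothetical datasource and interval, for illustration only.
query = build_scan_query("wikipedia", "2019-06-01/2019-06-02", ["__time", "page"])
print(json.dumps(query, indent=2))
```

The resulting JSON would be sent as the body of a native query request.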

New Moments Sketch Aggregator

The Moments Sketch is a new sketch type for approximate quantile computation. Please see http://druid.apache.org/docs/0.15.0-incubating/development/extensions-contrib/momentsketch-quantiles.html for more details.

Added by @edgan8 in https://github.com/apache/incubator-druid/pull/6581.

SQL enhancements

The Druid community has been working to enhance SQL support, and Druid SQL is no longer experimental.

New SQL functions

Autocomplete in Druid Console

Druid Console now supports autocomplete for SQL.

Added by @shuqi7 in https://github.com/apache/incubator-druid/pull/7244.

Time-ordered scan support for SQL

Druid SQL now supports time-ordered scan queries.

Added by @justinborromeo in https://github.com/apache/incubator-druid/pull/7373.
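A minimal sketch of what a time-ordered SQL scan could look like when sent to the Druid SQL HTTP endpoint. The Broker address, table name, and column names are assumptions, and `build_sql_request` is a hypothetical helper. The code only builds the request and does not send it.

```python
import json
import urllib.request


def build_sql_request(broker_url, sql):
    """Build (but do not send) a POST request for the Druid SQL endpoint."""
    body = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        url=f"{broker_url.rstrip('/')}/druid/v2/sql/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical Broker address and table; ORDER BY __time asks for time-ordered rows.
request = build_sql_request(
    "http://localhost:8082",
    "SELECT __time, page FROM wikipedia ORDER BY __time DESC LIMIT 10",
)
print(request.get_method(), request.full_url)
```

To run the query against a live cluster, the request could be passed to `urllib.request.urlopen`.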

Lookups view added to the web console

You can now configure your lookups from the web console directly.

Added by @shuqi7 in https://github.com/apache/incubator-druid/pull/7259.

Misc web console improvements

"NoSQL" mode : https://github.com/apache/incubator-druid/pull/7493 [@shuqi7]

The web console now has a backup mode that allows it to function as well as it can if DruidSQL is disabled or unavailable.

Added compaction configuration dialog : https://github.com/apache/incubator-druid/pull/7242 [@shuqi7]

You can now configure the auto compaction settings for a data source from the Datasource view.

Auto wrap query with limit : https://github.com/apache/incubator-druid/pull/7449 [@vogievetsky]

The console query view will now (by default) wrap DruidSQL queries with a SELECT * FROM (...) LIMIT 1000 allowing you to enter queries like SELECT * FROM your_table without worrying about the impact to the cluster. You can still send 'raw' queries by selecting the option from the ... menu.
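The wrapping described above can be sketched as a small string transformation. This is only an illustration of the resulting query shape, not the console's actual implementation, and `wrap_with_limit` is a hypothetical name.

```python
def wrap_with_limit(sql, limit=1000):
    """Wrap a SQL query in an outer SELECT with a LIMIT.

    A trailing semicolon is dropped so the query can be used as a subquery.
    """
    inner = sql.strip().rstrip(";").strip()
    return f"SELECT * FROM ({inner}) LIMIT {int(limit)}"


wrapped = wrap_with_limit("SELECT * FROM your_table;")
print(wrapped)  # SELECT * FROM (SELECT * FROM your_table) LIMIT 1000
```

Because the limit sits on the outer query, the inner query text stays exactly as the user typed it.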

SQL explain query : https://github.com/apache/incubator-druid/pull/7402 [@shuqi7]

You can now click on the ... menu in the query view to get an explanation of the DruidSQL query.

Surface is_overshadowed as a column in the segments table https://github.com/apache/incubator-druid/pull/7555 , https://github.com/apache/incubator-druid/pull/7425 [@shuqi7][@surekhasaharan]

The is_overshadowed column indicates whether a segment is overshadowed by any published segments. It can be useful for seeing which segments should be loaded by historicals. Please see http://druid.apache.org/docs/0.15.0-incubating/querying/sql.html for more details.
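As a hedged example, the helper below builds a Druid SQL statement that lists overshadowed segments for one datasource. The `segment_id` and `datasource` column names and the 1/0 encoding of `is_overshadowed` are assumptions to verify against the linked SQL documentation. The function name and the datasource are hypothetical.

```python
def overshadowed_segments_sql(datasource):
    """Return SQL that selects overshadowed segments from the segments system table.

    Assumption: is_overshadowed is stored as 1 (true) or 0 (false).
    """
    # Escape single quotes so the name is safe inside a SQL string literal.
    escaped = datasource.replace("'", "''")
    return (
        "SELECT segment_id, is_overshadowed FROM sys.segments "
        f"WHERE datasource = '{escaped}' AND is_overshadowed = 1"
    )


sql = overshadowed_segments_sql("wikipedia")
print(sql)
```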

Improved status UI for actions on tasks, supervisors, and datasources : https://github.com/apache/incubator-druid/pull/7528 [@shuqi7]

This PR condenses the actions list into a tidy menu and lets you see the detailed status for supervisors and tasks. New actions for datasources around loading and dropping data by interval have also been added.

Light Lookup Module for Routers

A light lookup module was introduced for Routers, so they now need only a minimal amount of memory. Please see http://druid.apache.org/docs/0.15.0-incubating/operations/basic-cluster-tuning.html#router for basic memory tuning.

Added by @clintropolis in https://github.com/apache/incubator-druid/pull/7222.

Core ORC extension

The ORC extension has been promoted to a core extension. Please read the 'Updating from 0.14.0-incubating and earlier' section below if you are using the ORC extension in an earlier version of Druid.

Added by @clintropolis in https://github.com/apache/incubator-druid/pull/7138.

Core GCP extension

The GCP extension has been promoted to a core extension. Please read the 'Updating from 0.14.0-incubating and earlier' section below if you are using the GCP extension in an earlier version of Druid.

Added by @drcrallen in https://github.com/apache/incubator-druid/pull/6953.

Documentation Improvements

Single-machine deployment example configurations and scripts

Several configurations and scripts were added for easy single-machine setup. Please see http://druid.apache.org/docs/0.15.0-incubating/operations/single-server.html for details.

Added by @jon-wei in https://github.com/apache/incubator-druid/pull/7590.

Tool for migrating from local deep storage/Derby metadata

A new tool was added for easy migration from a single-machine setup to a cluster environment. Please see http://druid.apache.org/docs/0.15.0-incubating/operations/deep-storage-migration.html for details.

Added by @jon-wei in https://github.com/apache/incubator-druid/pull/7598.

Basic tuning guide

A basic tuning guide was added. Please see http://druid.apache.org/docs/0.15.0-incubating/operations/basic-cluster-tuning.html for details.

Added by @jon-wei in https://github.com/apache/incubator-druid/pull/7629.

Security Improvement

The Druid system table now requires only mandatory permissions instead of the read permission for the whole sys database. Please see http://druid.apache.org/docs/0.15.0-incubating/development/extensions-core/druid-basic-security.html for details.

Added by @jon-wei in https://github.com/apache/incubator-druid/pull/7579.

Deprecated/removed

Drop support for automatic segment merge

The automatic segment merge by the coordinator is not supported anymore. Please use auto compaction instead.

Added by @jihoonson in #6883.

Drop support for insert-segment-to-db tool

In Druid 0.14.x and earlier, Druid stores segment metadata (the descriptor.json file) in deep storage in addition to the metadata store. This behavior has changed in 0.15.0, which no longer stores the segment metadata file in deep storage. As a result, the insert-segment-to-db tool is no longer supported, since it relies on the descriptor.json files in deep storage. Please see http://druid.apache.org/docs/0.15.0-incubating/operations/insert-segment-db.html for details.

Please note that the kill task in 0.14.x or earlier versions will fail if you're using HDFS as deep storage and the descriptor.json file is missing.

Added by @jihoonson in https://github.com/apache/incubator-druid/pull/6911.

Removed "useFallback" configuration for SQL

This option was removed since it generates unscalable query plans and doesn't work with some SQL functions.

Added by @gianm in https://github.com/apache/incubator-druid/pull/7567.

Removed a public API in CompressionUtils for extension developers

public static void gunzip(File pulledFile, File outDir) was removed in https://github.com/apache/incubator-druid/pull/6908 by @clintropolis.

Other behavior changes

Coordinator awaits initialization before finishing startup

A new configuration (druid.coordinator.segment.awaitInitializationOnStart) was added to make the Coordinator wait for segment view initialization. This option is enabled by default.

Added by @QiuMM in https://github.com/apache/incubator-druid/pull/6847.

Coordinator API behavior change

The coordinator periodically polls segment metadata from the metadata store and caches it in memory. In Druid 0.14.x and earlier, removing segments via the coordinator APIs (/druid/coordinator/v1/datasources/{dataSourceName} and /druid/coordinator/v1/datasources/{dataSourceName}/segments/{segmentId}) immediately updates the in-memory segment cache as well as the metadata store. This behavior has changed in 0.15.0: the cache is now updated on each poll rather than immediately on removal. The APIs below can return segments that were removed via the above API calls until the cache is updated by the next poll.

The metrics below can also include segments that were removed via the above API calls until the cache is updated by the next poll.

This behavior was changed in https://github.com/apache/incubator-druid/pull/7595 by @surekhasaharan.
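The effect of this change can be illustrated with a toy model. It is not Druid code. The class and method names are hypothetical, and it only mimics the described behavior, where a removal updates the metadata store at once and the in-memory cache catches up on the next poll.

```python
class SegmentCacheModel:
    """Toy model of a coordinator whose segment cache is refreshed only on poll."""

    def __init__(self, segments):
        self.metadata_store = set(segments)  # source of truth
        self.cache = set(segments)           # in-memory snapshot served to readers

    def remove_segment(self, segment_id):
        # 0.15.0 behavior as described above: only the metadata store changes.
        self.metadata_store.discard(segment_id)

    def poll(self):
        # The periodic poll replaces the snapshot with the current store contents.
        self.cache = set(self.metadata_store)

    def list_segments(self):
        return sorted(self.cache)


model = SegmentCacheModel(["seg_a", "seg_b"])
model.remove_segment("seg_a")
before = model.list_segments()  # still includes the removed segment
model.poll()
after = model.list_segments()   # removal is now visible
print(before, after)  # ['seg_a', 'seg_b'] ['seg_b']
```

Clients that remove a segment and then read it back may therefore need to tolerate a stale answer until the next poll.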

Listing Lookup API change

The /druid/coordinator/v1/lookups/config API now returns a list of tiers currently active in the cluster in addition to ones known in the dynamic configuration.

Added by @clintropolis in https://github.com/apache/incubator-druid/pull/7647.

ZooKeeper loss

With a new configuration (druid.zk.service.terminateDruidProcessOnConnectFail), Druid processes can terminate themselves on disconnection from ZooKeeper.

Added by @michael-trelinski in https://github.com/apache/incubator-druid/pull/6740.

Updating from 0.14.0-incubating and earlier

Minimum compatible Kafka version change for Kafka Indexing Service

After https://github.com/apache/incubator-druid/pull/6496, only Kafka 0.11.x or later is supported. Please consider updating your Kafka version if you're using an older one.

ORC extension changes

The ORC extension has been promoted to a core extension. When deploying 0.15.0-incubating, please ensure that your extensions-contrib directory does not have any older versions of the druid-orc-extensions extension.

Additionally, even though the new core extension can index any data the old contrib extension could, the JSON spec for the ingestion task is incompatible and will need to be modified to work with the newer core extension.

To migrate to 0.15.0-incubating:

For more details and examples, please see http://druid.apache.org/docs/0.15.0-incubating/development/extensions-core/orc.html.

GCP extension changes

The GCP extension has been promoted to a core extension. When deploying 0.15.0-incubating, please ensure that your extensions-contrib directory does not have any older versions of the druid-google-extensions extension.

Dropped auto segment merge

The coordinator configuration for auto segment merge (druid.coordinator.merge.on) is not supported anymore. Please use auto compaction instead.

Removed descriptor.json metadata file in deep storage

The segment metadata file (descriptor.json) is not stored in deep storage any more. If you are using HDFS as your deep storage and need to roll back to 0.14.x or earlier, then please consider that the kill task could fail because of the missing descriptor.json files.

Credits

Thanks to everyone who contributed to this release!

@a2l007 @asdf2014 @capistrant @clintropolis @dampcake @dclim @donbowman @drcrallen @Dylan1312 @edgan8 @es1220 @esevastyanov @FaxianZhao @fjy @gianm @glasser @hpandeycodeit @jihoonson @jon-wei @jorbay-au @justinborromeo @kamaci @KazuhitoT @leventov @lxqfy @michael-trelinski @peferron @puneetjaiswal @QiuMM @richardstartin @samarthjain @scrawfor @shuqi7 @surekhasaharan @venkatramanp @vogievetsky @xueyumusic @xvrl @yurmix

jason-heo commented 5 years ago

@jihoonson

Hello. I've found an invalid link in the "Drop support for insert-segment-to-db tool" section.

http://druid.apache.org/docs/0.15.0-incubating/operations/insert-segment-db.html must be http://druid.apache.org/docs/0.15.0-incubating/operations/insert-segment-to-db.html