apache / druid

Apache Druid: a high performance real-time analytics database.
https://druid.apache.org/
Apache License 2.0

[DRAFT] 30.0.0 release notes #16505

Closed adarshsanjeev closed 2 weeks ago

adarshsanjeev commented 1 month ago

Apache Druid 30.0.0 contains over 407 new features, bug fixes, performance enhancements, documentation improvements, and additional test coverage from 50 contributors.

See the complete set of changes for additional details, including bug fixes.

Review the upgrade notes and incompatible changes before you upgrade to Druid 30.0.0. If you are upgrading across multiple versions, see the Upgrade notes page, which lists upgrade notes for the most recent Druid versions.

# Upcoming removals

As part of the continued improvements to Druid, we are deprecating certain features and behaviors in favor of newer iterations that offer more robust features and are more aligned with standard ANSI SQL. Many of these new features have been the default for new deployments for several releases.

The following features are deprecated, and we currently plan to remove support in Druid 32.0.0:

# Important features, changes, and deprecations

This section contains important information about new and existing features.

# Concurrent append and replace improvements

Streaming ingestion supervisors now support concurrent append. That is, streaming tasks can run concurrently with a replace task (compaction or re-indexing), provided the replace task also uses concurrent locks. Set the context parameter useConcurrentLocks to true to enable concurrent append.

Once you update the supervisor to have "useConcurrentLocks": true, the transition to concurrent append happens seamlessly without causing any ingestion lag or task failures.
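The change to the supervisor spec can be sketched as follows. This is a minimal illustration, assuming the spec is held as a Python dictionary and that the context parameter lives in a top-level `context` object; the trimmed example spec and the helper name `enable_concurrent_append` are hypothetical and not part of Druid.

```python
import copy
import json


def enable_concurrent_append(supervisor_spec: dict) -> dict:
    """Return a copy of the spec with useConcurrentLocks set to true.

    Assumption: the context parameter lives in a top-level "context" object.
    """
    updated = copy.deepcopy(supervisor_spec)
    updated.setdefault("context", {})["useConcurrentLocks"] = True
    return updated


# Hypothetical, heavily trimmed supervisor spec used only for illustration.
example_spec = {"type": "kafka", "spec": {"dataSchema": {"dataSource": "events"}}}
updated_spec = enable_concurrent_append(example_spec)
print(json.dumps(updated_spec, indent=2))
```

The helper copies the spec rather than mutating it, so the original can be kept for comparison before the updated spec is submitted.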

#16369

Druid now actively cleans up stale pending segments by tracking the set of tasks that use them. This allows concurrent append and replace to upgrade only a minimal set of pending segments, which improves performance and eliminates errors. It also reduces load on the metadata store.

#16144

# Grouping on complex columns

Druid now supports grouping on complex columns and nested arrays. This means that both native queries and the MSQ task engine can group on complex columns and nested arrays while returning results.

Additionally, the MSQ task engine can roll up and sort on the supported complex columns, such as JSON columns, during ingestion.

#16068 #16322

# Removed ZooKeeper-based segment loading

ZooKeeper-based segment loading has been removed due to known issues. It had been deprecated for several releases. Recent improvements to the Druid Coordinator have significantly enhanced performance with HTTP-based segment loading.

#15705

# Improved groupBy queries

Before Druid pushes realtime segments to deep storage, the segments consist of spill files. Segment metrics such as query/segment/time now report on each spill file for a realtime segment, rather than for the entire segment. This change eliminates the need to materialize results on the heap, which improves the performance of groupBy queries.

#15757

# Improved AND filter performance

Druid query processing now adaptively determines when children of AND filters should compute indexes and when to simply match rows during the scan based on selectivity of other filters. Known as filter partitioning, it can result in dramatic performance increases, depending on the order of filters in the query.

For example, take a query like SELECT SUM(longColumn) FROM druid.table WHERE stringColumn1 = '1000' AND stringColumn2 LIKE '%1%'. Previously, Druid always used indexes when processing filters if they were available. That's not always ideal. Suppose stringColumn1 = '1000' matches 100 rows. To compute the index for the second filter, Druid has to evaluate stringColumn2 LIKE '%1%' against every distinct value of stringColumn2. If stringColumn2 has more than 100 distinct values, that is more work than simply checking the 100 remaining rows for a match.

With the new logic, Druid now checks the selectivity of indexes as it processes each clause of the AND filter. If it determines it would take more work to compute the index than to match the remaining rows, Druid skips computing the index.

The order in which you write filters in a WHERE clause can affect the performance of your query. More improvements are coming, but you can try out the existing improvements by reordering a query. Put filters whose indexes are less intensive to compute, such as IS NULL, =, and comparisons (>, >=, <, and <=), near the start of AND filters so that Druid processes your queries more efficiently. Not ordering your filters in this way won’t degrade performance from previous releases since the fallback behavior is what Druid did previously.
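The trade-off behind filter partitioning can be illustrated with a toy cost model. This is a simplified sketch and not Druid's actual implementation: it assumes that computing an index costs one predicate evaluation per distinct value, and that scanning costs one predicate evaluation per remaining row. The function name and the cost units are hypothetical.

```python
def should_compute_index(remaining_rows: int, distinct_values: int) -> bool:
    """Toy decision rule: build the index only if it is cheaper than scanning.

    remaining_rows: rows still matching after the earlier AND clauses.
    distinct_values: dictionary values the filter must test to build its index.
    """
    index_cost = distinct_values  # one predicate evaluation per distinct value
    scan_cost = remaining_rows    # one predicate evaluation per remaining row
    return index_cost < scan_cost


# stringColumn1 = '1000' leaves 100 rows and stringColumn2 has 5,000 distinct
# values, so matching the LIKE filter row by row is cheaper.
print(should_compute_index(remaining_rows=100, distinct_values=5_000))        # False

# With 1,000,000 remaining rows, building the index is cheaper.
print(should_compute_index(remaining_rows=1_000_000, distinct_values=5_000))  # True
```

In this model, placing the highly selective equality filter first is what shrinks `remaining_rows` before the expensive LIKE filter is considered.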

#15838

# Centralized datasource schema (alpha)

You can now configure Druid to manage datasource schema centrally on the Coordinator. Previously, Brokers needed to query data nodes and tasks for segment schemas. Centralizing datasource schemas can improve startup time for Brokers and the efficiency of your deployment.

To enable this feature, set the following configs:

#15817

# MSQ support for window functions

You can now run window functions in the MSQ task engine using the context flag enableWindowing:true.

In the native engine, you must use a group by clause to enable window functions. This requirement is removed in the MSQ task engine.
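As a rough sketch, a request body for the SQL task endpoint (/druid/v2/sql/task) with the flag set might be built like this. The datasource and column names are hypothetical; only the enableWindowing context flag comes from these notes.

```python
import json

# Hypothetical datasource and column names; only enableWindowing is from the notes.
payload = {
    "query": (
        'SELECT "channel", "added", '
        'SUM("added") OVER (PARTITION BY "channel") AS "channel_total" '
        'FROM "wikipedia"'
    ),
    "context": {"enableWindowing": True},
}

# Serialize the body that would be POSTed to /druid/v2/sql/task.
body = json.dumps(payload, indent=2)
print(body)
```

Note that the query has no GROUP BY clause, which matches the relaxed requirement described above for the MSQ task engine.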

#15470 #16229

# MSQ support for Google Cloud Storage

You can now export MSQ results to a Google Cloud Storage (GCS) path by passing the function google() as an argument to the EXTERN function.
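A minimal sketch of building such an export query follows. It assumes that google() takes named bucket and prefix arguments and that the export is written as CSV; the helper name, bucket, prefix, and datasource are hypothetical, so check the argument names against the export documentation before relying on them.

```python
def build_gcs_export_query(bucket: str, prefix: str, select_sql: str) -> str:
    """Wrap a SELECT statement in an MSQ export to Google Cloud Storage.

    Assumption: google() takes named bucket and prefix arguments.
    Values are interpolated naively, so pass only trusted strings.
    """
    return (
        "INSERT INTO\n"
        f"  EXTERN(google(bucket => '{bucket}', prefix => '{prefix}'))\n"
        "AS CSV\n"
        f"{select_sql}"
    )


# Hypothetical bucket, prefix, and datasource names.
query = build_gcs_export_query(
    bucket="my-bucket",
    prefix="exports/daily",
    select_sql='SELECT "channel", "added" FROM "wikipedia"',
)
print(query)
```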

#16051

# RabbitMQ extension

A new RabbitMQ extension is available as a community contribution. The RabbitMQ extension (druid-rabbit-indexing-service) lets you manage the creation and lifetime of rabbit indexing tasks. These indexing tasks read events from RabbitMQ through super streams.

Because super streams allow exactly-once delivery with full support for partitioning, they are compatible with Druid's modern ingestion algorithm and avoid the downsides of the prior RabbitMQ firehose.

Note that this uses the RabbitMQ streams feature and not a conventional exchange. You need to make sure that your messages are in a super stream before consumption. For more information, see RabbitMQ documentation.

#14137

# Functional area and related changes

This section contains detailed release notes separated by areas.

# Web console

# Improved the Supervisors view

You can now use the Supervisors view to dynamically query supervisors and display additional information on newly added columns.

#16318

# Search in tables and columns

You can now use the Query view to search in tables and columns.

#15990

# Kafka input format

Improved how the web console determines the input format for a Kafka source. Instead of defaulting to the Kafka input format for a Kafka source, the web console now only picks the Kafka input format if it detects any of the following in the Kafka sample: a key, headers, or more than one topic.
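The selection rule above can be sketched as a small function. This is an illustration of the rule only, not the web console's code: the shape of the sample records (dictionaries with `key`, `headers`, and `topic` entries) and the returned labels are assumptions made for the example.

```python
from typing import Any


def pick_input_format(sample_records: list[dict[str, Any]]) -> str:
    """Return "kafka" only when the sample has a key, headers, or several topics.

    Assumption: each sampled record is a dict with optional "key", "headers",
    and "topic" entries. The returned labels are illustrative.
    """
    has_key = any(record.get("key") is not None for record in sample_records)
    has_headers = any(record.get("headers") for record in sample_records)
    topics = {
        record["topic"] for record in sample_records if record.get("topic") is not None
    }
    if has_key or has_headers or len(topics) > 1:
        return "kafka"
    # Otherwise fall back to the format detected from the message payload.
    return "payload-format"


plain_sample = [{"topic": "events", "key": None, "headers": [], "value": "{}"}]
keyed_sample = [{"topic": "events", "key": "user-1", "headers": [], "value": "{}"}]
print(pick_input_format(plain_sample))
print(pick_input_format(keyed_sample))
```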

#16180

# Improved handling of lookups during sampling

Rather than sending a transform expression containing lookups to the sampler, Druid now substitutes the transform expression with a placeholder. This prevents the expression from blocking the flow.

#16234

# Other web console improvements

# General ingestion

# Improved Azure input source

You can now ingest data from multiple storage accounts using the new azureStorage input source schema. For example:

...
    "ioConfig": {
      "type": "index_parallel",
      "inputSource": {
        "type": "azureStorage",
        "objectGlob": "**.json",
        "uris": ["azureStorage://storageAccount/container/prefix1/file.json", "azureStorage://storageAccount/container/prefix2/file2.json"]
      },
      "inputFormat": {
        "type": "json"
      },
      ...
    },
...

#15630

# Added a new config to AzureAccountConfig

The new config storageAccountEndpointSuffix lets you configure the endpoint suffix so that you can override the default and connect to other endpoints, such as Azure Government.

#16016

# Data management API improvements

Improved the Data management API as follows:

# Nested columns performance improvement

Nested column serialization now releases nested field compression buffers as soon as the nested field serialization is complete, which requires significantly less direct memory during segment serialization when many nested fields are present.

#16076

# Improved task context reporting

Added a new field taskContext in the task reports of non-MSQ tasks. The change is backward compatible. The payload of this field contains the entire context used by the task during its runtime.

Added a new experimental interface TaskContextEnricher to enrich context with use case specific logic.

#16041

# Other ingestion improvements

# SQL-based ingestion

# Manifest files for MSQ task engine exports

Export queries that use the MSQ task engine now also create a manifest file at the destination, which lists the files created by the query.

During a rolling update, older versions of workers don't return a list of exported files, and older Controllers don't create a manifest file. Therefore, export queries run during this time might have incomplete manifests.

#15953

# SortMerge join support

Druid now supports SortMerge join for IS NOT DISTINCT FROM operations.

#16003

# State of compaction context parameter

Added a new context parameter storeCompactionState. When set to true, Druid records the state of compaction for each segment in the lastCompactionState segment field.

#15965

# Selective loading of lookups

We have built the foundation of selective lookup loading. As part of this improvement, KillUnusedSegmentsTask no longer loads lookups.

#16328

# MSQ task report improvements

Improved the task report for the MSQ task engine as follows:

# Other SQL-based ingestion improvements

# Streaming ingestion

# Streaming completion reports

Streaming task completion reports now have an extra field recordsProcessed, which lists all the partitions processed by that task and a count of records for each partition. Use this field to see the actual throughput of tasks and decide whether to scale your workers vertically or horizontally.
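As a rough sketch of how the field could feed a scaling decision, the per-partition counts can be summed and divided by the task duration. The example assumes recordsProcessed is available as a mapping from partition ID to record count and that the task duration is known from elsewhere; the function name and the sample numbers are hypothetical.

```python
def records_per_second(records_processed: dict[str, int], duration_seconds: float) -> float:
    """Compute task throughput from per-partition record counts.

    Assumption: records_processed maps a partition ID to its record count.
    """
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return sum(records_processed.values()) / duration_seconds


# Hypothetical counts for a task that ran for 10 minutes.
sample_counts = {"0": 120_000, "1": 60_000}
print(records_per_second(sample_counts, duration_seconds=600))  # 300.0
```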

#15930

# Improved memory management for Kinesis

Kinesis ingestion memory tuning config is now simpler:

As part of this change, the following properties have been deprecated:

#15360

# Improved autoscaling for Kinesis streams

The Kinesis autoscaler now considers max lag in minutes instead of total lag. To maintain backwards compatibility, this change is opt-in for existing Kinesis connections. To opt in, set lagBased.lagAggregate in your supervisor spec to MAX. New connections use max lag by default.

#16284 #16314

# Parallelized incremental segment creation

You can now configure the number of threads used to create and persist incremental segments on disk using the numPersistThreads property. If sufficient CPU resources are available, use additional threads to parallelize segment creation and prevent ingestion from stalling or pausing frequently.

#13982

# Kafka streaming supervisor topic improvement

Druid now properly handles previously found partition offsets. Prior to this change, updating a Kafka streaming supervisor topic from single to multi-topic (pattern), or vice versa, could cause old offsets to be ignored spuriously.

#16190

# Querying

# Dynamic table append

You can now use the TABLE(APPEND(...)) function to implicitly create unions based on table schemas.

For example, the following queries are equivalent:

SELECT * FROM TABLE(APPEND('table1','table2','table3'))

and

SELECT column1,NULL AS column2,NULL AS column3 FROM table1
UNION ALL
SELECT NULL AS column1,column2,NULL AS column3 FROM table2
UNION ALL
SELECT column1,column2,column3 FROM table3

Note that if the same columns are defined with different input types, Druid uses the least restrictive column type.
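The equivalence above can be reproduced with a short sketch that expands a set of table schemas into the explicit UNION ALL form. It models only the column alignment, not the type reconciliation mentioned in the note, and it is not how Druid implements APPEND; the function name and the schema mapping are hypothetical.

```python
def expand_append(schemas: dict[str, list[str]]) -> str:
    """Expand TABLE(APPEND(...)) into an explicit UNION ALL query.

    schemas maps each table name to its column names, in APPEND order.
    Columns missing from a table are projected as NULL.
    """
    # Union of all column names, in first-seen order.
    all_columns: list[str] = []
    for columns in schemas.values():
        for column in columns:
            if column not in all_columns:
                all_columns.append(column)

    selects = []
    for table, columns in schemas.items():
        projected = [
            column if column in columns else f"NULL AS {column}"
            for column in all_columns
        ]
        selects.append(f"SELECT {','.join(projected)} FROM {table}")
    return "\nUNION ALL\n".join(selects)


schemas = {
    "table1": ["column1"],
    "table2": ["column2"],
    "table3": ["column1", "column2", "column3"],
}
print(expand_append(schemas))
```

With these three schemas, the printed query matches the UNION ALL form shown above.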

#15897

# Added SCALAR_IN_ARRAY function

Added SCALAR_IN_ARRAY function for checking if a scalar expression appears in an array:

SCALAR_IN_ARRAY(expr, arr)

#16306

# Improved PARTITIONED BY

If you use the MSQ task engine to run queries, you can now use the following strings in addition to the supported ISO 8601 periods:

#15836

# Improved catalog tables

You can validate complex target column types against source input expressions during DML INSERT/REPLACE operations.

#16223

You can now define catalog tables without explicit segment granularities. DML queries on such tables need to have the PARTITIONED BY clause specified. Alternatively, you can update the table to include a defined segment granularity for DML queries to be validated properly.

#16278

# Double and null values in SQL type ARRAY

You can now pass double and null values in SQL type ARRAY through dynamic parameters.

For example:

"parameters": [
  {
    "type": "ARRAY",
    "value": [d1, d2, null]
  }
]
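A concrete request body might look like the following sketch, which fills the d1 and d2 placeholders with sample doubles. The datasource, column name, and values are hypothetical; the query uses SCALAR_IN_ARRAY(expr, arr) with the array supplied as the dynamic parameter.

```python
import json

# Hypothetical datasource, column, and values; None serializes to JSON null.
payload = {
    "query": 'SELECT * FROM "orders" WHERE SCALAR_IN_ARRAY("discount", ?)',
    "parameters": [
        {"type": "ARRAY", "value": [0.1, 0.25, None]},
    ],
}

body = json.dumps(payload)
print(body)
```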

#16274

# TypedInFilter filter

Added a new TypedInFilter filter to replace InDimFilter and improve performance when matching numeric columns.

#16039

TypedInFilter can run in replace-with-default mode.

#16233

# Heap dictionaries clear out

Improved object handling to reduce the chances of running out of memory with Group By queries on high cardinality data.

#16114

# Other querying improvements

# Cluster management

# Improved retrieving active task status

Improved performance of the Overlord API /indexer/v1/taskStatus by serving status of active tasks from memory rather than querying the metadata.

#15724

# Other cluster management improvements

# Data management

# Changes to Coordinator default values

Changed the default values for the Coordinator service as follows:

#16247

# Compaction completion reports

Parallel compaction task completion reports now have segmentsRead and segmentsPublished fields to show how effective a compaction task is.

#15947

# GoogleTaskLogs upload buffer size

Changed the upload buffer size in GoogleTaskLogs to 1 MB instead of 15 MB to allow more uploads in parallel and prevent the MiddleManager service from running out of memory.

#16236

# Other data management improvements

# Metrics and monitoring

# New unused segment metric

You can now use the kill/eligibleUnusedSegments/count metric to find the number of unused segments of a datasource that are identified as eligible for deletion from the metadata store by the Coordinator.

#15941 #15977

# Kafka emitter improvements

You can now set custom dimensions for events emitted by the Kafka emitter as a JSON map for the druid.emitter.kafka.extra.dimensions property. For example, druid.emitter.kafka.extra.dimensions={"region":"us-east-1","environment":"preProd"}.
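Because the property value is a JSON map on a single line, it can be convenient to generate it rather than write it by hand. The following is a small sketch; the helper name is hypothetical, and the dimensions are the ones from the example above.

```python
import json


def kafka_extra_dimensions_property(dimensions: dict[str, str]) -> str:
    """Render the extra dimensions as a single-line runtime property."""
    # Compact separators keep the JSON map free of spaces.
    value = json.dumps(dimensions, separators=(",", ":"))
    return f"druid.emitter.kafka.extra.dimensions={value}"


line = kafka_extra_dimensions_property(
    {"region": "us-east-1", "environment": "preProd"}
)
print(line)
```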

#15845

# Prometheus emitter improvements

The Prometheus emitter extension now emits service/heartbeat and zk-connected metrics.

#16209

Also added the following missing metrics to the default Prometheus emitter mapping: query/timeout/count, mergeBuffer/pendingRequests, ingest/events/processedWithError, ingest/notices/queueSize and segment/count.

#16329

# StatsD emitter improvements

You can now configure the queueSize, poolSize, processorWorkers, and senderWorkers parameters for the StatsD emitter. Use these parameters to increase the capacity of the StatsD client when its queue is full.

#16283

# Improved segment/unavailable/count metric

The segment/unavailable/count metric now accounts for segments that can be queried from deep storage (replicaCount=0).

#16020

Added a new metric segment/deepStorage/count to support the query from deep storage feature.

#16072

# Other metrics and monitoring improvements

# Extensions

# Microsoft Azure improvements

You can now use ingestion payloads larger than 1 MB for Azure.

#15695

# Kubernetes improvements

You can now configure the CPU cores for Peons (Kubernetes jobs) using the Overlord property druid.indexer.runner.cpuCoreInMicro.

#16008

# Delta Lake improvements

You can use these filters to filter out data files from a snapshot, reducing the number of files Druid has to ingest from a Delta table.

For more information, see Delta filter object.

#16288

Also added a text box for the Delta Lake filter to the web console. The text box accepts an optional JSON object that is passed down as the filter to the delta input source.

#16379

# Improved performance of LDAP credentials validator

Improved performance of LDAP credentials validator by keeping password hashes in an in-memory cache. This helps avoid re-computation of password hashes, thus speeding up the process of LDAP-based Druid authentication.

#15993

# Upgrade notes and incompatible changes

# Upgrade notes

# Append JsonPath function

The append function for JsonPath for ORC format now fails with an exception. Previously, it would run but not append anything.

#15772

# Kinesis ingestion tuning

The following properties have been deprecated as part of simplifying the memory tuning for Kinesis ingestion:

#15360

# Improved Supervisor rolling restarts

The stopTaskCount config now prioritizes stopping older tasks first. As part of this change, you must also explicitly set a value for stopTaskCount. It no longer defaults to the same value as taskCount.
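Before upgrading, it may help to check which supervisor specs rely on the old default. The sketch below assumes the spec is held as a Python dictionary and that stopTaskCount is set under spec.ioConfig; the function name and the trimmed example spec are hypothetical.

```python
def missing_stop_task_count(supervisor_spec: dict) -> bool:
    """Return True if the spec does not set stopTaskCount explicitly.

    Assumption: stopTaskCount lives under spec.ioConfig in the supervisor spec.
    """
    io_config = supervisor_spec.get("spec", {}).get("ioConfig", {})
    return "stopTaskCount" not in io_config


# Hypothetical, heavily trimmed supervisor specs.
implicit_spec = {"type": "kafka", "spec": {"ioConfig": {"taskCount": 4}}}
explicit_spec = {"type": "kafka", "spec": {"ioConfig": {"taskCount": 4, "stopTaskCount": 2}}}
print(missing_stop_task_count(implicit_spec))  # True
print(missing_stop_task_count(explicit_spec))  # False
```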

#15859

# Changes to Coordinator default values

Changed the following default values for the Coordinator service:

#16247

# GoogleTaskLogs upload buffer size

Changed the upload buffer size in GoogleTaskLogs to 1 MB instead of 15 MB to allow more uploads in parallel and prevent the MiddleManager service from running out of memory.

#16236

# Incompatible changes

# Changes to targetDataSource in EXPLAIN queries

Druid 30.0.0 includes a breaking change that restores the behavior for targetDataSource to its 28.0.0 and earlier state, different from Druid 29.0.0 and only 29.0.0. In 29.0.0, targetDataSource returns a JSON object that includes the datasource name. In all other versions, targetDataSource returns a string containing the name of the datasource.

If you're upgrading from any version other than 29.0.0, there is no change in behavior.

If you are upgrading from 29.0.0, this is an incompatible change.

#16004

# Removed ZooKeeper-based segment loading

ZooKeeper-based segment loading has been removed due to known issues. It had been deprecated for several releases. Recent improvements to the Druid Coordinator have significantly enhanced performance with HTTP-based segment loading.

#15705

# Removed Coordinator configs

Removed the following Coordinator configs:

Auto-cleanup of compaction configs of inactive datasources is now enabled by default.

#15705

# Changed useMaxMemoryEstimates for Hadoop jobs

The default value of the useMaxMemoryEstimates parameter for Hadoop jobs is now false.

#16280

# Developer notes

# Dependency updates

The following dependencies have had their versions bumped:

asdf2014 commented 1 month ago

https://github.com/apache/druid/pull/16371 this PR added a feature to the Router page that makes it easier to search DataSource using keywords, which also changed user behavior, so it should be necessary to mention it in the release note I think :smile:

adarshsanjeev commented 1 month ago

@asdf2014 That PR is not present in the Druid 30 branch, as it was merged after the branch cut (and cannot be backported as it is not a bugfix/regression).

asdf2014 commented 1 month ago

@adarshsanjeev Oh, I see :sweat_smile:

frankgrimes97 commented 1 month ago

@adarshsanjeev Could the following in-progress bugfix https://github.com/apache/druid/pull/16550 be considered for 30.0.0? Not sure where you guys are in terms for release timeline/cadence but we'd ideally like to test/verify it on perhaps 30.0.0-rc3 instead of waiting for it to land in 31.0.0 in August/September. We'd also be open to having it considered for a possible 29.0.2 release. Thanks!

adarshsanjeev commented 4 weeks ago

@frankgrimes97 Currently, Druid 30 is at a stage where only regressions can be backported.

acherla commented 4 weeks ago

While testing 30.0.0-rc3, I'm noticing a number of regressions in the UI:

  1. The "supervisors" UI tab no longer seems to have a way to paginate all the supervisors, which was available in 29.0.1.
  2. Filtering in the "supervisors" UI tab by status = running returns a UI exception: "druidException/Column 'status' not found in any table (line[11], column[7])". The client calls the druid/sql endpoint with a SELECT statement that does not include the status column, which produces the error above. This should be easy to fix: add the status column to the SELECT query that fetches supervisor status from the sys.supervisors table.

adarshsanjeev commented 4 weeks ago

@acherla Thanks for testing the RC. These seem to be valid regressions, and should be fixed by https://github.com/apache/druid/pull/16571. Once it is backported, I will create a new RC3 build.