As part of the GlueingPartitioningOperator changes in #17038, we removed two fields from WindowOperatorQueryFrameProcessorFactory: maxRowsMaterializedInWindow and partitionColumnNames. This breaks backward compatibility when the MSQ controller has the Glueing PR changes but the worker doesn't.
This PR adds those fields back to ensure backward compatibility.
Even after adding the two fields back, if the controller has the Glueing PR changes but the workers don't, we run into a second issue: the controller sends the operatorFactoryList containing the new operators (GlueingPartitioningOperator and PartitionSortOperator), which the workers don't recognize. This causes the following error:
org.apache.druid.rpc.HttpResponseException: Server error [400 Bad Request]; body: {"error":"Please make sure to load all the necessary extensions and jars with type 'glueingPartition' on 'druid/indexer' service. Could not resolve type id 'glueingPartition' as a subtype of `org.apache.druid.query.operator.OperatorFactory` known type ids = [naivePartition, naiveSort, scan, window] (for POJO property 'operatorList')
This PR handles this by moving the operator transformation logic (from NaiveSortOperator -> NaivePartitioningOperator -> WindowOperator to GlueingPartitioningOperator -> PartitionSortOperator -> WindowOperator) from the WindowOperatorQueryKit layer to the WindowOperatorQueryFrameProcessor layer. The controller therefore keeps sending the older operator chain, and each worker either runs it as-is (if it is on an older version without the Glueing PR changes) or transforms it and runs the new operator chain (if it has the Glueing PR changes).
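To illustrate the idea, here is a minimal sketch of a worker-side rewrite, assuming operators can be modeled as plain names. The class OperatorChainRewriter and its rewrite method are hypothetical and are not Druid's actual API; the real logic lives in WindowOperatorQueryFrameProcessor and works on OperatorFactory instances.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: operators are modeled as plain strings,
// not as Druid OperatorFactory objects.
public class OperatorChainRewriter {
    static final String NAIVE_SORT = "NaiveSortOperator";
    static final String NAIVE_PARTITIONING = "NaivePartitioningOperator";
    static final String GLUEING_PARTITIONING = "GlueingPartitioningOperator";
    static final String PARTITION_SORT = "PartitionSortOperator";
    static final String WINDOW = "WindowOperator";

    // Replaces each adjacent (NaiveSort, NaivePartitioning) pair with
    // (GlueingPartitioning, PartitionSort); all other operators pass through.
    public static List<String> rewrite(List<String> legacyChain) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < legacyChain.size()) {
            boolean legacyPair = i + 1 < legacyChain.size()
                && NAIVE_SORT.equals(legacyChain.get(i))
                && NAIVE_PARTITIONING.equals(legacyChain.get(i + 1));
            if (legacyPair) {
                out.add(GLUEING_PARTITIONING);
                out.add(PARTITION_SORT);
                i += 2;
            } else {
                out.add(legacyChain.get(i));
                i++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The controller always sends the legacy chain; a newer worker rewrites it locally.
        System.out.println(rewrite(List.of(NAIVE_SORT, NAIVE_PARTITIONING, WINDOW)));
    }
}
```

Because the rewrite happens on the worker, the serialized payload only ever contains operator types that older workers already know, which is what avoids the "Could not resolve type id" failure above.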
Test Plan
To test the compatibility scenarios, I ran two indexers on my local setup and validated queries for the following cases:
Indexer1 (controller) is on an older version, indexer2 (some subset of workers) is on a newer version.
Indexer1 (controller) is on a newer version, indexer2 (some subset of workers) is on an older version.
Release note
We are marking two fields as deprecated for window query execution in the MSQ task engine. They will be removed in a future Druid release, so upgrade plans should pass through this intermediate version, which carries the backward-compatibility changes.
This PR has:
[x] been self-reviewed.
[ ] added documentation for new or modified features or behaviors.
[ ] a release note entry in the PR description.
[ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
[ ] added or updated version, license, or notice information in licenses.yaml
[ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
[ ] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.