apache / drill

Apache Drill is a distributed MPP query layer for self describing data
https://drill.apache.org/
Apache License 2.0
1.95k stars 979 forks source link

DRILL-8502: some boot options with drill.exec.options prefix are missed in configuration options #2923

Closed rymarm closed 4 months ago

rymarm commented 4 months ago

DRILL-8502: Some boot options with drill.exec.options prefix are missed in configuration options

Description

This PR removes 3 unused properties and 1 duplicate property and makes drill.exec.options.drill.exec.testing.controls visible all the time, not only when assertions mode is enabled.

During an investigation of the Drill configuration system, I noticed Drill has several system options(drill.exec.options.* properties), which are absent in SystemOptionManager, but present in DrillConfig:

apache drill> select name from sys.boot where name like 'drill.exec.options%' AND name not in (select concat('drill.exec.options.', name) from sys.internal_options union all select concat('drill.exec.options.', name) from sys.options);
+-------------------------------------------------------+
|                         name                          |
+-------------------------------------------------------+
| drill.exec.options.drill.exec.testing.controls        |
| drill.exec.options.exec.hashagg.max_batches_in_memory |
| drill.exec.options.exec.hashagg.num_rows_in_batch     |
| drill.exec.options.exec.hashjoin.mem_limit            |
| drill.exec.options.exec.return_result_set_for_ddl     |
+-------------------------------------------------------+

I've analyzed the Drill code and concluded:

  1. drill.exec.options.drill.exec.testing.controls - is set to empty in drill-module.conf in java-exec jar. This property becomes available only when an assertions mode is enabled. All the other time this option is never used and should not be initialized. Even with enabled assertions mode a null value is acceptable. DRILLBIT_CONTROLS_VALIDATOR along with its default value is added to SystemOptionManager only when an assertions mode is enabled, but all the other time, drill.exec.options.drill.exec.testing.controls is present only in DrillConfig. I think it's a little confusing, that a system option may be present, but may not. So I reverted a change that adds logic of adding the validator when the assertions mode is enabled. Commit which brings this logic: https://github.com/apache/drill/commit/2af6340da24cf40cc3512b728a460b84eb610dce#diff-f17deea2cf176841c1056a79613481f2b9c42f29eaa151958eaabf14169d3e24R118-R120
  2. drill.exec.options.exec.hashagg.max_batches_in_memory - is set to 65536 in drill-module.conf in java-exec jar. This property is never used.
  3. drill.exec.options.exec.hashagg.num_rows_in_batch - is set to 128 in drill-module.conf in java-exec jar. This property is never used.
  4. drill.exec.options.exec.hashjoin.mem_limit - is set to 0 in drill-module.conf in java-exec jar. This property is never used.
  5. drill.exec.options.exec.return_result_set_for_ddl - is set to true. This property is probably a typo of drill.exec.options.exec.query.return_result_set_for_ddl.
rymarm commented 4 months ago

@jnturton @cgivre What do you say about this?

cgivre commented 4 months ago

@jnturton Do you want to weigh in on this one?

cgivre commented 4 months ago

@rymarm Would you mind please creating a JIRA issue for this?

rymarm commented 4 months ago

@cgivre yes, sure. I've updated the PR with Jira ticket information.

jnturton commented 4 months ago

Sorry I missed this at the time. Thanks for the cleanup.