Open aditi-pandit opened 1 year ago
@aditi-pandit Having velox.properties is more explicit in my opinion. Do you see any downside?
Are operators of Presto expected to change this config value? What does mutable-config=true
mean?
If operators are not expected to change this value, it could be fine. But for any configuration where operators are expected to make changes to it, it's preferable to use the same approach as Java, and use a unified config (and avoid specific references to underlying libraries).
@aditi-pandit Having velox.properties is more explicit in my opinion. Do you see any downside?
The Velox queryConfig has many properties for specific operator behavior like "spill_enabled", "aggregation_spill_enabled";"join_spill_enabled"; "order_by_spill_enabled"; "aggregation_spill_memory_threshold"; "join_spill_memory_threshold"; These are applicable for Presto java also.
The source of truth for such properties has become confusing now.
In Presto java these would make it from config.properties -> SystemSessionProperties class used during task creation.
Now velox.properties will initialize into BaseVeloxQueryConfig at PrestoServer initialization. But then the value in config.properties makes it to SystemSessionProperties that override the VeloxQueryConfig at task creation ?
@spershin, @amitkdutta : Is there some misunderstanding here ?
Reference
Are operators of Presto expected to change this config value? What does
mutable-config=true
mean?Since https://github.com/prestodb/presto/pull/19700 v1/operation end point can be used to change Velox Query config on the fly. mutable-config is used to allow that change.
If operators are not expected to change this value, it could be fine. But for any configuration where operators are expected to make changes to it, it's preferable to use the same approach as Java, and use a unified config (and avoid specific references to underlying libraries).
@tdcmeehan : Its confusing to me because there are existing Presto configuration "like spill_enabled, aggregation_spill_enabled, etc" also applicable for Velox Query config. With the new velox.properties file the source of truth for the property could be confusing.
Folks, where are we with this discussion. It seems strange to introduce velox.properties file. Why not include all worker configs in a single file (node.properties) and translate these to Velox Config as needed at runtime?
CC: @xiaoxmeng @amitkdutta @spershin @pranjalssh @kewang1024
@mbasmanova I assume you meant config.properties. Having everything in the same file is definitely helpful for deployment. In such cases, we might need to add a velox
prefix to separate velox specific configs, currently the file itself does the distinction.
CC: @spershin @xiaoxmeng
@amitkdutta having a single config file with velox prefix (convert it at Prestissimo layer) sounds a good approach to me. We plan to do similar things for session property for native support @mbasmanova
It would be nice to also add logic to Prestissimo worker to fail fast if configuration file contains unsupported properties. Similarly, it would be nice to add logic to fail if received unsupported session properties.
It would be nice to also add logic to Prestissimo worker to fail fast if configuration file contains unsupported properties. Similarly, it would be nice to add logic to fail if received unsupported session properties.
@mbasmanova , @amitkdutta , @xiaoxmeng : Could these properties be surfaced to the co-ordinator in any way ? The co-ordinator should not be sending such properties to Prestissimo workers also. If we validate the properties at the co-ordinator it would give us a fast failure path.
Recently, an additional configuration file called velox.properties was added in Prestissimo that could be used to initialize the Velox Query config defaults.
Is this extra file needed ?
Could we add velox properties in config.properties with velox prefix instead ?
There are multiple properties like "spill_enabled", "aggregation_spill_enabled";"join_spill_enabled"; "order_by_spill_enabled"; "aggregation_spill_memory_threshold"; "join_spill_memory_threshold" that carry over from java that are now available from both configs. The source of truth is confusing then.
@amitkdutta @majetideepak @tdcmeehan @spershin @mbasmanova @xiaoxmeng