Is your feature request related to a problem or challenge?
In InfluxDB we have several optimizer passes that rearrange ParquetExec (e.g. split the files up into multiple new ParquetExecs, or break a single ParquetExec up into multiple ones).
Describe the solution you'd like
I would like an easier way to go from ParquetExec to ParquetExecBuilder, which can be manipulated and then turned back into a ParquetExec.
Describe alternatives you've considered
I suggest:
let parquet_exec: ParquetExec = ...;
// convert parquet_exec into a builder
let mut builder = ParquetExecBuilder::from(parquet_exec); // maybe also support parquet_exec.into_builder()
// ... modify builder
let parquet_exec = builder.build(); // turn back into a ParquetExec
Bonus points if we can make it work for an Arc<ParquetExec> too:
let parquet_exec: Arc<ParquetExec> = ...;
// convert parquet_exec into a builder
let mut builder = ParquetExecBuilder::from(parquet_exec);
...
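To make the shape of the proposal concrete, here is a minimal sketch of how the two conversions might look. It uses hypothetical stand-in types rather than DataFusion's real ParquetExec and ParquetExecBuilder: the fields (`files`, `predicate`), the `String` predicate, and the `with_files` / `with_predicate` / `files()` / `predicate()` methods are all assumptions made up for illustration. Only `ParquetExecBuilder::from(...)`, `into_builder()` and `build()` come from the suggestion above.

```rust
use std::sync::Arc;

// Hypothetical stand-in for the real plan node; the fields are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub struct ParquetExec {
    files: Vec<String>,
    predicate: Option<String>,
}

// Hypothetical stand-in for the real builder, holding the same state.
#[derive(Debug, Clone, PartialEq)]
pub struct ParquetExecBuilder {
    files: Vec<String>,
    predicate: Option<String>,
}

impl ParquetExecBuilder {
    pub fn new(files: Vec<String>) -> Self {
        Self { files, predicate: None }
    }

    // Assumed setter: replace the list of files to scan.
    pub fn with_files(mut self, files: Vec<String>) -> Self {
        self.files = files;
        self
    }

    // Assumed setter: set the pushed-down predicate.
    pub fn with_predicate(mut self, predicate: impl Into<String>) -> Self {
        self.predicate = Some(predicate.into());
        self
    }

    pub fn build(self) -> ParquetExec {
        ParquetExec { files: self.files, predicate: self.predicate }
    }
}

impl ParquetExec {
    // Move every field into a builder so nothing has to be copied by hand.
    pub fn into_builder(self) -> ParquetExecBuilder {
        ParquetExecBuilder { files: self.files, predicate: self.predicate }
    }

    pub fn files(&self) -> &[String] {
        &self.files
    }

    pub fn predicate(&self) -> Option<&str> {
        self.predicate.as_deref()
    }
}

impl From<ParquetExec> for ParquetExecBuilder {
    fn from(exec: ParquetExec) -> Self {
        exec.into_builder()
    }
}

impl From<Arc<ParquetExec>> for ParquetExecBuilder {
    fn from(exec: Arc<ParquetExec>) -> Self {
        // Reuse the inner value if this is the only reference, otherwise clone it.
        Arc::try_unwrap(exec)
            .unwrap_or_else(|shared| (*shared).clone())
            .into_builder()
    }
}
```

In this sketch the `Arc` conversion never mutates a shared plan: other holders of the `Arc` keep seeing the original node, and the builder works on its own copy. Whether the real ParquetExec can be cloned cheaply enough for this is something the actual implementation would need to confirm.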
Additional context
Rearranging a ParquetExec this way is somewhat cumbersome at the moment, for example: https://github.com/influxdata/influxdb3_core/blob/1eaa4ed5ea147bc24db98d9686e457c124dfd5b7/iox_query/src/physical_optimizer/predicate_pushdown.rs#L55-L77
This is the root use case behind @NGA-TRAN's proposal in https://github.com/apache/datafusion/pull/12726