Closed ion-elgreco closed 3 months ago
If you can post a PhysicalPlan
for one of the failing test cases. That might help to understand the problem better, then we can replicate it in this repo. As an educated guess, it might be related to the
fn children
implementation for one of the custom ExecutionPlan
s in your repo (if you have those).@mustafasrepo This is the childrends function, I'll get grab a plan in a bit
pub struct DeltaScan {
/// The URL of the ObjectStore root
pub table_uri: String,
/// Column that contains an index that maps to the original metadata Add
pub config: DeltaScanConfig,
/// The parquet scan to wrap
pub parquet_scan: Arc<dyn ExecutionPlan>,
/// The schema of the table to be used when evaluating expressions
pub logical_schema: Arc<ArrowSchema>,
}
fn children(&self) -> Vec<Arc<dyn ExecutionPlan>> {
vec![self.parquet_scan.clone()]
}
fn with_new_children(
self: Arc<Self>,
children: Vec<Arc<dyn ExecutionPlan>>,
) -> DataFusionResult<Arc<dyn ExecutionPlan>> {
ExecutionPlan::with_new_children(self.parquet_scan.clone(), children)
}
@ion-elgreco try changing the children function to return an empty vec![]
rather than the clone of the self.parquet_scan()
. solved 29/32 test issues for me.
I think, changing implementation below
fn with_new_children(
self: Arc<Self>,
children: Vec<Arc<dyn ExecutionPlan>>,
) -> DataFusionResult<Arc<dyn ExecutionPlan>> {
ExecutionPlan::with_new_children(self.parquet_scan.clone(), children)
}
to something like
fn with_new_children(
self: Arc<Self>,
children: Vec<Arc<dyn ExecutionPlan>>,
) -> Result<Arc<dyn ExecutionPlan>> {
// Make sure to received single child
// It should be updated new parquet_scan.
assert_eq!(children.len(), 1);
Arc::new(DeltaScan {
table_uri: self.table_uri.clone(),
config: self.config.clone(),
parquet_scan: children[0].clone(),
logical_schema: self.logical_schema.clone(),
})
// ExecutionPlan::with_new_children(self.parquet_scan.clone(), children)
}
might solve the problem (Just brainstorming, I didn't try it). I am not sure, what ExecutionPlan::with_new_children(self.parquet_scan.clone(), children)
does. However, this line might be the root cause of the problem.
@mustafasrepo thanks for the tips! It's resolved now
Describe the bug
A couple folks including me tried bumping Datafusion to v36 in Delta-RS but to no avail since we all are running in this
Wrong number of children
error. Which at that time I read is something in the optimizer?Here are failing actions: https://github.com/delta-io/delta-rs/actions/runs/8339625658/job/22821968902#step:4:1 https://github.com/delta-io/delta-rs/actions/runs/8472960661/job/23243814572#step:4:1
I'm curious if you guys would know the issue, perhaps it's something that we missed
To Reproduce
Not sure, but our tests will failed in this branch: https://github.com/delta-io/delta-rs/pull/2214
Expected behavior
No response
Additional context
No response