Open Tangeroooo opened 3 weeks ago
Hi, I am able to verify the error you got. If I understand it correctly, your use case needs to select columns that are not included in the groupby clause and also not included in the aggregation function. Lots of query engines also do not support it. I think it might be necessary to restructure the query in this case, and here's some reference that might be helpful: link 😄
Describe the bug
Return an error like below message.
Error in response: DataFusionError("Internal error: Invalid aggregate expression 'Column(Column { relation: Some(Bare { table: \"?table?\" }), name: \"video_id\" })'.\nThis was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker")
To Reproduce
for record_batch in reader { let rb = record_batch?; record_batches.push(rb); }
let columns = reader.schema().fields().iter().map(|f| { col(f.name().clone()) }).collect::<Vec<_>>();
let group_by_columns = group_by_columnsnames .iter() .map(|name: &String| col(name)) .collect::<Vec<>>();
let ctx = SessionContext::new(); let df = ctx.read_batches(record_batches)?;
// select * from tab group by col1, col2; df.aggregate(group_by_columns, columns)?.collect().await?
Expected behavior
Run group by without aggregation function.
Additional context
No response