delta-io / delta-rs

A native Rust library for Delta Lake, with bindings into Python
https://delta-io.github.io/delta-rs/
Apache License 2.0
1.97k stars 365 forks

feat: report DataFusion metrics for DeltaScan #2617

Closed by alexwilcoxson-rel 22 hours ago

alexwilcoxson-rel commented 5 days ago

Description

When building a DeltaScan, compute simple metrics about data skipping and attach them to the DeltaScan. They are exposed to DataFusion through the ExecutionPlan trait, so they become visible, for example, when you run EXPLAIN ANALYZE on a query.
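To illustrate the idea, here is a minimal std-only sketch of counting data-skipping results while a scan is being planned. It is not the code from this PR: FileStats, SkippingMetrics, prune_files, and the counter names files_scanned and files_pruned are all hypothetical. In DataFusion itself, counters like these would presumably be registered in the plan's metrics set and returned from ExecutionPlan::metrics().

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical per-file min/max statistics for a single integer column.
struct FileStats {
    min: i64,
    max: i64,
}

/// Hypothetical shared counters, standing in for DataFusion metric counters.
#[derive(Clone, Default)]
struct SkippingMetrics {
    files_scanned: Arc<AtomicUsize>,
    files_pruned: Arc<AtomicUsize>,
}

impl SkippingMetrics {
    fn scanned(&self) -> usize {
        self.files_scanned.load(Ordering::Relaxed)
    }

    fn pruned(&self) -> usize {
        self.files_pruned.load(Ordering::Relaxed)
    }
}

/// Keep only the files whose [min, max] range could satisfy `col = value`,
/// recording how many files were kept and how many were skipped.
fn prune_files<'a>(
    files: &'a [FileStats],
    value: i64,
    metrics: &SkippingMetrics,
) -> Vec<&'a FileStats> {
    files
        .iter()
        .filter(|f| {
            let may_match = f.min <= value && value <= f.max;
            if may_match {
                metrics.files_scanned.fetch_add(1, Ordering::Relaxed);
            } else {
                metrics.files_pruned.fetch_add(1, Ordering::Relaxed);
            }
            may_match
        })
        .collect()
}

fn main() {
    let files = vec![
        FileStats { min: 0, max: 9 },
        FileStats { min: 10, max: 19 },
        FileStats { min: 20, max: 29 },
    ];
    let metrics = SkippingMetrics::default();

    // Plan a scan for the predicate `col = 15`.
    let kept = prune_files(&files, 15, &metrics);

    println!(
        "kept={} files_scanned={} files_pruned={}",
        kept.len(),
        metrics.scanned(),
        metrics.pruned()
    );
}
```

The counters are computed once at plan-build time rather than during execution, which matches the description above: the skipping decision is already known when the scan is constructed.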

Related Issue(s)

Documentation

See metrics for Parquet scan in DataFusion: https://docs.rs/datafusion/latest/src/datafusion/datasource/physical_plan/parquet/metrics.rs.html#29-50