apache / drill

Apache Drill is a distributed MPP query layer for self-describing data
https://drill.apache.org/
Apache License 2.0

Feature for Daffodil DFDL Data querying #2835

Open mbeckerle opened 12 months ago

mbeckerle commented 12 months ago

Use the DFDL language to describe data, then let Drill query that data directly through Apache Daffodil's DFDL implementation.

(Creating this so I have a ticket number to cite in commits/PRs)

cgivre commented 12 months ago

Very excited to see this! DFDL + Drill is really a great combo.

mbeckerle commented 9 months ago

PR updated: https://github.com/apache/drill/pull/2836

Not done: distribution of schemas (or compiled schemas) across Drill's computation fabric.

Still needed: testing of nillability, and real queries against realistic DFDL schemas.