vibhatha / pyiceberg_substrait

Apache License 2.0

Updating a Plan with Joins #1

Open vibhatha opened 1 year ago

vibhatha commented 1 year ago

The challenge here is how to replace the file paths in the recursion tree. This must be tested: in theory, a plan can contain multiple joins, and each join has a left and a right input.

The left and right inputs each have a ReadRel, and we need a way to update those within the plan.

For instance, we could do the following:


path_dictionary = {
    "left_table": [
        "f1.parquet",
        "f2.parquet",
    ],
    "right_table": [
        "f3.parquet",
        "f4.parquet",
    ],
}

So we start with a Substrait plan and walk its recursion tree. For each table we encounter, we look it up in the iceberg_catalog, load the table, and extract its files via pyiceberg.table.Table.scan().
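A minimal sketch of that walk, under some assumptions: the plan is handled in its dict/JSON form (camelCase keys such as `namedTable`, `localFiles`, `uriFile`), only a handful of single-input relation kinds are followed, and the table name is taken to be the dot-joined `names` of the ReadRel. The helper names `replace_read_paths` and `build_path_dictionary` are hypothetical, not part of this repo. The catalog argument is assumed to behave like a pyiceberg catalog (`load_table(...)`, then `scan().plan_files()`, with each task exposing `file.file_path`).

```python
from typing import Any, Dict, Iterable, List

# Assumption: relation kinds that wrap exactly one child under an "input" key.
SINGLE_INPUT_RELS = ("project", "filter", "fetch", "sort", "aggregate")


def build_path_dictionary(catalog: Any, table_names: Iterable[str]) -> Dict[str, List[str]]:
    """Map each table name to the data file paths reported by a table scan."""
    paths: Dict[str, List[str]] = {}
    for name in table_names:
        table = catalog.load_table(name)
        # Each planned scan task points at one data file.
        paths[name] = [task.file.file_path for task in table.scan().plan_files()]
    return paths


def replace_read_paths(rel: Dict[str, Any], path_dictionary: Dict[str, List[str]]) -> None:
    """Recursively rewrite, in place, every named-table ReadRel under `rel`."""
    if "read" in rel:
        read = rel["read"]
        named = read.get("namedTable")
        if named is not None:
            table_name = ".".join(named["names"])
            # Raises KeyError if the table has no entry in path_dictionary.
            files = path_dictionary[table_name]
            del read["namedTable"]
            read["localFiles"] = {
                "items": [{"uriFile": path, "parquet": {}} for path in files]
            }
        return
    if "join" in rel:
        # A join has two children; either may itself be another join.
        replace_read_paths(rel["join"]["left"], path_dictionary)
        replace_read_paths(rel["join"]["right"], path_dictionary)
        return
    for kind in SINGLE_INPUT_RELS:
        if kind in rel:
            replace_read_paths(rel[kind]["input"], path_dictionary)
            return
```

With a plan loaded as a dict, this would be called once per root, for example `replace_read_paths(plan["relations"][0]["root"]["input"], path_dictionary)`. Mutating in place keeps the sketch short; a version that returns a rewritten copy may be safer if the original plan is still needed.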