Closed ArijitSinghEDA closed 1 week ago
Hi @ArijitSinghEDA could you provide an example?
Suppose we have a table table1
in namespace ns1
with columns employee_id
, employee_name
, department_id
, and we have a table table2
in namespace ns2
with columns department_id
, department_name
. What I am thinking of is something like this:
from pyiceberg.catalog import load_catalog
catalog = load_catalog("example")
tables = catalog.load_tables(("ns1", "table1"), ("ns2", "table2"))
con = tables.scan().to_duck_db()
df = con.execute(
""""""select * from ns1.table1 join n2.table2 using (department_id)""""""
).fetchdf()
print(df)
@ArijitSinghEDA you can use the iceberg extension in duckdb
It support multiple tables, but you'll need to follow the tips here https://github.com/duckdb/duckdb_iceberg/issues/44
Feature Request / Improvement
A way to load and scan multiple tables (across different namespaces), and use DuckDB query to perform select and join operations (for conditional join)