In DuckDB, it is possible to control which columns are selected or generate new columns in a view:
SELECT col1 as renamed_col, 1 as fixed FROM read_parquet('path/to/file.parquet');
We now expose this to the user so they can do the same when creating foreign tables. This is done via the new select option. (I couldn't name it columns because the CSV reader already has a columns option).
CREATE FOREIGN TABLE trips ()
SERVER parquet_server
OPTIONS (files 's3://paradedb-benchmarks/yellow_tripdata_2024-01.parquet', select 'vendorid as vendor_id, 2024 as year, 1 as month');
SELECT * FROM trips LIMIT 1;
vendor_id | year | month
-----------+------+-------
2 | 2024 | 1
(1 row)
Why
This makes it possible to add generated columns to Parquet files that can be used as partition columns (will be documented in a future PR). See #56.
Ticket(s) Closed
What
In DuckDB, it is possible to control which columns are selected or generate new columns in a view:
We now expose this to the user so they can do the same when creating foreign tables. This is done via the new
select
option. (I couldn't name itcolumns
because the CSV reader already has acolumn
s option).Why
This makes it possible to add generated columns to Parquet files that can be used as partition columns (will be documented in a future PR). See #56.
How
Tests
See added test.