duckdb / sqlite_scanner

DuckDB extension to read and write to SQLite databases
MIT License
189 stars 19 forks source link

Column names are not preserved when doing sqlite_scan #57

Closed nitinprakash96 closed 8 months ago

nitinprakash96 commented 11 months ago

What happens?

This is a speacial case when data is imported with casting enbaled. Without casting in CREATE statement, the table names are preserved just fine.

To Reproduce

Create a dummy sqlite file data.sqlite3 and create a table as follows:

CREATE TABLE foo (
    user_id int8 NULL,
    measurement_date date NULL,
    days int4 NULL
)
D ATTACH './data.sqlite3' AS foo_sqlite (TYPE SQLITE);
D CREATE TABLE foo AS 
    SELECT
      user_id::BIGINT,
      measurement_date::DATE,
      COLUMNS(* EXCLUDE (user_id, measurement_date))::INTEGER 
    FROM foo_sqlite.foo;
D pragma table_info('foo');
┌───────┬──────────────────────────────────────────────────────────┬─────────┬─────────┬────────────┬─────────┐
│  cid  │                           name                           │  type   │ notnull │ dflt_value │   pk    │
│ int32 │                         varchar                          │ varchar │ boolean │  varchar   │ boolean │
├───────┼──────────────────────────────────────────────────────────┼─────────┼─────────┼────────────┼─────────┤
│     0 │ CAST(user_id AS BIGINT)                                  │ BIGINT  │ false   │            │ false   │
│     1 │ CAST(measurement_date AS DATE)                           │ DATE    │ false   │            │ false   │
│     2 │ CAST(sqlite_scan."days" AS INTEGER)                      │ INTEGER │ false   │            │ false   │
├───────┴──────────────────────────────────────────────────────────┴─────────┴─────────┴────────────┴─────────┤

D select measurement_date from foo;
Error: Binder Error: Referenced column "measurement_date" not found in FROM clause!
Candidate bindings: "foo.CAST(measurement_date AS DATE)"
LINE 1: select measurement_date from foo;

OS:

MacOS

SQLite Version:

3.37.0

DuckDB Version:

0.8.1

DuckDB Client:

CLI

Full Name:

Nitin Prakash

Affiliation:

Holmusk

Have you tried this on the latest master branch?

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

Mytherin commented 8 months ago

Thanks for the report!

This is intended behavior as part of regular DuckDB and not related to the SQLite scanner. If no alias is provided, the full expression is used as a column name. For example:

CREATE TABLE integers(i INT);
INSERT INTO integers VALUES (42);
SELECT i::BIGINT FROM integers;
┌───────────────────┐
│ CAST(i AS BIGINT) │
│       int64       │
├───────────────────┤
│                42 │
└───────────────────┘

As you can see, the cast is included as part of the column name, as would any other expression. The solution is to add an explicit alias:

SELECT i::BIGINT AS i FROM integers;
┌───────┐
│   i   │
│ int64 │
├───────┤
│    42 │
└───────┘