duckdb / pg_duckdb

DuckDB-powered Postgres for high performance apps & analytics.
MIT License
1.59k stars 55 forks source link

Case insensitivity mismatch between duckdb and postgres can cause issues #116

Open JelteF opened 3 months ago

JelteF commented 3 months ago

DuckDB considers identifiers with different casing the same even if they are quoted.

The most minimal example of this is the following schema:

CREATE TABLE a(b int);
CREATE TABLE "A"(c int);
INSERT INTO a VALUES(1);

When using postgres execution this correctly outputs:

> SELECT * FROM a, "A" as aa;
 b │ c
───┼───
(0 rows)

But when enabling duckdb execution both a and "A" are interpreted as a. Resulting in the following incorrect result:

> SELECT * FROM a, "A" as aa;
 b │ b
───┼───
 1 │ 1
(1 row)

This happens even when I set preserver_identifier_case to true on the DuckDB connection.

wuputah commented 3 months ago

Good find. Some related discussion here: https://github.com/duckdb/pg_duckdb/issues/56#issuecomment-2270785294, and related to #43 -- though these are not specific to case insensitivity.

Alzavio commented 2 months ago

This may be related, but in either of the two below, the field name cases aren't maintained in the new postgres table. Using column names in double quotes instead of a wildcard also does not resolve the issue

CREATE TABLE postgres_staging.public.{table_name} AS SELECT * FROM postgres_prod.public.{table_name}

duckdb.sql(f'CREATE TABLE {table_name} AS SELECT * FROM postgres_prod.public.{table_name}')
duckdb.sql(f'CREATE TABLE postgres_staging.public.{table_name} AS SELECT * FROM {table_name}')
pantonis commented 1 week ago

This shouldn't be in the long term milestone. but an important bug to fix.

Y-- commented 1 week ago

@pantonis we'll try to address this asap.