duckdb / duckdb_spatial

MIT License
492 stars 41 forks source link

Column names with spaces fail in WHERE clause [.xlsx] #240

Closed b-schmeling closed 10 months ago

b-schmeling commented 10 months ago

If you load a .xlsx where one of the column names has spaces, you can use it in a SELECT clause but not in a WHERE clause.

Repro: Use this excel file: samsung_sales.xlsx

Run:

INSTALL spatial; LOAD spatial; SELECT * FROM st_read('samsung_sales.xlsx', layer='Orders') WHERE \"Unit Selling Price\" > 14 LIMIT 10;

or this script if preferred: excel_col_space.py.zip

Using DuckDB 0.9.2. Also did not work with installing nightly spatial.

Error:

Invalid Input Error: Attempting to execute an unsuccessful or closed pending query result Error: IO Error: GDAL Error (1): SQL Expression Parsing Error: syntax error, unexpected identifier, expecting end of string. Occurred around : Unit Selling Price>14 AND Unit Selling Price ^ [SQL: SELECT * FROM st_read('samsung_sales.xlsx', layer='Orders') WHERE "Unit Selling Price" > 14 LIMIT 10;]

aborruso commented 10 months ago

I confirm it :(

Maxxen commented 10 months ago

Hi! Thanks for reporting this issue. It seems like we need to add additional quoutes/escapes when pushing down filters into the GDAL scan. Will look into fixing it!