duckdb / duckdb

DuckDB is an analytical in-process SQL database management system
http://www.duckdb.org
MIT License
23.56k stars 1.87k forks source link

Cannot alias current_timestamp to "current_timestamp" #14275

Open jonathanswenson opened 1 week ago

jonathanswenson commented 1 week ago

What happens?

Little strange here, but noticed this broke with the 1.1.0 -> 1.1.1 change, using the current_timestamp keyword aliased to "current_timestamp" throws an error

on v1.1.1 af39bd0dcf

D select current_timestamp as "current_timestamp";
Binder Error: Column "current_timestamp" referenced that exists in the SELECT clause - but this column cannot be referenced before it is defined

on v1.1.0 fa5c2fe15f

D select current_timestamp as "current_timestamp";
┌────────────────────────────┐
│     current_timestamp      │
│  timestamp with time zone  │
├────────────────────────────┤
│ 2024-10-08 09:07:08.335-07 │
└────────────────────────────┘

Notably postgres allows this:

test=# select current_timestamp as "current_timestamp";
       current_timestamp       
-------------------------------
 2024-10-08 16:04:24.472592+00
(1 row)

To Reproduce

select current_timestamp as "current_timestamp";

OS:

MacOS 14.5 aarch64

DuckDB Version:

1.1.1

DuckDB Client:

cli

Hardware:

No response

Full Name:

Jonathan Swenson

Affiliation:

Omni

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have tested with a stable release

Did you include all relevant data sets for reproducing the issue?

Not applicable - the reproduction does not require a data set

Did you include all code required to reproduce the issue?

Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?

jonathanswenson commented 1 week ago

Somewhat reminiscent of https://github.com/duckdb/duckdb/issues/12646

zmajeed commented 1 week ago

This bug affects all SQL value functions like current_timestamp including current_time, current_date, current_user - when the alias is the function name itself. It's caused by https://github.com/duckdb/duckdb/pull/13925

A column is bound in ExpressionBinder::BindExpression - https://github.com/duckdb/duckdb/blob/main/src/planner/binder/expression/bind_columnref_expression.cpp

432 // column was not found - check if it is a SQL value function
433 auto value_function = GetSQLValueFunction(col_ref_p.GetColumnName());
434 if (value_function) {
435   return BindExpression(value_function, depth);

But GetSQLValueFunction - https://github.com/duckdb/duckdb/blob/main/src/planner/expression_binder/select_binder.cpp - only returns a value function if the name is not an alias

With no value function to bind to the column, control goes to SelectBinder::BindColumnRef - https://github.com/duckdb/duckdb/blob/main/src/planner/expression_binder/select_binder.cpp - where binding to the alias is attempted but fails because the index of the column for the alias is past any valid boundcolumn index - because the column hasn't been bound yet

26 // binding failed
27 // check in the alias map
28 auto &colref = (expr_ptr.get())->Cast<ColumnRefExpression>();
29 if (!colref.IsQualified()) {
30   auto &bind_state = node.bind_state;
31   auto alias_entry = node.bind_state.alias_map.find(colref.column_names[0]);
32   if (alias_entry != node.bind_state.alias_map.end()) {
33     // found entry!
34     auto index = alias_entry->second;
35     if (index >= node.bound_column_count) {
36       throw BinderException("Column \"%s\" referenced that exists in the SELECT clause - but this column "
37         "cannot be referenced before it is defined",
38         colref.column_names[0]);
39     }
zmajeed commented 1 week ago

It seems GetSQLValueFunction shouldn't fail - this could be done by skipping the alias check if the actual column name - not the passed string - is aliased to itself

Something like GetSQLValueFunction(const ColumnRefExpression& column) could do this

I can open a PR if this looks reasonable