Open magosb opened 7 months ago
So are different cases actually different column identifiers in tsql?
In Snowflake quoted aliases are different column identifiers, however lint throw AL09
, for ex. "ORDER" as "order"
I also have this issue with BigQuery/DBT. The documentation mentions that it should not trigger on case sensitivity changes.
Query example:
Employee_ID AS employee_id
turns into Employee_ID
Before comparing it uppers both the original identifier and the alias. column_identifier.raw_upper == alias_identifier.raw_upper Text mentioning that this should not be the case: Note that this rule does allow self-alias to change case sensitivity.
The fix could be as simple as removing the upper statements, but I have no experience contributing here so not sure if I should open an PR.
Hi @PieterBos94 , for your example:
Query example: Employee_ID AS employee_id turns into Employee_ID
it is the expected behaviour to convert it to Employee_ID
. In BigQuery you can query the column Employee_ID
both as Employee_ID
and employee_id
in your select statements.
Query 1:
SELECT Employee_ID FROM table
Query 2:
SELECT employee_id FROM table
If your BigQuery table column is Employee_ID
, then both Query 1 and Query 2 work to read the column.
If you would like to re-case your column, it can be done using the following:
SELECT Employee_ID AS `employee_id`
FROM table
Hi @dkushnir-picket , could you please share an example where this fails in Snowflake?
The rule is expected to take care of renaming the column name from Case-Sensitive to Case-Insensitive and vice versa.
The rule passes the following query (test_alias_query.sql
) :
SELECT "ORDER" AS "order"
FROM
Table1
sqlfluff lint test_alias_query.sql --dialect snowflake
I ran the same and get
L: 1 | P: 8 | AL09 | Column should not be self-aliased.
Here is my sqlflaff config
[sqlfluff]
dialect = snowflake
templater = dbt
max_line_length = 120
large_file_skip_byte_limit = 0
exclude_rules = ST06, RF04
[sqlfluff:rules:capitalisation.keywords]
capitalisation_policy = lower
[sqlfluff:rules:capitalisation.identifiers]
extended_capitalisation_policy = lower
[sqlfluff:rules:capitalisation.functions]
extended_capitalisation_policy = lower
[sqlfluff:rules:capitalisation.literals]
capitalisation_policy = lower
[sqlfluff:rules:capitalisation.types]
extended_capitalisation_policy = lower
Hi @magosb, the Observed Behaviour of
SELECT
id AS ID,
date AS Date
FROM [...]
being converted to
SELECT
id,
date
FROM [...]
is expected since id
/ID
and date
/ Date
, etc. refer to the same column in the table while querying. Therefore, as per the rule, your query is aliasing the column to itself.
In order to have your desired output, you can directly write the query as:
SELECT
ID,
Date,
Description
FROM [...]
or in the following way using quoted identifiers:
SELECT
id AS "ID",
date AS "Date",
description AS "Description"
FROM [...]
Please share if it is different in TSQL
Thanks @dkushnir-picket It seems I was checking with some old dbt templater.
This is indeed an issue.
SELECT "ORDER" AS "order"
FROM
Table1
The rule doesn't handle the case when both the column and alias are quoted.
@magosb (and any other TSQL users here), would you be able to confirm that in your case your SQL Sever instance has a case-sensitive collation applied for identifiers? It appears that in the docs (https://learn.microsoft.com/en-us/sql/relational-databases/databases/database-identifiers?view=sql-server-ver16) this is something that is configurable for TSQL, and I'm not sure what the default is.
If your default identifier collation is case sensitive, I can totally understand that your example query is fine as-is:
SELECT
id AS ID,
date AS Date,
description AS Description,
FROM my_table
What I'm particularly curious about is if we have any TSQL users here that have a case-insensitive collation configured, whether the same query is still useful to change the case of the resulting column in the result set (i.e. ID
and id
refer to the same column, but that id as ID
is still useful so that the column appears as ID
in the result, rather than id
).
Thanks in advance, and sorry this is taking a while. I'm discovering that the treatment of identifier case in different SQL dialects can be vastly different.
Search before asking
What Happened
When using sql fluff fix it rewrites our SQL queries and removes "self aliasing" even though the casing is different and even though this is nescessary.
Expected Behaviour
I do not want "ID" to be interpreted as equal to "id" or "Id".
Observed Behaviour
The following is changed by sqlfluff fix:
into:
How to reproduce
Use the above SQL query and run sqlfluff lint or fix on it
Dialect
tsql
Version
Configuration
[tool.sqlfluff.core] large_file_skip_byte_limit = 100000 dialect = "tsql" ignore = "templating"
[tool.sqlfluff.indentation] tab_space_size = "4" indent_unit = "tab"
Are you willing to work on and submit a PR to address the issue?
Code of Conduct