StarRocks / starrocks

The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance for multi-dimensional analytics, real-time analytics, and ad-hoc queries. A Linux Foundation project.
https://starrocks.io
Apache License 2.0
8.91k stars 1.79k forks source link

Quoted Table Column Identifiers Not Allowing ! Character #38854

Closed inviscid closed 1 month ago

inviscid commented 10 months ago

According to the MySQL standard a quoted name should be able to contain most ASCII characters except NULL but including !. When I run the following command I get a syntax error:

CREATE TABLE IF NOT EXISTS `bangtest` (
    `Quoted!Name!With!Bangs` varchar(20) NULL, 
    `Quoted$Name$With$Dollars` varchar(20) NULL,
    `SimpleName` varchar(20) NULL 
) PROPERTIES (
    "storage_volume" = "my_volume",
    "datacache.enable" = "true"
);

StarRocks produces a syntax error:

SQL Error [1064] [42000]: Getting analyzing error. Detail message: Incorrect column name 'Quoted!Name!With!Bangs'.

If I remove the ! characters, the table creates as expected.

Steps to reproduce the behavior (Required)

Run the following CREATE TABLE command:

CREATE TABLE IF NOT EXISTS `bangtest` (
    `Quoted!Name!With!Bangs` varchar(20) NULL, 
    `Quoted$Name$With$Dollars` varchar(20) NULL,
    `SimpleName` varchar(20) NULL 
) PROPERTIES (
    "storage_volume" = "my_volume",
    "datacache.enable" = "true"
);

Expected behavior (Required)

Table should create with quoted column names that contain ! character

Real behavior (Required)

A syntax error is produced that indicates the ! character is not allowed in quoted column identifier names

StarRocks version (Required)

3.2.2-269e832

inviscid commented 10 months ago

A lower priority but I also noticed that we cannot create tables with - in the name even though it is a quoted table name. The columns create correctly with a - in the name.

This fails:

CREATE DATABASE `this-is-a-dash-test`;

But should be legit according to MySQL docs.

inviscid commented 8 months ago

Following up on this one to add more context.

According to the MySQL guide I think ! is allowed because it falls in the range. The Unicode value for ! is U+0021.

See reference here: https://dev.mysql.com/doc/refman/8.3/en/identifiers.html

Which specifies the following:

Permitted characters in quoted identifiers include the full Unicode Basic Multilingual Plane (BMP), except U+0000:

inviscid commented 8 months ago

it looks like the column name check specifically excludes the ! character. Based on the comments above the column name check function, wondering if that constraint still applies with all the recent changes in v3.3.

https://github.com/StarRocks/starrocks/blob/4feeec0b92939d25430048b77946226f78540940/fe/fe-core/src/main/java/com/starrocks/sql/analyzer/FeNameFormat.java#L39-L42

packy92 commented 8 months ago

Thanks, we will check it.

inviscid commented 8 months ago

@packy92 Not sure if this is related but thought I'd link here in case it is all part of the same logic: https://github.com/StarRocks/starrocks/issues/40292

packy92 commented 8 months ago

Due to some historical reasons, our delete operation requires that the column name of the predicate in the statement cannot contain some special symbols, such as <, >, =, !. We currently have no plans to relax this restriction. Regarding whether the database name can contain more special symbols, we need to do some test to verify.

inviscid commented 8 months ago

Thanks for checking on the feasibility.

github-actions[bot] commented 1 month ago

We have marked this issue as stale because it has been inactive for 6 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to StarRocks!