alibaba / druid

阿里云计算平台DataWorks(https://help.aliyun.com/document_detail/137663.html) 团队出品,为监控而生的数据库连接池
https://github.com/alibaba/druid/wiki
Apache License 2.0
27.95k stars 8.58k forks source link

1.2.13版本,当ODPS sql带有QUALIFY和注释中带有特殊字符的时候,SQLUtils解析报错 #5077

Open roy526 opened 1 year ago

roy526 commented 1 year ago

druid版本:1.2.13 SQL 1:

String sql="select qualify.api_name from t_api qualify where qualify.api_name is not null;";
final List<SQLStatement> sqlStatements = SQLUtils.parseStatements(sql, DbType.odps, SQLParserFeature.KeepComments);
for (SQLStatement sqlStatement : sqlStatements) {
    System.out.println(sqlStatement.toString());
}

错误信息:

com.alibaba.druid.sql.parser.ParserException: ERROR. pos 14, line 1, column 8, token QUALIFY
    at com.alibaba.druid.sql.parser.SQLExprParser.primary(SQLExprParser.java:1315)
    at com.alibaba.druid.sql.parser.SQLExprParser.expr(SQLExprParser.java:95)
    at com.alibaba.druid.sql.dialect.odps.parser.OdpsExprParser.parseSelectItem(OdpsExprParser.java:139)
    at com.alibaba.druid.sql.parser.SQLSelectParser.parseSelectList(SQLSelectParser.java:1052)
    at com.alibaba.druid.sql.dialect.odps.parser.OdpsSelectParser.query(OdpsSelectParser.java:152)

SQL2:

String sql="--asdasdas \u001Aasdad\n" +
                "select api_name\n" +
                "from t_api \n" +
                "where api_name is not null;";
final List<SQLStatement> sqlStatements = SQLUtils.parseStatements(sql, DbType.odps, SQLParserFeature.KeepComments);
for (SQLStatement sqlStatement : sqlStatements) {
    System.out.println(sqlStatement.toString());
}

错误信息:

com.alibaba.druid.sql.parser.ParserException: not supported.pos 17, line 1, column 1, token IDENTIFIER asdad

    at com.alibaba.druid.sql.parser.SQLStatementParser.parseStatementList(SQLStatementParser.java:615)
    at com.alibaba.druid.sql.SQLUtils.parseStatements(SQLUtils.java:596)

哪位大佬帮忙看一下。

roy526 commented 1 year ago

另外同样操作下,还存在注释丢失或者顺序改变的问题 解析之前

-- 1 开头注释 ok
insert into table tmp.abc
-- 1.1 test ok
select -- 2 select注释
-- 3 空白注释
a,b,-- 4 空白注释 ok
c, -- 5 空白注释 ok
d,e,f -- 6 空白注释
-- 7 空白注释
from tmp t1 -- 8 空白注释 ok
-- 9 空白注释 ok
group by -- 10 空白注释 ok
a,b,c,
d,-- 11 空白注释 ok
-- 12 空白注释 ok
e,f
-- 13 空白注释 ok
grouping sets( -- 14 空白注释
-- 15 空白注释
(a,b,c),(b,c,d),(d,e,f))
-- 16 空白注释 ok

解析之后

INSERT INTO TABLE tmp.abc
-- 1.1 test ok
SELECT a -- 2 select注释
    -- 3 空白注释
    , b -- 4 空白注释 ok
    , c -- 5 空白注释 ok
    , d, e, f -- 6 空白注释
    -- 7 空白注释
FROM tmp t1 -- 8 空白注释 ok
-- 9 空白注释 ok
GROUP BY a, 
    b, 
    c, 
    d, 
    e, 
    f, 
    GROUPING SETS ((a, b, c), (b, c, d), (d, e, f))
roy526 commented 1 year ago

INSERT 前的注释会全部丢失

--odps sql 
--********************************************************************--
--author:admin
--create time:2020-10-20 20:01:54
-- 结算单总额
--********************************************************************--
INSERT INTO xxxxx.aaaaaaa PARTITION(pt='${bizdate}')
roy526 commented 1 year ago

https://github.com/alibaba/druid/issues/4889

wenshao commented 1 year ago

你是用在什么场景啊?

roy526 commented 1 year ago

你是用在什么场景啊? 脚本校验,解析出现的问题。

roy526 commented 1 year ago

@wenshao 大佬目前遇到一个比较严重的bug ,带有注释解析后,会把正常语句 拼到注释后边,详见解析后的 --注释3 目前发现的是case when语句,其他语句暂时未知,这种问题执行会报错,如果将换行的部分计算公式 拼到注释后边整个sql的逻辑就会出现错误吧,麻烦看一下

select case when t1.type='JP' and t1.value>0 
            then (case when t1.category_id =29   --注释1
                       then ((0.95*t1.value)*1.04)+0.3
                       when t1.category_id =48 --注释2
                       then ((0.85*t1.value)*1.04)+0.3
                       when t1.category_id in (2,92)   --注释3
                       then ((0.75*t1.value)*1.04)+0.3
                       else ((0.65*t1.value)*1.04)+0.3
                    end  ) end from t_demo t1

通过 SQLUtils.parseStatements(sql, DbType.odps, true)

SELECT CASE 
        WHEN t1.type = 'JP'
            AND t1.value > 0
        THEN 
            CASE 
                WHEN t1.category_id = 29 THEN 0.95 * t1.value * 1.04 + 0.3
                WHEN t1.category_id = 48 THEN 0.85 * t1.value * 1.04 + 0.3
                WHEN t1.category_id IN (2, 92) -- 注释3 THEN 0.75 * t1.value * 1.04 + 0.3
                ELSE 0.65 * t1.value * 1.04 + 0.3
            END
    END
FROM t_demo t1
wenshao commented 1 year ago

https://oss.sonatype.org/content/repositories/snapshots/com/alibaba/druid/1.2.16-SNAPSHOT/ 问题已经修复,请用1.2.16-SNAPSHOT验证

roy526 commented 1 year ago

@wenshao sql语句没问题,但是注释还有点问题,有时候会缺失/顺序混乱/重复注释(不仅限于下边的示例) 如:

select case when t1.type='JP' and t1.value>0 
            then (case when t1.category_id =29   --注释1
                       then ((0.95*t1.value)*1.04)+0.3
                       when t1.category_id =48 --注释2
                       then ((0.85*t1.value)*1.04)+0.3
                       when t1.category_id in (2,92)   -- 注释 3
                       and t1.ce in (2,343)
                       and t1.ced in (2,343) -- 注释 4
                       then ((0.75*t1.value)*1.04)+0.3
                       else ((0.65*t1.value)*1.04)+0.3
                    end  ) end from t_demo t1

解析之后, 注释1、注释2丢失, 注释3和注释4重复。

SELECT CASE 
        WHEN t1.type = 'JP'
            AND t1.value > 0
        THEN 
            CASE 
                WHEN t1.category_id = 29 THEN 0.95 * t1.value * 1.04 + 0.3
                WHEN t1.category_id = 48 THEN 0.85 * t1.value * 1.04 + 0.3
                WHEN t1.category_id IN (2, 92) -- 注释 3 -- 注释 3
                    AND t1.ce IN (2, 343)
                    AND t1.ced IN (2, 343) -- 注释 4 -- 注释 4
                THEN 0.75 * t1.value * 1.04 + 0.3
                ELSE 0.65 * t1.value * 1.04 + 0.3
            END
    END
FROM t_demo t1

另外还有一个问题,add table 语句 解析后 语法错误。

 add table sh.dim_activity_info as activity_info comment '活动信息' -f;

解析之后,缺少 as

ADD TABLE sh.dim_activity_info activity_info COMMENT '活动信息' -f;
roy526 commented 1 year ago

@wenshao 刚又发现一个bug,某一数仓的函数引用,解析后"."成为了":", 如下sc_1.json_array_parser,解析之后变成sc_1:json_array_parser:

SELECT  sc_1.json_array_parser(
                        detail
                        ,isdel
                        ,addtime
                        ,edittime
                    ) AS (detail,isdel,addtime,edittime)
            FROM    (
                        SELECT CAST (detail AS STRING) as detail
                                ,CAST (isdel AS STRING) as isdel
                                ,CAST (addtime AS STRING) as addtime
                                ,CAST (edittime AS STRING) as edittime
                        FROM    sc_2.demo
                        WHERE   pt = '${bizdate}'
                    ) t1

解析之后

SELECT sc_1:json_array_parser(detail, isdel, addtime, edittime)
    AS (detail, isdel, addtime, edittime)
FROM (
    SELECT CAST(detail AS STRING) AS detail, CAST(isdel AS STRING) AS isdel, CAST(addtime AS STRING) AS addtime, CAST(edittime AS STRING) AS edittime
    FROM sc_2.demo
    WHERE pt = '${bizdate}'
) t1
roy526 commented 1 year ago

@wenshao 1.2.16-SNAPSHOT 注释新发现一个bug,sql结尾的带注释时,【;】 被解析成注释了,sql 如下:

INSERT INTO TABLE sc_temp.temp_df_04
SELECT  a.field_1 AS field_1 -- 字段1              
        ,a.field_2 AS field_2 -- 字段2          
from sc_01.t_demo_a a         
WHERE   a.pt = '${bizdate}'
AND     a.biz_type_code IN (212) -- 条件1
AND     a.is_test <> 1 -- 条件2
AND     a.name <> '测试' --条件3 aaa
AND     TO_CHAR(a.create_datetime,'yyyymmdd') <= '${bizdate}' --条件4
;

解析之后

INSERT INTO TABLE sc_temp.temp_df_04
SELECT a.field_1 AS field_1 -- 字段1              
        , a.field_2 AS field_2 -- 字段2          
FROM sc_01.t_demo_a a
WHERE a.pt = '${bizdate}'
        AND a.biz_type_code IN (212) -- 条件1 -- 条件1
        AND a.is_test <> 1 -- 条件2
        AND a.name <> '测试' -- 条件3 aaa
        AND TO_CHAR(a.create_datetime, 'yyyymmdd') <= '${bizdate}' -- 条件4;
wenshao commented 1 year ago

https://github.com/alibaba/druid/releases/tag/1.2.16 请你用新版本,注释有些问题还没完全修复,后续会继续修

roy526 commented 1 year ago

https://github.com/alibaba/druid/releases/tag/1.2.16 请你用新版本,注释有些问题还没完全修复,后续会继续修

@wenshao 发现另外一个bug,group by [keys...] grouping set 语句 toString的时候, keys末尾多加一个分号,导致逻辑异常。sql 如下:

SELECT id,dsa,count(*) from du_sd where id=12 group by id,dsa,ad,ds grouping sets ((id,dsa),(dsa,ad,ds))

toString()后, group by 的 ds字段中多了个','

SELECT id, dsa, count(*)
FROM du_sd
WHERE id = 12
GROUP BY id, 
    dsa, 
    ad, 
    ds, 
    GROUPING SETS ((id, dsa), (dsa, ad, ds))

麻烦看下

roy526 commented 1 year ago

@wenshao 发现一个bug

create table if not exists du_sc.${bizdate}_mer_detail as 
select  distinct 
release_dt
req_no
from  du_sc.demo 
where pt = ${bizdate}
;

解析sql报错

com.alibaba.druid.sql.parser.ParserException: syntax error, error in :' as 
select  distinct 
release_dt
r', expect (, actual SELECT, pos 65, line 2, column 2, token SELECT

    at com.alibaba.druid.sql.parser.SQLParser.printError(SQLParser.java:838)
    at com.alibaba.druid.sql.parser.SQLParser.accept(SQLParser.java:846)
    at com.alibaba.druid.sql.dialect.odps.parser.OdpsCreateTableParser.parseCreateTable(OdpsCreateTableParser.java:126)
    at com.alibaba.druid.sql.parser.SQLCreateTableParser.parseCreateTable(SQLCreateTableParser.java:43)
    at com.alibaba.druid.sql.parser.SQLStatementParser.parseCreate(SQLStatementParser.java:3379)
    at com.alibaba.druid.sql.parser.SQLStatementParser.parseStatementList(SQLStatementParser.java:245)
    at com.alibaba.druid.sql.parser.SQLStatementParser.parseStatementList(SQLStatementParser.java:101)
    at com.alibaba.druid.sql.SQLUtils.parseSingleStatement(SQLUtils.java:622)