Canner / WrenAI

🚀 Open-source SQL AI Agent for Text-to-SQL. Make Text2SQL Easy! 🙌
https://getwren.ai/oss
GNU Affero General Public License v3.0

Support non-English column names (Ibis server error when preview data) #474

Closed shizidushu closed 2 months ago

shizidushu commented 2 months ago

Describe the bug: Some of my data (MySQL) has column names in Chinese. When I click "Preview Data", it says:

Ibis server error
Server error '500 Server Error' for url 'http://wren-engine:8080/v2/mdl/dry-plan' For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/500

I checked the logs of the wren-engine Docker container, and it reports this error:

2024-07-02T13:38:30.066Z        WARN    ForkJoinPool.commonPool-worker-2        io.wren.main.web.WrenExceptionMapper    Exception, type: class java.util.concurrent.CompletionException, message: io.trino.sql.parser.ParsingException: line 1:41: mismatched input '年'. Expecting: ')', ',', '.', 'AS', 'CROSS', 'EXCEPT', 'FETCH', 'FOR', 'FULL', 'GROUP', 'HAVING', 'INNER', 'INTERSECT', 'JOIN', 'LEFT', 'LIMIT', 'MATCH_RECOGNIZE', 'NATURAL', 'OFFSET', 'ORDER', 'RIGHT', 'TABLESAMPLE', 'UNION', 'WHERE', 'WINDOW', <EOF>, <identifier>
java.util.concurrent.CompletionException: io.trino.sql.parser.ParsingException: line 1:41: mismatched input '年'. Expecting: ')', ',', '.', 'AS', 'CROSS', 'EXCEPT', 'FETCH', 'FOR', 'FULL', 'GROUP', 'HAVING', 'INNER', 'INTERSECT', 'JOIN', 'LEFT', 'LIMIT', 'MATCH_RECOGNIZE', 'NATURAL', 'OFFSET', 'ORDER', 'RIGHT', 'TABLESAMPLE', 'UNION', 'WHERE', 'WINDOW', <EOF>, <identifier>
        at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:315)
        at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:320)
        at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1770)
        at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1760)
        at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:387)
        at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1312)
        at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1843)
        at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1808)
        at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)
Caused by: io.trino.sql.parser.ParsingException: line 1:41: mismatched input '年'. Expecting: ')', ',', '.', 'AS', 'CROSS', 'EXCEPT', 'FETCH', 'FOR', 'FULL', 'GROUP', 'HAVING', 'INNER', 'INTERSECT', 'JOIN', 'LEFT', 'LIMIT', 'MATCH_RECOGNIZE', 'NATURAL', 'OFFSET', 'ORDER', 'RIGHT', 'TABLESAMPLE', 'UNION', 'WHERE', 'WINDOW', <EOF>, <identifier>
        at io.trino.sql.parser.ErrorHandler.syntaxError(ErrorHandler.java:109)
        at org.antlr.v4.runtime.ProxyErrorListener.syntaxError(ProxyErrorListener.java:41)
        at org.antlr.v4.runtime.Parser.notifyErrorListeners(Parser.java:544)
        at org.antlr.v4.runtime.DefaultErrorStrategy.reportInputMismatch(DefaultErrorStrategy.java:327)
        at org.antlr.v4.runtime.DefaultErrorStrategy.reportError(DefaultErrorStrategy.java:139)
        at io.trino.sql.parser.SqlBaseParser.relationPrimary(SqlBaseParser.java:10044)
        at io.trino.sql.parser.SqlBaseParser.aliasedRelation(SqlBaseParser.java:9587)
        at io.trino.sql.parser.SqlBaseParser.patternRecognition(SqlBaseParser.java:8832)
        at io.trino.sql.parser.SqlBaseParser.sampledRelation(SqlBaseParser.java:8510)
        at io.trino.sql.parser.SqlBaseParser.relation(SqlBaseParser.java:8181)
        at io.trino.sql.parser.SqlBaseParser.querySpecification(SqlBaseParser.java:7097)
        at io.trino.sql.parser.SqlBaseParser.queryPrimary(SqlBaseParser.java:6829)
        at io.trino.sql.parser.SqlBaseParser.queryTerm(SqlBaseParser.java:6629)
        at io.trino.sql.parser.SqlBaseParser.queryNoWith(SqlBaseParser.java:6124)
        at io.trino.sql.parser.SqlBaseParser.query(SqlBaseParser.java:5297)
        at io.trino.sql.parser.SqlBaseParser.statement(SqlBaseParser.java:2670)
        at io.trino.sql.parser.SqlBaseParser.singleStatement(SqlBaseParser.java:305)
        at io.trino.sql.parser.SqlParser.invokeParser(SqlParser.java:145)
        at io.trino.sql.parser.SqlParser.createStatement(SqlParser.java:85)
        at io.wren.base.sqlrewrite.Utils.parseSql(Utils.java:63)
        at io.wren.base.sqlrewrite.WrenPlanner.rewrite(WrenPlanner.java:48)
        at io.wren.base.sqlrewrite.WrenPlanner.rewrite(WrenPlanner.java:43)
        at io.wren.main.PreviewService.lambda$dryPlan$1(PreviewService.java:102)
        at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1768)
        ... 6 more
Caused by: org.antlr.v4.runtime.InputMismatchException
        at io.trino.sql.parser.SqlParser$2.recoverInline(SqlParser.java:128)
        at org.antlr.v4.runtime.Parser.match(Parser.java:208)
        at io.trino.sql.parser.SqlBaseParser.relationPrimary(SqlBaseParser.java:9899)
        ... 24 more

Expected behavior: No error

wwwy3y3 commented 2 months ago

Hi @shizidushu, thanks for reporting this issue!

@goldmedal @grieve54706 could you guys take a look?

goldmedal commented 2 months ago

> @goldmedal @grieve54706 could you guys take a look?

Thanks @shizidushu @wwwy3y3. It's an issue with the Chinese column name. I have synced with @onlyjackfrost, and I guess he will have a PR to fix this soon.

onlyjackfrost commented 2 months ago

@shizidushu Thanks for reporting this! Can you reproduce this issue and then dump the wren-ui-service & wren-engine-service logs for us? We need to check the generated MDL for more information.

goldmedal commented 2 months ago

I think we also need the wren-engine-ibis:0.5.2 log. Thanks.

shizidushu commented 2 months ago

@onlyjackfrost @goldmedal Here is the log from docker compose up.

When submitting a query: wrenai-log-new.log

When previewing data: wren-error-when-preview-data.log

UPDATE: I use a custom LLM and embedding model. It is very likely related.

goldmedal commented 2 months ago

Hi @shizidushu, I think there are some issues with supporting non-English column names. Currently, Wren AI can't use Chinese column names. I'll modify the title and keep this issue open until we support it. Thanks
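For readers hitting the same parser error: the stack trace shows Trino's SQL parser rejecting the bare character '年', which is consistent with a non-ASCII name reaching the parser unquoted. The thread does not state the actual root cause or how 0.7.0 fixed it, so the following is only a minimal sketch of the general technique of double-quoting identifiers. The helper names and the table and column names are hypothetical, and the regular expression reflects my understanding that Trino's unquoted identifiers are limited to ASCII letters, digits, and underscores.

```python
import re

# Assumption: an unquoted identifier is an ASCII letter or underscore followed
# by ASCII letters, digits, or underscores. Reserved words are not checked here.
_PLAIN_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")


def needs_quoting(name: str) -> bool:
    """Return True if the name cannot be written as a bare identifier."""
    return _PLAIN_IDENTIFIER.match(name) is None


def quote_identifier(name: str) -> str:
    """Wrap a name in double quotes, doubling any embedded double quote."""
    return '"' + name.replace('"', '""') + '"'


def build_preview_sql(table: str, columns: list[str], limit: int = 10) -> str:
    """Build a preview query that quotes every identifier unconditionally."""
    cols = ", ".join(quote_identifier(c) for c in columns)
    return f"SELECT {cols} FROM {quote_identifier(table)} LIMIT {int(limit)}"


if __name__ == "__main__":
    # Hypothetical names; the issue only reveals the character '年'.
    print(needs_quoting("年份"))
    print(build_preview_sql("销售", ["年份", "amount"], 5))
```

Quoting every identifier, rather than only those that fail the check, avoids having to track reserved words, at the cost of making names case-sensitive in engines that fold unquoted identifiers.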

cyyeh commented 2 months ago

@shizidushu we've found the root cause and will ship the fix in the next release. I'll close this issue after the new release has arrived.

cyyeh commented 2 months ago

@shizidushu hi, the new release is here: https://github.com/Canner/WrenAI/releases/tag/0.7.0. You're welcome to try it out!