trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
https://trino.io
Apache License 2.0
9.88k stars 2.86k forks source link

codepoint function does not work on varchar + docs don't mention ascii #4674

Open tooptoop4 opened 3 years ago

tooptoop4 commented 3 years ago

select substr(title,1,1) ,codepoint(cast(substr(title,1,1) as varchar(1)) --,codepoint(substr(title,1,1)) --DOES NOT WORK from ( select 'abc' title
)

If i uncomment that 3rd expression (ie codepoint(substr(title,1,1)) ) it gives below error:

io.prestosql.spi.PrestoException: line 3:2: Unexpected parameters (varchar) for function codepoint. Expected: codepoint(varchar(1)) 
    at io.prestosql.sql.analyzer.ExpressionAnalyzer$Visitor.visitFunctionCall(ExpressionAnalyzer.java:946)
    at io.prestosql.sql.analyzer.ExpressionAnalyzer$Visitor.visitFunctionCall(ExpressionAnalyzer.java:333)
    at io.prestosql.sql.tree.FunctionCall.accept(FunctionCall.java:110)
    at io.prestosql.sql.tree.StackableAstVisitor.process(StackableAstVisitor.java:27)
    at io.prestosql.sql.analyzer.ExpressionAnalyzer$Visitor.process(ExpressionAnalyzer.java:356)
    at io.prestosql.sql.analyzer.ExpressionAnalyzer.analyze(ExpressionAnalyzer.java:294)
    at io.prestosql.sql.analyzer.ExpressionAnalyzer.analyzeExpression(ExpressionAnalyzer.java:1644)
    at io.prestosql.sql.analyzer.StatementAnalyzer$Visitor.analyzeExpression(StatementAnalyzer.java:2455)
    at io.prestosql.sql.analyzer.StatementAnalyzer$Visitor.analyzeSelectSingleColumn(StatementAnalyzer.java:2297)
    at io.prestosql.sql.analyzer.StatementAnalyzer$Visitor.analyzeSelect(StatementAnalyzer.java:2121)
    at io.prestosql.sql.analyzer.StatementAnalyzer$Visitor.visitQuerySpecification(StatementAnalyzer.java:1288)
    at io.prestosql.sql.analyzer.StatementAnalyzer$Visitor.visitQuerySpecification(StatementAnalyzer.java:306)
    at io.prestosql.sql.tree.QuerySpecification.accept(QuerySpecification.java:144)
    at io.prestosql.sql.tree.AstVisitor.process(AstVisitor.java:27)
    at io.prestosql.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:323)
    at io.prestosql.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:333)
    at io.prestosql.sql.analyzer.StatementAnalyzer$Visitor.visitQuery(StatementAnalyzer.java:918)
    at io.prestosql.sql.analyzer.StatementAnalyzer$Visitor.visitQuery(StatementAnalyzer.java:306)
    at io.prestosql.sql.tree.Query.accept(Query.java:107)
    at io.prestosql.sql.tree.AstVisitor.process(AstVisitor.java:27)
    at io.prestosql.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:323)
    at io.prestosql.sql.analyzer.StatementAnalyzer.analyze(StatementAnalyzer.java:292)
    at io.prestosql.sql.analyzer.Analyzer.analyze(Analyzer.java:83)
    at io.prestosql.sql.analyzer.Analyzer.analyze(Analyzer.java:75)
    at io.prestosql.execution.SqlQueryExecution.analyze(SqlQueryExecution.java:237)
    at io.prestosql.execution.SqlQueryExecution.<init>(SqlQueryExecution.java:183)
    at io.prestosql.execution.SqlQueryExecution$SqlQueryExecutionFactory.createQueryExecution(SqlQueryExecution.java:737)
    at io.prestosql.dispatcher.LocalDispatchQueryFactory.lambda$createDispatchQuery$0(LocalDispatchQueryFactory.java:121)
    at io.prestosql.$gen.Presto_1c5b75e_dirty____20200802_205513_2.call(Unknown Source)
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
    at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
Caused by: io.prestosql.spi.PrestoException: Unexpected parameters (varchar) for function codepoint. Expected: codepoint(varchar(1)) 
    at io.prestosql.metadata.FunctionResolver.resolveFunction(FunctionResolver.java:136)
    at io.prestosql.metadata.MetadataManager.lambda$resolveFunction$25(MetadataManager.java:1501)
    at java.base/java.util.Optional.orElseGet(Unknown Source)
    at io.prestosql.metadata.MetadataManager.resolveFunction(MetadataManager.java:1501)
    at io.prestosql.sql.analyzer.ExpressionAnalyzer$Visitor.visitFunctionCall(ExpressionAnalyzer.java:934)
    ... 34 more

Docs have

codepoint(string) → integer# Returns the Unicode code point of the only character of string.

so i would expect it to work

also as an aside, maybe the docs should mention something like 'ascii character representation'? i searched the docs for 'ascii' and found nothing, so took a while to find out this was the function i needed (especially since Oracle / postgres / mysql / sqlserver names the function as ASCII() https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions007.htm https://www.postgresql.org/docs/8.2/functions-string.html https://dev.mysql.com/doc/refman/8.0/en/string-functions.html#function_ascii https://docs.microsoft.com/en-us/sql/t-sql/functions/ascii-transact-sql?view=sql-server-ver15)

findepi commented 3 years ago

@martint can you please define the desired behavior?

shandeep00 commented 3 years ago

@findepi any reason on why we are having varchar(1) as parameter instead of varchar?

martint commented 3 years ago

varchar and varchar(n) are, effectively, two different types, with an implicit coercion from varchar(n) -> varchar. I think it makes sense to allow this function to work on both of them, but make it fail at runtime if varchar contains more than one character.

shandeep00 commented 3 years ago

Thanks @martint for the reply, if it contains more than 1 character we can return unicode for the 1st or leftmost character like other DB's are supported, please correct me if I am missing something here.

martint commented 3 years ago

In that case, it should fail with an error to avoid surprising (incorrect) results due to ignoring input. If the user wants to strip the first character they should do it explicitly with substr.