apache / gravitino

World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
https://gravitino.apache.org
Apache License 2.0

[Bug report] Load Postgresql Array data in Trino failed #5660

Open danhuawang opened 4 days ago

danhuawang commented 4 days ago

Version

main branch

Describe what's wrong

Querying a PostgreSQL table that contains an array column through Trino fails with an "Unknown field" error, even though the Gravitino documentation says the Array data type is supported in the PG catalog.

trino:public> select * from tb09;
Query 20241123_083504_01474_zc3ii failed: Unknown field tb09.score:array(integer)
java.lang.IllegalArgumentException: Unknown field tb09.score:array(integer)
        at com.google.common.base.Preconditions.checkArgument(Preconditions.java:218)
        at io.trino.sql.analyzer.StatementAnalyzer$Visitor.analyzeTableOutputFields(StatementAnalyzer.java:2657)
        at io.trino.sql.analyzer.StatementAnalyzer$Visitor.visitTable(StatementAnalyzer.java:2295)
        at io.trino.sql.analyzer.StatementAnalyzer$Visitor.visitTable(StatementAnalyzer.java:521)
        at io.trino.sql.tree.Table.accept(Table.java:60)
        at io.trino.sql.tree.AstVisitor.process(AstVisitor.java:27)
        at io.trino.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:540)
        at io.trino.sql.analyzer.StatementAnalyzer$Visitor.analyzeFrom(StatementAnalyzer.java:4869)
        at io.trino.sql.analyzer.StatementAnalyzer$Visitor.visitQuerySpecification(StatementAnalyzer.java:3062)
        at io.trino.sql.analyzer.StatementAnalyzer$Visitor.visitQuerySpecification(StatementAnalyzer.java:521)
        at io.trino.sql.tree.QuerySpecification.accept(QuerySpecification.java:155)
        at io.trino.sql.tree.AstVisitor.process(AstVisitor.java:27)
        at io.trino.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:540)
        at io.trino.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:548)
        at io.trino.sql.analyzer.StatementAnalyzer$Visitor.visitQuery(StatementAnalyzer.java:1558)
        at io.trino.sql.analyzer.StatementAnalyzer$Visitor.visitQuery(StatementAnalyzer.java:521)
        at io.trino.sql.tree.Query.accept(Query.java:118)
        at io.trino.sql.tree.AstVisitor.process(AstVisitor.java:27)
        at io.trino.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:540)
        at io.trino.sql.analyzer.StatementAnalyzer.analyze(StatementAnalyzer.java:500)
        at io.trino.sql.analyzer.StatementAnalyzer.analyze(StatementAnalyzer.java:489)
        at io.trino.sql.analyzer.Analyzer.analyze(Analyzer.java:97)
        at io.trino.sql.analyzer.Analyzer.analyze(Analyzer.java:86)
        at io.trino.execution.SqlQueryExecution.analyze(SqlQueryExecution.java:274)
        at io.trino.execution.SqlQueryExecution.<init>(SqlQueryExecution.java:209)
        at io.trino.execution.SqlQueryExecution$SqlQueryExecutionFactory.createQueryExecution(SqlQueryExecution.java:850)
        at io.trino.dispatcher.LocalDispatchQueryFactory.lambda$createDispatchQuery$0(LocalDispatchQueryFactory.java:153)
        at io.trino.$gen.Trino_435____20241123_043415_2.call(Unknown Source)
        at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131)
        at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:76)
        at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)

Error message and/or stacktrace

see above

How to reproduce

In the PostgreSQL data source, I have the following table and data:

 create table tb09(id int,score INTEGER[]);

 insert into tb09 (id, score) values ('1',ARRAY[1, 2, 3, 4, 5]);

  select * from tb09;
 id |    score    
----+-------------
  1 | {1,2,3,4,5}

Then query this data in Trino:

call gravitino.system.create_catalog(
    'gt_pg1',
    'jdbc-postgresql',
    map(
        array['jdbc-url', 'jdbc-user', 'jdbc-password', 'jdbc-database', 'jdbc-driver', 'trino.bypass.join-pushdown.strategy', 'trino.bypass.postgresql.array-mapping'],
        array['jdbc:postgresql://10.20.31.19:5432/db', 'postgres', 'postgres', 'db', 'org.postgresql.Driver', 'EAGER', 'AS_ARRAY']
    )
);

use gt_pg1.public;

 select * from tb09;
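
A possible way to narrow this down, sketched here and not yet run against this setup: compare the column metadata Trino reports for the table with a query that skips the array column. The statements below are standard Trino SQL and reuse the catalog, schema, and table names from the steps above. Whether the second query succeeds is an open question; if it does, the failure is probably limited to how the `score` column type is mapped.

```sql
-- Show the column names and types Trino sees for the table
SHOW COLUMNS FROM gt_pg1.public.tb09;

-- Project only the non-array column to check whether the error is tied to score
SELECT id FROM gt_pg1.public.tb09;
```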

Additional context

No response

danhuawang commented 4 days ago

@diqiu50 Can you help check this issue? Thanks.