apache / paimon

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
https://paimon.apache.org/
Apache License 2.0

[Bug] Reading directly with the Java API returns an incorrect result #3679

Open wuwenchi opened 2 days ago

wuwenchi commented 2 days ago

Search before asking

Paimon version

0.8.1

Compute Engine

Flink 1.17.2 and JavaAPI

Minimal reproduce step

1. Use the Flink SQL client to create the table and insert a row:

    CREATE CATALOG paimon_minio WITH (
        'type' = 'paimon',
        'warehouse' = 'file:///mnt/datadisk1/wuwenchi/tmp/warehouse'
    );

    CREATE TABLE paimon_minio.db1.tb1 (
        s string,
        ts1 TIMESTAMP(6),
        ts2 TIMESTAMP(6) WITH LOCAL TIME ZONE
    ) WITH (
        'write-only' = 'true',
        'file.format' = 'parquet'
    );

    insert into paimon_minio.db1.tb1 values (
        'flink:2024-01-02 10:04:05.123456789',
        timestamp '2024-01-02 10:04:05.123456789',
        timestamp '2024-01-02 10:04:05.123456789'
    );

2. Then read the table with the Java API, following https://paimon.apache.org/docs/0.8/program-api/java-api/#batch-read

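```java
import org.apache.paimon.catalog.Catalog;
import org.apache.paimon.catalog.Catalog.TableNotExistException;
import org.apache.paimon.catalog.CatalogContext;
import org.apache.paimon.catalog.CatalogFactory;
import org.apache.paimon.catalog.Identifier;
import org.apache.paimon.data.InternalRow;
import org.apache.paimon.data.Timestamp;
import org.apache.paimon.options.Options;
import org.apache.paimon.reader.RecordReader;
import org.apache.paimon.table.Table;
import org.apache.paimon.table.source.ReadBuilder;
import org.apache.paimon.table.source.Split;
import org.apache.paimon.table.source.TableRead;

import java.io.IOException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public void read() throws TableNotExistException, IOException {
    // Point the catalog at the same warehouse used from the Flink SQL client.
    Map<String, String> props = new HashMap<>();
    props.put("warehouse", "file:///mnt/datadisk1/wuwenchi/tmp/warehouse");

    CatalogContext ctx = CatalogContext.create(new Options(props));
    Catalog catalog = CatalogFactory.createCatalog(ctx);
    Table table = catalog.getTable(Identifier.create("db1", "tb1"));

    // Plan a batch scan and read every split.
    ReadBuilder rb = table.newReadBuilder();
    List<Split> splits1 = rb.newScan().plan().splits();
    TableRead read = rb.newRead();
    RecordReader<InternalRow> reader = read.createReader(splits1);
    reader.forEachRemaining(record -> {
        System.out.println(record.getString(0));

        // ts1: TIMESTAMP(6)
        Timestamp ts1 = record.getTimestamp(1, 6);
        System.out.println(ts1);

        // ts2: TIMESTAMP(6) WITH LOCAL TIME ZONE
        Timestamp ts2 = record.getTimestamp(2, 6);
        System.out.println(ts2);
    });
}
```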

Console output:

    flink:2024-01-02 10:04:05.123456789
    2024-01-02T10:04:05.123456
    2024-01-02T02:04:05.123456

The values for s and ts1 are correct, but ts2 is wrong: we expect 2024-01-02T10:04:05.123456 for ts2. Is the 8-hour difference caused by a missing time zone conversion?
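To illustrate the suspected cause, here is a minimal sketch that uses only `java.time`, not the Paimon API. It assumes two things the report does not confirm: that the value printed for ts2 is the UTC wall-clock form of the stored instant, and that the Flink session time zone (`table.local-time-zone`) was UTC+8, for example `Asia/Shanghai`. The class and method names are hypothetical.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class LocalZoneTimestampSketch {

    // Treat the given wall-clock value as UTC, then render the same instant
    // as a wall-clock value in the target zone.
    public static LocalDateTime utcToZone(LocalDateTime utcWallClock, ZoneId zone) {
        return utcWallClock
                .atOffset(ZoneOffset.UTC)
                .atZoneSameInstant(zone)
                .toLocalDateTime();
    }

    public static void main(String[] args) {
        // Value printed by the Java API for ts2 in this report.
        LocalDateTime raw = LocalDateTime.parse("2024-01-02T02:04:05.123456");

        // Assumption: the session that wrote the row ran in UTC+8.
        ZoneId sessionZone = ZoneId.of("Asia/Shanghai");

        // Prints 2024-01-02T10:04:05.123456
        System.out.println(utcToZone(raw, sessionZone));
    }
}
```

Under those assumptions the converted value matches the literal that was inserted, which would suggest the Java API returns the stored UTC value without applying a local time zone.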

What doesn't meet your expectations?

Reading the table directly through the Java API should return the correct result.

Anything else?

No response

Are you willing to submit a PR?