Closed lmatz closed 2 days ago
steps to reproduce:
CREATE SOURCE iceberg_sink_source (
seq_id bigint,
user_id bigint,
user_name varchar)
WITH (
connector = 'datagen',
fields.seq_id.kind = 'sequence',
fields.seq_id.start = '1',
fields.seq_id.end = '10000000',
fields.user_id.kind = 'random',
fields.user_id.min = '1',
fields.user_id.max = '10000000',
fields.user_name.kind = 'random',
fields.user_name.length = '10',
datagen.rows.per.second = '20000'
) FORMAT PLAIN ENCODE JSON;
CREATE TABLE iceberg_sink_table (
seq_id bigint,
user_id bigint,
user_name varchar)
WITH (
connector = 'datagen',
fields.seq_id.kind = 'sequence',
fields.seq_id.start = '1',
fields.seq_id.end = '10000000',
fields.user_id.kind = 'random',
fields.user_id.min = '1',
fields.user_id.max = '10000000',
fields.user_name.kind = 'random',
fields.user_name.length = '10',
datagen.rows.per.second = '20000'
) FORMAT PLAIN ENCODE JSON;
CREATE SINK risingwave_iceberg_sink_test FROM iceberg_sink_table
with (
type='upsert',
primary_key='seq_id',
connector = 'iceberg',
catalog.type = 'glue',
catalog.name = 'awsdatacatalog',
warehouse.path = 's3://hw-data-platform/',
s3.access.key = 'xxx',
s3.secret.key = 'xxxx',
s3.region = 'eu-west-1',
database.name='risingwave_sink',
table.name='iceberg_sink_test',
create_table_if_not_exists=TRUE
);
error in athena is :
ICEBERG_CANNOT_OPEN_SPLIT: Error opening Iceberg split s3://hw-data-platform/data/00000-0-75d35ec1-55ad-419b-99cd-229203daa761-00003.parquet (offset=47993391, length=2189420235): Range [-4, -4 + 4) out of bounds for length 0
Describe the bug
The file can be read by Risignwave's iceberg source. But it cannot be queried by Athena on AWS
Error message/log
No response
To Reproduce
No response
Expected behavior
No response
How did you deploy RisingWave?
No response
The version of RisingWave
v2.0.3
Additional context
No response