What happened?
When reading an unkeyed Amoro mixed-format table through Flink, the full table data is read even when 'scan.startup.mode'='latest' is configured. This is not the expected behavior.
Affects Versions
master
What table formats are you seeing the problem on?
Mixed-Iceberg, Mixed-Hive
What engines are you seeing the problem on?
Flink
How to reproduce
Create an unkeyed table of type mixed-format
create table test_db.test_table (
id int,
name string,
age int
) using mixed_hive;
Write a few rows of initial data
insert into test_db.test_table values (1,'name1',10);
insert into test_db.test_table values (2,'name2',10);
Query the table from Flink with 'scan.startup.mode' set to 'latest', so that reading starts from the current latest snapshot
select * from test_db.test_table
/*+ OPTIONS('streaming'='true','arctic.read.mode'='file','source.parallelism' = '1','table.format'='MIXED_HIVE','scan.startup.mode'='latest') */;
However, the query reads the full table data, including the rows inserted above, which is not the expected behavior.
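One way to confirm the behavior, sketched here on the assumption that the extra insert is issued from the same SQL session used for the initial data: leave the streaming query running and commit one more row. The row values are made up for illustration. If 'latest' worked as intended, I would expect only this new row to show up in the Flink output, not the rows with id 1 and 2.

```sql
-- Hypothetical extra row, committed while the Flink streaming query is still running.
-- With 'scan.startup.mode'='latest', only this row would be expected in the query output.
insert into test_db.test_table values (3,'name3',10);
```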
Relevant log output
No response
Anything else
No response
Are you willing to submit a PR?
[X] Yes I am willing to submit a PR!
Code of Conduct
[X] I agree to follow this project's Code of Conduct