apache / paimon

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
https://paimon.apache.org/
Apache License 2.0
2.35k stars, 928 forks

[spark] Migrate should guard the file format option, which may not match the Hive table's actual format #3494

Closed: xuzifu666 closed this issue 3 months ago

xuzifu666 commented 3 months ago

Search before asking

Paimon version

0.9

Compute Engine

Spark 3.2

Minimal reproduce step

  1. Create a Hive table named test.hive_table that uses the Parquet format.
  2. Insert several records into the Hive table.
  3. CALL sys.migrate_table(source_type => 'hive', table => 'test.hive_table', options => 'file.format=orc');
  4. Insert more records into test.hive_table.
  5. The table's data files are no longer all in Parquet format.

What doesn't meet your expectations?

migrate_table should guard the file.format option, so that a migration cannot leave the table configured with a file format that does not match the Hive table's actual format.
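
One way such a guard could look is sketched below. This is only an illustration of the idea, not Paimon's actual implementation: the class `MigrateFormatGuard`, the method `guardFileFormat`, and the way the source format is passed in as a plain string are all hypothetical. Only the option key `file.format` comes from the call in the reproduce steps.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Paimon's API.
class MigrateFormatGuard {

    // Option key as used in the issue: options => 'file.format=orc'.
    static final String FILE_FORMAT_KEY = "file.format";

    /**
     * Returns a copy of the user options in which file.format is pinned to the
     * source table's format. Throws if the user explicitly asked for a
     * different format than the one the source table already uses.
     */
    static Map<String, String> guardFileFormat(String sourceFormat, Map<String, String> userOptions) {
        String actual = sourceFormat.toLowerCase(Locale.ROOT);
        String requested = userOptions.get(FILE_FORMAT_KEY);

        if (requested != null && !requested.toLowerCase(Locale.ROOT).equals(actual)) {
            throw new IllegalArgumentException(
                    "file.format=" + requested + " does not match the source table format " + actual);
        }

        // Copy so the caller's map is not mutated, then pin the format.
        Map<String, String> guarded = new HashMap<>(userOptions);
        guarded.put(FILE_FORMAT_KEY, actual);
        return guarded;
    }

    public static void main(String[] args) {
        // No explicit file.format: the source format is filled in.
        Map<String, String> guarded = guardFileFormat("parquet", new HashMap<>());
        System.out.println(guarded.get(FILE_FORMAT_KEY));
    }
}
```

With a check along these lines, the call in step 3 would fail fast instead of producing a table with mixed file formats. Silently overriding the requested format instead of throwing would be another reasonable design; the issue does not say which behavior is preferred.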

Anything else?

none

Are you willing to submit a PR?