apache / paimon

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
https://paimon.apache.org/
Apache License 2.0

[spark] Migrate should guard against a file format option that may not match the Hive table #3495

Closed xuzifu666 closed 1 month ago

xuzifu666 commented 1 month ago
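  1. Create a Hive table named test.hive_table that uses the Parquet format.
  2. Insert several records into the Hive table.
  3. Run CALL sys.migrate_table(source_type => 'hive', table => 'test.hive_table', options => 'file.format=orc');
  4. Insert into test.hive_table again.
  5. The table's data files are no longer all in Parquet format, because the file.format=orc option passed to the migration does not match the format of the existing Hive data files.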
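
The kind of guard the title asks for can be sketched roughly as follows. This is only an illustration, not Paimon's actual implementation: the class name `MigrateFormatGuard`, its methods, and the assumption that options arrive as a comma-separated `key=value` string (as in the call in step 3) are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical guard for illustration; not taken from the Paimon code base. */
final class MigrateFormatGuard {

    // Option key used in the migrate_table call from the steps above.
    static final String FILE_FORMAT_KEY = "file.format";

    /** Parses a "k1=v1,k2=v2" string into a map (assumed option syntax). */
    static Map<String, String> parseOptions(String options) {
        Map<String, String> result = new HashMap<>();
        if (options == null || options.trim().isEmpty()) {
            return result;
        }
        for (String pair : options.split(",")) {
            String[] kv = pair.split("=", 2);
            if (kv.length != 2) {
                throw new IllegalArgumentException("Malformed option: " + pair);
            }
            result.put(kv[0].trim(), kv[1].trim());
        }
        return result;
    }

    /**
     * Rejects the migration when the requested file.format differs from the
     * format of the existing Hive table. An absent file.format is accepted.
     */
    static void check(String hiveFormat, String options) {
        String requested = parseOptions(options).get(FILE_FORMAT_KEY);
        if (requested != null && !requested.equalsIgnoreCase(hiveFormat)) {
            throw new IllegalArgumentException(
                    "Requested file.format '" + requested
                            + "' does not match the Hive table format '" + hiveFormat + "'");
        }
    }
}
```

Under these assumptions, `MigrateFormatGuard.check("parquet", "file.format=orc")` throws an `IllegalArgumentException`, so the mismatch in step 3 is reported before any mixed-format data files are written, while `check("parquet", "file.format=parquet")` and a call with no `file.format` option pass.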

Linked issue: https://github.com/apache/paimon/issues/3494

Tests

API and Format

Documentation