[core] Fix partition column generate wrong partition spec

Purpose

Paimon uses .toString to generate partition values, which is inaccurate for some data types such as date and binary. The Spark engine, for example, uses a Cast to convert a partition object to its string value. This PR therefore switches to a cast to generate partition values.
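To make the difference concrete, here is a minimal Java sketch that contrasts the two renderings. It is not Paimon's code: the class and method names are hypothetical, the binary cast is assumed to decode the bytes as UTF-8, and the date is assumed to be held internally as a count of days since 1970-01-01.

```java
import java.nio.charset.StandardCharsets;
import java.time.LocalDate;

public class PartitionValueSketch {

    // Legacy-style rendering: relies on Object.toString of the raw value.
    static String legacyValue(Object partitionValue) {
        return String.valueOf(partitionValue);
    }

    // Cast-style rendering for binary: decode the bytes as UTF-8 text (assumption).
    static String castBinary(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Cast-style rendering for a date held as days since 1970-01-01 (assumption).
    static String castDate(int epochDay) {
        return LocalDate.ofEpochDay(epochDay).toString();
    }

    public static void main(String[] args) {
        byte[] day = "2021".getBytes(StandardCharsets.UTF_8);
        System.out.println(legacyValue(day)); // identity string of the form [B@<hash>, differs per run
        System.out.println(castBinary(day));  // 2021

        int epochDay = (int) LocalDate.of(2021, 1, 1).toEpochDay();
        System.out.println(legacyValue(epochDay)); // the raw day count
        System.out.println(castDate(epochDay));    // 2021-01-01
    }
}
```

With .toString, a byte array yields an identity string that is unrelated to its content and changes between JVM runs, so the partition name written at insert time cannot be reproduced at read time.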

Add a new config, partition.legacy-name, to switch back to the previous .toString behavior. By default the legacy behavior (.toString) is used.
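As a rough illustration of how such a switch could behave, the sketch below reads the option from a plain map and picks a rendering for a binary value. Only the option key and its legacy default come from this description. The class, the helper, and the use of a Map for options are hypothetical and do not reflect Paimon's actual implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class LegacyNameSwitchSketch {

    // Option key taken from the description above.
    static final String LEGACY_NAME_KEY = "partition.legacy-name";

    // Hypothetical helper: renders a binary partition value according to the option.
    static String partitionValue(byte[] value, Map<String, String> options) {
        // The legacy behavior is the default, as stated in the description.
        boolean legacy = Boolean.parseBoolean(options.getOrDefault(LEGACY_NAME_KEY, "true"));
        return legacy ? value.toString() : new String(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] day = "2021".getBytes(StandardCharsets.UTF_8);
        Map<String, String> options = new HashMap<>();
        System.out.println(partitionValue(day, options)); // legacy: identity string of the byte array

        options.put(LEGACY_NAME_KEY, "false");
        System.out.println(partitionValue(day, options)); // 2021
    }
}
```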

For example, using a binary-type partition column causes a failure:

CREATE TABLE pt (
    id BIGINT,
    c1 STRING
) using paimon
PARTITIONED BY (day binary);

insert into table pt values(1, 'a', cast('2021' as binary));
select * from pt;

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1.0 (TID 1) (192.168.0.102 executor driver): java.io.FileNotFoundException: File 'warehouse/default.db/pt/day=%5BB@4a045a11/bucket-0/data-91c064a3-a0a1-4042-9d5a-cc82a23af7ff-0.parquet' not found, Possible causes: 1.snapshot expires too fast, you can configure 'snapshot.time-retained' option with a larger value. 2.consumption is too slow, you can improve the performance of consumption (For example, increasing parallelism).
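The partition directory in the path above hints at the cause. Decoding the percent-escaped name, as in the small sketch below, gives the identity string of a Java byte array rather than the value 2021. The class and method names are illustrative only, and URLDecoder.decode with a Charset argument assumes Java 10 or later.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

public class PartitionPathSketch {

    // Decodes a percent-escaped partition directory name.
    static String decode(String escapedName) {
        return URLDecoder.decode(escapedName, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Directory name copied from the FileNotFoundException message.
        System.out.println(decode("day=%5BB@4a045a11")); // day=[B@4a045a11
    }
}
```

The decoded name has the form [B@<hash>, which is what Object.toString returns for a byte[]. Because that hash is not derived from the array's content, a later read is unlikely to reconstruct the same directory name, which is consistent with the file not being found.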

apache/paimon #4349
