johanl-db opened 2 weeks ago
Will there be an option in the future to change the column type of a table from int to string without rewriting the entire table? Or is such an option already available (I don't remember one)?
There's currently no plan to support type changes other than the ones mentioned in the PR description.

Converting values when reading from a table that had one of these widening type changes applied can easily be done directly in the Parquet reader, but other type changes are harder, either because they can lose information (`long` -> `int`, `long` -> `float`) or because the conversion is ambiguous:
- `float` -> `string`: how many significant digits should be displayed?
- `decimal` -> `string`: should the value be padded with 0s to match the precision/scale of the value?
- Even for `int` -> `string`, we could ask whether the raw bytes of the initial value should be returned as a string or the value should be formatted as UTF-8.
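To make the ambiguity concrete, here is a small plain-Python sketch of the competing conventions (illustrative only — these are stand-ins, not Delta or Spark reader code):

```python
# Each pair below shows two equally defensible string representations
# of the same stored value, which is why these conversions can't be
# applied silently in the reader.
from decimal import Decimal
import struct

# float -> string: how many significant digits?
f = 1 / 3
print(f"{f:.6g}")   # "0.333333" — truncated to 6 significant digits
print(repr(f))      # "0.3333333333333333" — shortest round-trip form

# decimal -> string: pad with 0s to the declared scale, or not?
d = Decimal("1.5")
print(str(d))       # "1.5" — as stored
print(f"{d:.4f}")   # "1.5000" — padded to scale 4

# int -> string: format as text, or expose the raw bytes?
i = 42
print(str(i))                 # "42" — formatted as UTF-8 text
print(struct.pack(">i", i))   # b'\x00\x00\x00*' — raw big-endian bytes
```

Either choice in each pair is reasonable, which is exactly why such changes need an explicit conversion (e.g. a `CAST`) rather than implicit widening.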
The type changes added in this PR only work with Spark 4.0 / master, which contains the required changes to the Parquet readers to be able to read the data after the type changes are applied.
Description
Extend the list of supported type changes for type widening to include changes that can be supported with Spark 4.0:
How was this patch tested?
Adding test cases for the new type changes in the existing type widening test suites
Does this PR introduce any user-facing changes?
Yes: allow using the listed type changes with type widening, either via `ALTER TABLE CHANGE COLUMN TYPE` or during schema evolution in MERGE and INSERT.
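As a rough sketch of the user-facing syntax, a widening change might look like the following (the table and column names are made up, and the exact DDL and the `delta.enableTypeWidening` property should be checked against the Delta Lake documentation for your version):

```sql
-- Enable the type widening table feature on an existing Delta table.
ALTER TABLE events SET TBLPROPERTIES ('delta.enableTypeWidening' = 'true');

-- Widen an INT column to BIGINT without rewriting existing data files.
ALTER TABLE events CHANGE COLUMN hits TYPE BIGINT;
```

With schema evolution enabled, the same widening can also be triggered implicitly when a MERGE or INSERT writes wider values into the column.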