Open ebyhr opened 6 months ago
cc. @lzlfred @harperjiang
@vkorukanti @lzlfred @harperjiang Could you let me know whether this is incorrect documentation or an implementation bug?
@ebyhr the document says:
If the table is on Writer Version 5 or 6: write a metaData action to add the delta.columnMapping.mode table property;
If the table is on Writer Version 7:
write a protocol action to add the feature columnMapping to both readerFeatures and writerFeatures
I believe the behavior you observed is just as described in the doc when we have writer version 5/6, and thus don't consider it a bug.
The writer version is 7 and it mentions "add the feature columnMapping to both readerFeatures and writerFeatures".
{"protocol":{"minReaderVersion":2,"minWriterVersion":7,"writerFeatures":["columnMapping","icebergCompatV1"]}}
Then I would recommend updating the protocol document. From the current wording, I would expect readerFeatures to always contain columnMapping whenever it appears in writerFeatures.
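The mismatch described in this comment can be checked mechanically. The following is a minimal sketch that parses the protocol action quoted above and reports whether a feature is listed in writerFeatures but not in readerFeatures. The function name `missing_reader_feature` is hypothetical and is not part of any Delta library.

```python
import json

# Protocol action copied verbatim from the transaction log quoted in this thread.
LOG_LINE = (
    '{"protocol":{"minReaderVersion":2,"minWriterVersion":7,'
    '"writerFeatures":["columnMapping","icebergCompatV1"]}}'
)


def missing_reader_feature(log_line, feature="columnMapping"):
    """Return True if `feature` is in writerFeatures but not in readerFeatures.

    Hypothetical helper for illustration only; an absent list is treated as empty.
    """
    protocol = json.loads(log_line)["protocol"]
    writer_features = protocol.get("writerFeatures", [])
    reader_features = protocol.get("readerFeatures", [])
    return feature in writer_features and feature not in reader_features


print(missing_reader_feature(LOG_LINE))  # True
```

For the entry above the result is True, which is the behavior the issue reports.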
I think we need to update the protocol to make that paragraph clearer.
When the reader version is 2, readerFeatures does not exist, and if it did, it would break the protocol:
For new tables, when a new table is created with a Reader Version up to 2 and Writer Version 7, its protocol action must only contain writerFeatures.
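One possible reading of the sentence quoted above, sketched as a check: with a reader version up to 2 and writer version 7, a protocol action that carries readerFeatures is treated as a violation. The function name `violates_writer_only_rule` is hypothetical, and this covers only the quoted sentence, not the full specification.

```python
def violates_writer_only_rule(protocol):
    """Return True when reader version <= 2, writer version == 7,
    and the protocol action still carries a readerFeatures field.

    Hypothetical helper; it encodes only the sentence quoted in this thread.
    """
    reader_version = protocol["minReaderVersion"]
    writer_version = protocol["minWriterVersion"]
    if reader_version <= 2 and writer_version == 7:
        return "readerFeatures" in protocol
    return False


# Protocol action observed in the issue: no readerFeatures field at all.
observed = {
    "minReaderVersion": 2,
    "minWriterVersion": 7,
    "writerFeatures": ["columnMapping", "icebergCompatV1"],
}
print(violates_writer_only_rule(observed))  # False
```

Under this reading the observed entry is consistent with the rule for new tables, which is why it conflicts with the "both readerFeatures and writerFeatures" wording.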
Actually, after running some experiments I realized that Spark Delta doesn't allow you to enable column mapping on an existing table if the reader version is 2 and the writer version is 7.
But it does allow you to create such a table, so the behavior is inconsistent. I can't say which behavior is correct; my guess is that reader 2 + writer 7 (with columnMapping in writerFeatures)
should be considered compliant with the spec.
ALTER TABLE Test SET TBLPROPERTIES (
'delta.minReaderVersion' = '2',
'delta.minWriterVersion' = '7'
);
This fails:
ALTER TABLE Test SET TBLPROPERTIES (
'delta.columnMapping.mode' = 'name'
)
Your current table protocol version does not support changing column mapping modes
using delta.columnMapping.mode.
Required Delta protocol version for column mapping:
Protocol(3,7,[columnMapping],[columnMapping])
Your table's current Delta protocol version:
Protocol(2,7,None,[appendOnly,invariants])
Please enable Column Mapping on your Delta table with mapping mode 'name'.
You can use one of the following commands.
If your table is already on the required protocol version:
ALTER TABLE table_name SET TBLPROPERTIES ('delta.columnMapping.mode' = 'name')
If your table is not on the required protocol version and requires a protocol upgrade:
ALTER TABLE table_name SET TBLPROPERTIES (
'delta.columnMapping.mode' = 'name',
'delta.minReaderVersion' = '3',
'delta.minWriterVersion' = '7')
This works:
CREATE TABLE Test (Id INT) USING DELTA TBLPROPERTIES ('delta.minWriterVersion' = '7', 'delta.columnMapping.mode' = 'name');
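The error message above prints a required protocol and the table's current protocol. The following sketch shows one way such a comparison could work, using the two Protocol values from the message. This is an assumption for illustration, not Spark Delta's actual implementation; the `Protocol` class and `supports` function are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Protocol:
    """Hypothetical mirror of the Protocol(reader, writer, readerFeatures, writerFeatures) printout."""
    min_reader: int
    min_writer: int
    reader_features: Optional[FrozenSet[str]]
    writer_features: Optional[FrozenSet[str]]


def supports(current, required):
    """Return True if `current` has at least the versions and features of `required`."""
    versions_ok = (
        current.min_reader >= required.min_reader
        and current.min_writer >= required.min_writer
    )
    # A feature list printed as None is treated as an empty set.
    reader_ok = (required.reader_features or frozenset()) <= (current.reader_features or frozenset())
    writer_ok = (required.writer_features or frozenset()) <= (current.writer_features or frozenset())
    return versions_ok and reader_ok and writer_ok


# Values taken from the error message.
required = Protocol(3, 7, frozenset({"columnMapping"}), frozenset({"columnMapping"}))
current = Protocol(2, 7, None, frozenset({"appendOnly", "invariants"}))
print(supports(current, required))  # False
```

Under this sketch the ALTER TABLE case fails because both the reader version and the feature lists fall short of the required protocol.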
Bug
Which Delta project/connector is this regarding?
Describe the problem
Writer Requirements for Column Mapping mentions:
However, the protocol entry doesn't have columnMapping in the readerFeatures field.
Steps to reproduce
Create a new table with icebergCompatV1 feature:
Confirm transaction log:
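One way to inspect the log programmatically is sketched below. It assumes the usual layout of a `_delta_log` directory under the table path, with commit files named by a zero-padded version number and holding one JSON action per line. The function name `protocol_actions` is hypothetical.

```python
import json
from pathlib import Path


def protocol_actions(table_path):
    """Collect (file name, protocol dict) pairs from the JSON commits under _delta_log."""
    log_dir = Path(table_path) / "_delta_log"
    found = []
    # Zero-padded names mean that sorting by name also sorts by table version.
    for commit in sorted(log_dir.glob("*.json")):
        if not commit.stem.isdigit():
            continue  # skip anything that is not a plain versioned commit file
        for line in commit.read_text().splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "protocol" in action:
                found.append((commit.name, action["protocol"]))
    return found
```

Calling `protocol_actions("/path/to/table")` on the table's directory returns the protocol entries in version order, so the presence or absence of readerFeatures can be read off directly.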