There are a couple of issues with the specification of the logical type MAP:
Typo: "(...) `optional` or `required` and determines whether the list is nullable." This sentence describes a map, so "list" should read "map".
According to the spec, the `key` column may be a nested type. Does that make sense? Most engines and libraries require primitive keys.
It is clear that `value` is not required; however, I did not find an implementation that handles this properly. Do we want to recommend anything for this case (e.g. treating the value column as null)?
The backward-compatibility rules suggest that `key` and `value` might not be named according to the spec, but they do not say how to identify them. I've seen multiple implementations (e.g. the Avro binding of parquet-java) that simply take the 0th element as the key and the 1st as the value without checking their names. That does not seem correct according to the spec.
The spec mentions that `MAP_KEY_VALUE` might appear in place of `MAP`, but it doesn't mention its original purpose of tagging the `key_value` level.
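To make the key/value identification point concrete, here is a minimal sketch of the two strategies. It assumes a hypothetical `Field` class standing in for a schema node; it is not the parquet-java API, and the field names `str` and `num` are only illustrative non-standard names.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Field:
    """Hypothetical stand-in for a Parquet schema node (not a real library type)."""
    name: str
    repetition: str  # "required", "optional" or "repeated"
    children: List["Field"] = field(default_factory=list)


def resolve_positional(key_value: Field) -> Tuple[Field, Optional[Field]]:
    """Take child 0 as the key and child 1 (if present) as the value, ignoring names."""
    kids = key_value.children
    if not 1 <= len(kids) <= 2:
        raise ValueError("expected 1 or 2 fields in the repeated group")
    return kids[0], (kids[1] if len(kids) == 2 else None)


def resolve_by_name(key_value: Field) -> Tuple[Field, Optional[Field]]:
    """Look the fields up by the names 'key' and 'value'; fail if 'key' is missing."""
    by_name = {child.name: child for child in key_value.children}
    if "key" not in by_name:
        raise ValueError("no field named 'key' in the repeated group")
    return by_name["key"], by_name.get("value")


# Names follow the spec.
standard = Field("key_value", "repeated",
                 [Field("key", "required"), Field("value", "optional")])
# Non-standard names, as the backward-compatibility rules allow.
legacy = Field("map", "repeated",
               [Field("str", "required"), Field("num", "required")])
# Hypothetical writer that emitted the fields in the opposite order.
swapped = Field("key_value", "repeated",
                [Field("value", "optional"), Field("key", "required")])
# The value field is omitted entirely.
key_only = Field("key_value", "repeated", [Field("key", "required")])
```

With these inputs, `resolve_positional(swapped)` returns the field named `value` as the key, while `resolve_by_name(legacy)` raises because nothing is named `key`. Neither strategy covers both cases on its own, which is why the spec should state which rule readers are expected to apply.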