Open hantangwangd opened 2 months ago
@hantangwangd: Good catch.
The JSON representation of the schema already has the `schema-id`. So, I am not sure why we need one more `schema-id` field in the Avro metadata.
Also, if we look at `partition-spec`, it is the JSON representation of the fields of the spec, not the entire spec. Hence, we need `partition-spec-id`.
The `schema-id` field seems irrelevant in the spec. Let's see what others think.
Query engine
N/A
Question
Referring to https://iceberg.apache.org/spec/#manifests, the Iceberg spec section on manifests defines that a manifest file must store the partition spec and other metadata as properties in the Avro file's key-value metadata. Among these key-value properties, `schema-id` changed from `optional` to `required` in v2 metadata. However, no version of the `ManifestWriter` implementation writes `schema-id` into the metadata of the corresponding Avro file at all. This looks inconsistent with the spec. Furthermore, there seems to be no need to write this key-value property, as it is not used anywhere. So should we fix this inconsistency? Or did I misunderstand something?