apache / iceberg

Apache Iceberg
https://iceberg.apache.org/
Apache License 2.0

Clarity on serialization format of schema.name-mapping.default in Iceberg metadata #8437

Open shashk opened 1 year ago

shashk commented 1 year ago

Apache Iceberg version

1.3.1 (latest release)

Query engine

Spark

Please describe the bug 🐞

The Iceberg spec [1] describes the representation of name mapping in table metadata as follows: "Name mapping is serialized as a list of field mapping JSON Objects" [2] and "JSON name mapping containing a list of field mapping objects" [3].

The spec also states that properties in table metadata are a "string to string map" [4].

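Judging from NameMappingParser.java, the mappings are indeed stored as a JSON list serialized to a string, which is consistent with [4]. However, the wording in [2] and [3] can be read as saying that the list of field mapping objects is embedded directly in the metadata JSON, which may confuse developers implementing the spec.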
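To make the two readings concrete, here is a minimal sketch in plain Python using only the standard `json` module. It does not call any Iceberg API; the field IDs and column names are made up for illustration, and the claim that the second reading is what Iceberg writes is my assumption based on reading NameMappingParser.java, not something the spec states explicitly.

```python
import json

# Hypothetical field mappings, in the shape shown in the spec's
# name-mapping section (IDs and names are invented for this example).
name_mapping = [
    {"field-id": 1, "names": ["id", "record_id"]},
    {"field-id": 2, "names": ["data"]},
]

# Reading A: the list is embedded directly as a JSON array.
# The property value is then a list, not a string, which conflicts
# with "properties" being a string-to-string map.
properties_as_array = {"schema.name-mapping.default": name_mapping}

# Reading B (assumed to match NameMappingParser.java): the list is
# first serialized to a JSON string, and that string is the value.
properties_as_string = {
    "schema.name-mapping.default": json.dumps(name_mapping)
}


def is_string_map(props):
    """Return True if every key and value in props is a str."""
    return all(
        isinstance(k, str) and isinstance(v, str) for k, v in props.items()
    )


# Under reading B, a consumer has to decode twice: once for the
# metadata document, and once for the property value itself.
metadata_json = json.dumps({"properties": properties_as_string})
raw_value = json.loads(metadata_json)["properties"]["schema.name-mapping.default"]
decoded = json.loads(raw_value)
```

Only reading B keeps `properties` a string-to-string map, at the cost of the mapping appearing as an escaped JSON string inside the metadata file.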

Could someone clarify the format of the name mapping in table metadata? And could we update the spec documentation to make this less ambiguous?

[1] https://iceberg.apache.org/spec
[2] https://iceberg.apache.org/spec/#name-mapping-serialization
[3] https://iceberg.apache.org/spec/#column-projection
[4] https://iceberg.apache.org/spec/#table-metadata-fields

github-actions[bot] commented 1 week ago

This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.