Open TheNeuralBit opened 4 months ago
See also the related discussion in https://github.com/apache/arrow/issues/31723. It is more specifically about the reading side, but it also notes this inconsistency: the custom metadata is not written when writing ARROW:schema is disabled.
Describe the bug, including details regarding any error messages, version, and platform.
When `store_schema` is true, the `FileWriter` first copies any existing metadata before storing the serialized schema: https://github.com/apache/arrow/blob/8169d6e719453acd0e7ca1b6f784d800cca4f113/cpp/src/parquet/arrow/writer.cc#L537-L542

But when `store_schema` is false, the `FileWriter` just returns empty metadata, and custom metadata is not copied: https://github.com/apache/arrow/blob/8169d6e719453acd0e7ca1b6f784d800cca4f113/cpp/src/parquet/arrow/writer.cc#L531-L534

Could someone confirm whether this is intentional? It looks like an oversight to me, and I have a patch ready to address it.
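
To make the two code paths concrete, here is a minimal sketch of the behavior described above. It is a simplified model, not the actual Arrow code: it uses `std::map` in place of Arrow's key-value metadata type, the function names `CurrentFileMetadata` and `PatchedFileMetadata` and the placeholder `kSerializedSchema` are hypothetical, and the "patched" variant is only one possible fix, not necessarily what the patch does.

```cpp
#include <map>
#include <string>

// Simplified stand-in for the schema's key-value metadata.
using Metadata = std::map<std::string, std::string>;

// Hypothetical placeholder for the serialized schema stored under ARROW:schema.
const std::string kSerializedSchema = "<serialized schema>";

// Models the behavior reported in this issue.
Metadata CurrentFileMetadata(const Metadata& schema_metadata, bool store_schema) {
  if (!store_schema) {
    return {};  // early return: custom metadata is dropped
  }
  Metadata result = schema_metadata;           // copy existing metadata first
  result["ARROW:schema"] = kSerializedSchema;  // then store the serialized schema
  return result;
}

// One possible fix (an assumption): always copy the custom metadata and
// only make the ARROW:schema entry conditional on store_schema.
Metadata PatchedFileMetadata(const Metadata& schema_metadata, bool store_schema) {
  Metadata result = schema_metadata;
  if (store_schema) {
    result["ARROW:schema"] = kSerializedSchema;
  }
  return result;
}
```

In this model, a schema carrying a custom key such as `origin` loses it under `CurrentFileMetadata(custom, false)` but keeps it under `PatchedFileMetadata(custom, false)`, while neither writes `ARROW:schema` in that case.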
Component(s)
Parquet