apache / arrow

Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
https://arrow.apache.org/
Apache License 2.0

GH-44714: [C++] Keep field metadata for keys and values when importing a map type via the C data interface #44715

Closed paleolimbot closed 3 days ago

paleolimbot commented 1 week ago

Rationale for this change

Importing a map type through the C data interface drops the field metadata of its keys and values (including extension type information). This does not happen when importing a map type from IPC, or when importing a list of structs. The loss breaks roundtripping data through pyarrow/Arrow C++ when extension types are not registered.

What changes are included in this PR?

The mechanism to import the map type was changed to align with the method used for IPC import.
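
To illustrate the difference, here is a toy model in plain Python. It is not Arrow code: `Field`, `StructType`, `MapType`, `import_map_lossy` and `import_map_preserving` are hypothetical names, and the "lossy" path assumes the old import rebuilt the map from the key and value types alone. The extension name `example.uuid` is also made up; `ARROW:extension:name` is the metadata key Arrow uses to carry an extension type name.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Field:
    """Toy stand-in for an Arrow field: name, type, nullability, metadata."""
    name: str
    type: object
    nullable: bool = True
    metadata: Tuple[Tuple[str, str], ...] = ()


@dataclass(frozen=True)
class StructType:
    fields: Tuple[Field, ...]


@dataclass(frozen=True)
class MapType:
    """A map is physically a list of struct<key, value> entries."""
    entries: Field
    keys_sorted: bool = False

    @property
    def key_field(self) -> Field:
        return self.entries.type.fields[0]

    @property
    def item_field(self) -> Field:
        return self.entries.type.fields[1]


def import_map_lossy(entries: Field, keys_sorted: bool = False) -> MapType:
    # Assumed old behaviour: keep only the key/value *types* and build
    # fresh fields, so any metadata on the imported fields is discarded.
    key, item = entries.type.fields
    rebuilt = StructType((
        Field("key", key.type, nullable=False),
        Field("value", item.type),
    ))
    return MapType(Field("entries", rebuilt, nullable=False), keys_sorted)


def import_map_preserving(entries: Field, keys_sorted: bool = False) -> MapType:
    # Reuse the imported entries field as-is, so key/value metadata survives.
    return MapType(entries, keys_sorted)


# A value field that carries extension type information in its metadata.
ext_meta = (("ARROW:extension:name", "example.uuid"),)
entries = Field(
    "entries",
    StructType((
        Field("key", "string", nullable=False),
        Field("value", "fixed_size_binary[16]", metadata=ext_meta),
    )),
    nullable=False,
)

lossy = import_map_lossy(entries)
kept = import_map_preserving(entries)
```

In this sketch both results have the same storage types, but only `kept.item_field.metadata` still holds the extension name, which is what a consumer without the extension type registered would need to roundtrip the data.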

Are these changes tested?

Yes.

Are there any user-facing changes?

The current behaviour was surprising/inconsistent, so I think this PR brings it more in line with the current expectation/documentation.

github-actions[bot] commented 1 week ago

:warning: GitHub issue #44714 has been automatically assigned in GitHub to PR creator.

pitrou commented 3 days ago

Thanks @paleolimbot. Do you think there should be a separate issue to add a corresponding integration test?

conbench-apache-arrow[bot] commented 3 days ago

After merging your PR, Conbench analyzed the 3 benchmarking runs that have been run so far on merge-commit 152e878a2e79877ec461f96cefc97663f0bc581f.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 7 possible false positives for unstable benchmarks that are known to sometimes produce them.