apache / arrow-rs

Official Rust implementation of Apache Arrow
https://arrow.apache.org/
Apache License 2.0

DictionaryHandling does not recurse into Map fields #6644

Closed nathanielc closed 3 weeks ago

nathanielc commented 3 weeks ago

Describe the bug

When using the FlightDataEncoder, the DictionaryHandling logic does not recurse into the fields of a Map type. As a result, any dictionary nested within a Map is not hydrated, which can break encoding because multiple dictionaries are sent.

To Reproduce

  1. Using FlightDataEncoder, encode two batches containing a Map whose key or value field is a dictionary.
  2. An error occurs because the error_on_replacement check inside the FlightIPCEncoder fails when a dictionary is replaced.
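
To illustrate why the second batch fails, here is a minimal sketch of a replacement check. It is a toy model written for this issue, not the arrow-rs implementation: `ToyDictionaryTracker`, its `insert` method, and the use of string values keyed by a dictionary id are all assumptions made for illustration. Only the `error_on_replacement` name comes from the description above.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a dictionary tracker (not the arrow-rs type).
/// It remembers the last dictionary values written for each dictionary id.
#[derive(Debug, Default)]
pub struct ToyDictionaryTracker {
    written: HashMap<i64, Vec<String>>,
    error_on_replacement: bool,
}

impl ToyDictionaryTracker {
    pub fn new(error_on_replacement: bool) -> Self {
        Self { written: HashMap::new(), error_on_replacement }
    }

    /// Returns Ok(true) if the dictionary has to be sent, Ok(false) if an
    /// identical dictionary was already sent, and Err if the dictionary
    /// differs from the previous one while replacement is disallowed.
    pub fn insert(&mut self, dict_id: i64, values: &[&str]) -> Result<bool, String> {
        let values: Vec<String> = values.iter().map(|s| s.to_string()).collect();
        if let Some(previous) = self.written.get(&dict_id) {
            if *previous == values {
                return Ok(false);
            }
            if self.error_on_replacement {
                return Err(format!(
                    "dictionary replacement detected for dictionary id {dict_id}"
                ));
            }
        }
        self.written.insert(dict_id, values);
        Ok(true)
    }
}
```

In this model, two batches whose Map keys carry different dictionary values under the same id trigger the `Err` branch on the second `insert`, which mirrors the failure in step 2.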

Expected behavior

The FlightDataEncoder correctly hydrates the dictionaries nested inside the Map, avoiding the encoding error.
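
As a rough sketch of the expected recursion, the following toy model replaces every dictionary type with its value type, including dictionaries nested in a Map. `ToyType` and `hydrate` are hypothetical names invented for this example. They only mimic the shape of Arrow data types and are not the arrow-rs API or the actual fix.

```rust
/// Hypothetical, heavily simplified model of nested data types.
#[derive(Debug, Clone, PartialEq)]
pub enum ToyType {
    Int32,
    Utf8,
    /// Dictionary(key type, value type)
    Dictionary(Box<ToyType>, Box<ToyType>),
    List(Box<ToyType>),
    Struct(Vec<(String, ToyType)>),
    /// Map(key type, value type)
    Map(Box<ToyType>, Box<ToyType>),
}

/// Recursively replaces each dictionary with its (hydrated) value type.
/// The Map arm is the case the issue reports as missing.
pub fn hydrate(ty: &ToyType) -> ToyType {
    match ty {
        ToyType::Dictionary(_, value) => hydrate(value),
        ToyType::List(item) => ToyType::List(Box::new(hydrate(item))),
        ToyType::Struct(fields) => ToyType::Struct(
            fields
                .iter()
                .map(|(name, t)| (name.clone(), hydrate(t)))
                .collect(),
        ),
        ToyType::Map(key, value) => {
            ToyType::Map(Box::new(hydrate(key)), Box::new(hydrate(value)))
        }
        // Leaf types contain no nested dictionaries.
        other => other.clone(),
    }
}
```

If the `Map` arm were absent, a Map would fall through to the leaf case and keep its dictionary-typed key and value, which is the behaviour described in the bug report.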

Additional context

PR with a fix incoming.

The error string:

Dictionary replacement detected when writing IPC file format. Arrow IPC files only support a single dictionary for a given field across all batches

alamb commented 1 week ago

label_issue.py automatically added labels {'arrow'} from #6645

alamb commented 1 week ago

label_issue.py automatically added labels {'arrow-flight'} from #6645