apache / arrow-rs

Official Rust implementation of Apache Arrow
https://arrow.apache.org/
Apache License 2.0
2.33k stars 684 forks source link

Improve ability of FlightDataEncoder to respect max_flight_data_size for certain data types (strings, dictionaries, etc) #3478

Open alamb opened 1 year ago

alamb commented 1 year ago

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

Some implementations of gRPC, such as golang have a default max message size that is "relatively small" (4MB) and the clients will generate errors if they receive larger messages.

The FlightDataEncoder has a mechanism (link) to try and avoid this problem by heuristically slicing RecordBatchs into smaller parts to limit their size. This works well for primitive arrays but does not work well for other cases as we have found upstream in IOx:

  1. DataType::Utf8 (only the offsets are sliced, the underlying string data is not sliced)
  2. Dictionaries (the dictionary itself is not changed, so if the dictionary is large it will be repeated sent) -- will be an issue after #3389

Lists, structs, and other nested types probably suffer from similar issues with maximum message sizes.

Of course, the smallest message possible is a single row, which can always be be significantly larger than whatever the max_flight_data_size limit is for variable length columns (e.g. several large string columns)

Describe the solution you'd like I would like to improve the situation and handle nested types and more effectively reduce the FlightDataSize

Describe alternatives you've considered

  1. One approach would be to copy the data into a new record batch, "packing" it into a brand new memory space (this is the approach I plan as a workaround in IOx) - this would result in the minimum sized flight data batches
  2. Another way might be to implement a slice / take that resulted in smaller data sizes (e.g. rewrite data offsets for strings)

Additional context See #3347

alamb commented 1 year ago

In reviewing the arrow IPC writer code, it does appear to be clever about using offsets when actually writing (thanks @viirya in https://github.com/apache/arrow-rs/pull/2040 ❤️ ) https://github.com/apache/arrow-rs/blob/acefeef1cb5698a6afe1d3061644f6276d39117c/arrow-ipc/src/writer.rs#L1094-L1260

However, I am not sure exactly how this will translate to flight data size -- I am writing some more tests now

alamb commented 1 year ago

PR with tests showing how far from optimal the current splitting logic is: https://github.com/apache/arrow-rs/pull/3481