CerebusOSS / ella

A streaming time-series datastore for low-latency applications
https://docs.rs/ella/
Apache License 2.0

Unable to save Duration tensors to Parquet #4

Open sydduckworth opened 1 year ago

sydduckworth commented 1 year ago

arrow-rs doesn't support converting Duration arrays to or from Parquet format. As a workaround, Synapse casts Duration arrays to int64 before writing them to disk, and when reading from disk it inserts a Cast node into the execution plan to convert the values back to Duration.
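The round trip described above can be modelled with the standard library alone. This is only a sketch of the idea, not the Synapse or arrow-rs code: `DurationNs`, `durations_to_int64`, and `int64_to_durations` are hypothetical names, and a plain `Vec<Option<_>>` stands in for a nullable Arrow array.

```rust
/// Hypothetical stand-in for one Arrow Duration value: a signed 64-bit
/// tick count (nanoseconds are assumed here for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DurationNs(pub i64);

/// Write path: replace each duration with its raw i64 tick count so the
/// column can be stored as a plain int64 column. Nulls stay null.
pub fn durations_to_int64(column: &[Option<DurationNs>]) -> Vec<Option<i64>> {
    column.iter().copied().map(|v| v.map(|d| d.0)).collect()
}

/// Read path: the inverse cast, playing the role of the Cast node that is
/// inserted into the execution plan.
pub fn int64_to_durations(column: &[Option<i64>]) -> Vec<Option<DurationNs>> {
    column.iter().copied().map(|v| v.map(DurationNs)).collect()
}
```

Because both types share the same 64-bit representation, the cast is lossless in both directions, which is why the workaround is safe for scalar columns.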

This works for scalar arrays but not for Duration tensors. Tensors are represented as fixed-size lists, and the arrow-rs cast method doesn't support fixed-size lists.
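To illustrate why the list wrapper is the only obstacle: a fixed-size list keeps its elements in one flat child buffer, so a cast only has to touch the child values and carry the list size along. The sketch below models that with the standard library. `FixedSizeList`, `Ticks`, and `cast_values` are hypothetical names, and whether arrow-rs or Synapse would implement the cast this way is an assumption.

```rust
/// Hypothetical model of a fixed-size list column: `values` holds every
/// element back to back, and each consecutive run of `list_size` elements
/// forms one tensor.
#[derive(Debug, Clone, PartialEq)]
pub struct FixedSizeList<T> {
    list_size: usize,
    values: Vec<T>,
}

/// Hypothetical stand-in for a Duration element (signed 64-bit tick count).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Ticks(pub i64);

impl<T> FixedSizeList<T> {
    /// Returns `None` if the flat buffer cannot be split into whole lists.
    pub fn new(list_size: usize, values: Vec<T>) -> Option<Self> {
        if list_size == 0 || values.len() % list_size != 0 {
            return None;
        }
        Some(Self { list_size, values })
    }

    /// Number of lists (tensors) in the column.
    pub fn num_lists(&self) -> usize {
        self.values.len() / self.list_size
    }

    /// Cast every child value with `f`, keeping the list size unchanged.
    pub fn cast_values<U>(self, f: impl Fn(T) -> U) -> FixedSizeList<U> {
        FixedSizeList {
            list_size: self.list_size,
            values: self.values.into_iter().map(f).collect(),
        }
    }
}
```

In this model, casting a Duration tensor column to int64 is `column.cast_values(|t| t.0)`, and `cast_values(Ticks)` restores it on the read path.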

This should be fixed upstream in the near future, since significant work is being done in datafusion to better support fixed-size lists: apache/arrow-datafusion#6560

In the meantime, only scalar Duration fields are supported.