I can see pypolars supports loading an Arrow `RecordBatch`; maybe there are companion methods in the Rust core that you can use?
@andrusha you are right, if the `arrow_rs` feature flag is enabled it's possible to use some internal functions from `polars-core` to get a `DataFrame`. Unfortunately I don't think `polars` itself has that flag on (hard to tell, the feature list is LONG), but I think it's good enough for most use-cases.
Looking a bit more into it, I can see that polars has Arrow IPC deserialization support, which might allow it to decode the payload returned by Snowflake directly, instead of us deserializing it first into a `RecordBatch` and then massaging it into a polars-compatible structure. I'll see if it would be possible to expose the raw bytes as well; then the loading could be made cleaner, I think.
https://docs.rs/polars-io/latest/polars_io/ipc/struct.IpcReader.html
https://arrow.apache.org/docs/format/Columnar.html#serialization-and-interprocess-communication-ipc
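For illustration, a minimal sketch of what that "raw bytes → DataFrame" path could look like with `IpcReader`. The `ipc_bytes` argument stands in for a hypothetical API that exposes the raw Arrow payload; if Snowflake's payload turns out to be in the IPC *stream* format rather than the *file* format, `IpcStreamReader` would be the reader to use instead.

```rust
use std::io::Cursor;

use polars_core::prelude::{DataFrame, PolarsResult};
use polars_io::ipc::IpcReader;
use polars_io::SerReader;

// `ipc_bytes` is assumed to be the raw Arrow IPC payload as exposed by some
// future API; this is a sketch, not an existing function in this crate.
fn dataframe_from_ipc_bytes(ipc_bytes: Vec<u8>) -> PolarsResult<DataFrame> {
    // IpcReader handles the IPC file format; swap in IpcStreamReader if the
    // payload is an IPC stream.
    IpcReader::new(Cursor::new(ipc_bytes)).finish()
}
```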
It would be very nice to have a utility function that converts the Arrow response into a polars or datafusion DataFrame (or other Arrow sources/sinks). It may also be a good candidate for a separate utility crate.
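One possible shape for such a utility, without relying on polars-core internals, is to round-trip arrow-rs `RecordBatch`es through the Arrow IPC stream format and let polars deserialize them. This is a sketch only: it assumes polars-io is built with its IPC stream support enabled, and the error handling is deliberately simplistic.

```rust
use std::io::Cursor;

use arrow::ipc::writer::StreamWriter;
use arrow::record_batch::RecordBatch;
use polars_core::prelude::DataFrame;
use polars_io::ipc::IpcStreamReader;
use polars_io::SerReader;

/// Converts arrow-rs record batches into a polars DataFrame by serializing
/// them to an in-memory Arrow IPC stream and reading that stream back with
/// polars-io. Avoids depending on any internal polars-core functions.
fn record_batches_to_polars(
    batches: &[RecordBatch],
) -> Result<DataFrame, Box<dyn std::error::Error>> {
    let schema = batches
        .first()
        .ok_or("cannot build a DataFrame from zero record batches")?
        .schema();

    // Write all batches into an in-memory IPC stream.
    let mut buf: Vec<u8> = Vec::new();
    {
        let mut writer = StreamWriter::try_new(&mut buf, &schema)?;
        for batch in batches {
            writer.write(batch)?;
        }
        writer.finish()?;
    }

    // Let polars deserialize the stream into a DataFrame.
    let df = IpcStreamReader::new(Cursor::new(buf)).finish()?;
    Ok(df)
}
```

The extra serialize/deserialize hop costs a copy, so a zero-copy path via the `arrow_rs` interop discussed above would still be preferable if it can be exposed cleanly.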