andirey closed this issue 4 years ago
I'd suggest using Parquet files for this use case. We will eventually have some compression options with Feather, but it's not a short-term priority (and no one is paying for this work to be done).
@wesm Thanks for the advice. I consider the "fst" package and format the most feasible alternative for compressed files, and it also shows faster read/write speeds. A pity, because I love feather and have used it many times.
I did my own investigation into this and found mixed results.
We've implemented LZ4 and ZSTD compression in "Feather V2", coming in the next Apache Arrow release.
Wow! Is it possible to test it now? How can I handle existing data in the feather format alongside the new compressed one? Are there any changes to the read_feather/write_feather functions? Great news, thanks a lot!
You can install a nightly Arrow build and try it out:
https://github.com/apache/arrow/blob/master/r/README.md#installing-a-development-version
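Roughly, with a recent development build, writing a compressed Feather V2 file should look something like the sketch below (the compression argument reflects the current development API and could still change before the release; the data frame is just an example):

library(arrow)

df <- data.frame(x = rnorm(1e6), y = sample(letters, 1e6, replace = TRUE))

# Feather V2 with ZSTD compression; "lz4" is the other supported codec
write_feather(df, "data.feather", compression = "zstd")

# No extra arguments are needed to read it back; the codec is detected from the file metadata
df2 <- read_feather("data.feather")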
I need to save feather data in a more compact format. After running several tests I found that using the "zip" function reduces the size of the "feather" data file by up to 90%.
So there is a simple question: is there any way to save data as a "feather" file with, let's say, an argument like compressed = "zip" to save disk space?
library(feather)
library(zip)

# Write data in feather format
write_feather(df, "data.feather")

# Write data in zipped feather format
zip("data.feather.zip", "data.feather")
Is there any way to have commands like the ones below, without using disk space for a temporary file?
write_feather(df, "data.feather.zip", format = "zip") df <- read_feather("data.feather.zip", format = "zip")
Thanks!