xiaodaigh / JDF.jl

Julia DataFrames serialization format
MIT License

Implement a pipeline system for compression #10

Open xiaodaigh opened 4 years ago

xiaodaigh commented 4 years ago

Thinking about it, JDF can actually be thought of as a pipeline:

raw data -> blosc compressed/rle compressed -> written

Conceivably, we can have more elaborate pipelines like

raw data -> rle compress -> blosc compress -> written

Before we start, we do not know which pipeline compresses better, e.g.

raw string data -> string array -> blosc compress -> written

or

raw string data -> rle compress -> blosc compress -> written

So we could have a "comparison pipeline" concept: run both candidate pipelines and compare the compressed results before writing to disk. All we have to do is remember the operations in each pipeline so that they can be re-run.
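To make the idea concrete, here is a rough sketch of how a comparison pipeline might look. It is written in Python using only the standard library, and it is not JDF.jl code: `zlib` stands in for Blosc, the RLE stage is a toy, and every name in it (`STAGES`, `CANDIDATES`, `run_pipeline`, `choose_pipeline`, `undo_pipeline`) is made up for illustration. The point is only that a pipeline is a remembered list of stage names, so it can be re-run on write and reversed on read.

```python
import json
import zlib
from itertools import groupby

# All names below are hypothetical; zlib is a stand-in for Blosc.

def to_bytes(strings):
    # "string array" stage: serialise a list of strings to bytes
    return json.dumps(strings).encode("utf-8")

def from_bytes(blob):
    return json.loads(blob.decode("utf-8"))

def rle_encode(strings):
    # Toy run-length encoding: [[value, run_length], ...] serialised to bytes
    runs = [[value, len(list(group))] for value, group in groupby(strings)]
    return json.dumps(runs).encode("utf-8")

def rle_decode(blob):
    runs = json.loads(blob.decode("utf-8"))
    return [value for value, count in runs for _ in range(count)]

# Each stage name maps to an (encode, decode) pair.
STAGES = {
    "string_array": (to_bytes, from_bytes),
    "rle": (rle_encode, rle_decode),
    "compress": (zlib.compress, zlib.decompress),
}

# The two candidate pipelines from the example above.
CANDIDATES = [
    ["string_array", "compress"],
    ["rle", "compress"],
]

def run_pipeline(stage_names, data):
    # Apply the encode half of each stage, in order.
    for name in stage_names:
        encode, _ = STAGES[name]
        data = encode(data)
    return data

def choose_pipeline(data, candidates=CANDIDATES):
    # Run every candidate and keep the one with the smallest output.
    results = [(names, run_pipeline(names, data)) for names in candidates]
    return min(results, key=lambda item: len(item[1]))

def undo_pipeline(stage_names, blob):
    # Replay the remembered stages in reverse, using the decode halves.
    for name in reversed(stage_names):
        _, decode = STAGES[name]
        blob = decode(blob)
    return blob

column = ["a"] * 1000 + ["b"] * 1000
stage_names, encoded = choose_pipeline(column)
# stage_names would be stored next to the bytes so the reader knows what to undo.
assert undo_pipeline(stage_names, encoded) == column
```

Running every candidate on the full column costs extra time on write; comparing on a sample of the column instead would be one possible trade-off.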