Fonsan closed this 8 years ago
Perhaps it would be enough to just add a tip in the README, since the msgpack protocol is a bit more verbose than protobuf.
I agree with adding documentation to the README. As you can see in my default processing chain, I have about 10 processing stages to stream the collections through -- gzipping and ungzipping at each stage isn't needed; compression is only useful before writing to disk.
@adamfranco What are your thoughts on moving to a `.msgpack.gz` standard from the current `.msgpack` standard? Here are some example results.

We could easily abstract the reading and writing of gzipped messagepacked data into a file that would define our protocol and could be used throughout the project, or it could be up to the user to add `| gzip` in their command chain and not take that design decision for them. I am leaning towards `| gzip` and keeping the current python calls directly to MessagePack, but using `| gzip` and `gunzip` myself.
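
For comparison, the first option (one shared file that defines the protocol) might look roughly like the sketch below. This is only an illustration, not code from the project: the helper name `open_packed` and the rule of switching on a `.gz` suffix are assumptions. It uses only the standard library and treats the MessagePack payload as opaque bytes, so the existing direct calls to MessagePack could keep reading from and writing to whatever file object it returns.

```python
import gzip
import os
import tempfile


def open_packed(path, mode="rb"):
    """Open a packed-data file, gzip-wrapping it when the name ends in .gz.

    Hypothetical helper: the name and the suffix convention are assumptions,
    not part of the project.
    """
    if "b" not in mode:
        raise ValueError("packed data must be opened in binary mode")
    if path.endswith(".gz"):
        # gzip.open returns a file-like object that (de)compresses on the fly.
        return gzip.open(path, mode)
    return open(path, mode)


def demo():
    """Write the same bytes to both file kinds and return their sizes on disk."""
    # b"\x93\x01\x02\x03" is the MessagePack encoding of [1, 2, 3];
    # it is repeated here only to give gzip something to compress.
    payload = b"\x93\x01\x02\x03" * 1000
    workdir = tempfile.mkdtemp()
    sizes = {}
    for name in ("sample.msgpack", "sample.msgpack.gz"):
        path = os.path.join(workdir, name)
        with open_packed(path, "wb") as handle:
            handle.write(payload)
        with open_packed(path, "rb") as handle:
            assert handle.read() == payload  # round trip is lossless
        sizes[name] = os.path.getsize(path)
    return sizes


if __name__ == "__main__":
    for name, size in demo().items():
        print(name, size, "bytes")
```

The trade-off is the one raised above: a helper like this makes the compression choice inside the project, whereas `| gzip` and `gunzip` in the command chain leave it to the user and avoid compressing between intermediate stages.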