vasi / pixz

Parallel, indexed xz compressor
BSD 2-Clause "Simplified" License

Server mode #86

Open nathell opened 4 years ago

nathell commented 4 years ago

This is more of a loose idea than an actual feature request, so feel free to close if it's out of scope.

I'd like to use pixz as a kind of read-only key-value store, backed by a pixz-compressed tar file. The "keys" are filenames of archive entries, and "values" are contents of the files.

Currently, I'm just exec'ing a shell pipeline (pixz -x path/to/file < input.tpxz | tar x) for every key I need to retrieve. This is very slow. I haven't measured it, but I expect most of the slowdown comes from launching two processes per lookup and from pixz re-reading the index every time.

So I envision that pixz might run in a "server mode". You could say

pixz --server input.tpxz --port 8989

and pixz would launch a server on that port (HTTP? some simpler protocol?) that would keep running, listen for filenames, and reply with the contents of the corresponding archive entries.
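
To make the idea concrete, here is a minimal sketch of such a long-running lookup server in Python. It is not pixz code and does not use pixz's index: Python's tarfile module stands in for it, so the member table is read once at startup, but lookups inside an xz-compressed archive are not truly random-access. The names ArchiveStore, LookupHandler and make_server, and the line-based protocol (one filename per line, reply "OK <length>" followed by the bytes, or "MISSING"), are assumptions made up for illustration.

```python
import socketserver
import tarfile
import threading


class ArchiveStore:
    """Read-only key-value view of a tar archive: member name -> file bytes."""

    def __init__(self, path):
        # Open the archive once and keep the member table in memory, so
        # repeated lookups do not pay the startup cost again.
        self._tar = tarfile.open(path, "r:*")
        self._members = {m.name: m for m in self._tar.getmembers() if m.isfile()}
        # The underlying file object is shared, so serialize reads.
        self._lock = threading.Lock()

    def get(self, name):
        member = self._members.get(name)
        if member is None:
            return None
        with self._lock:
            return self._tar.extractfile(member).read()


class LookupHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # Hypothetical protocol: the client sends one filename per line.
        # The reply is "OK <length>\n" plus the raw bytes, or "MISSING\n".
        for line in self.rfile:
            name = line.decode("utf-8").rstrip("\r\n")
            data = self.server.store.get(name)
            if data is None:
                self.wfile.write(b"MISSING\n")
            else:
                self.wfile.write(b"OK %d\n" % len(data))
                self.wfile.write(data)


def make_server(path, port, host="127.0.0.1"):
    """Build (but do not start) a threaded TCP server backed by the archive."""
    server = socketserver.ThreadingTCPServer((host, port), LookupHandler)
    server.daemon_threads = True
    server.store = ArchiveStore(path)
    return server
```

Something like make_server("archive.tar.xz", 8989).serve_forever() would then play the role of the proposed --server invocation (the archive name here is a placeholder). A real implementation would presumably replace ArchiveStore with pixz's own block index so that each lookup only decompresses the blocks it needs.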

I'm guessing this change, if implemented, would be big enough to warrant a separate piece of software. I intend to fork pixz and take a stab at it, but I'd like to gauge whether there's interest in merging it back upstream.

vasi commented 4 years ago

That's a cool idea! As a whole it feels a bit specialized, maybe awkward for upstreaming into pixz. But you could take pixz and factor out a libpixz for things like:

That'd be welcome upstream as long as there are some basic tests, and should make it pretty easy to build your "server mode" tool.

By the way, I'm curious what your use case is. Did you consider using something like squashfs instead?