facebook / zstd

Zstandard - Fast real-time compression algorithm
http://www.zstd.net

GNU Jobserver support #3317

Open haampie opened 1 year ago

haampie commented 1 year ago

Is your feature request related to a problem? Please describe.

When running multiple zstd processes in parallel, it's difficult to know how many threads to give to each process. Giving each process as many threads as are available results in oversubscription, while giving each of them only one may not make optimal use of resources.

Describe the solution you'd like

GNU Make has the concept of a jobserver, which ensures that make -j <n> doesn't run more than n jobs in parallel, even across recursive invocations. It is widely supported (e.g. gcc's LTO spawns its parallel jobs while respecting the jobserver, as do clang, cargo, ...) and is trivial to implement: https://www.gnu.org/software/make/manual/html_node/POSIX-Jobserver.html.

This way, make -j n can be used to control the concurrency, and any zstd process could automatically take as many jobs as available to compress / decompress.
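For illustration, here is a rough Python sketch of what a jobserver client does, based on the POSIX jobserver description in the GNU Make manual: read --jobserver-auth from MAKEFLAGS, take one byte from the pipe per extra job, and write the same bytes back when done. The function names are hypothetical and this is not zstd code. It ignores error handling, and the select()-then-read() step can still block if another client takes the token in between.

```python
import os
import re
import select


def parse_jobserver_auth(makeflags):
    """Return ("fifo", path), ("pipe", (read_fd, write_fd)), or None."""
    match = None
    # MAKEFLAGS may carry several --jobserver-auth options; keep the last one.
    for match in re.finditer(r"--jobserver-auth=(\S+)", makeflags):
        pass
    if match is None:
        return None
    value = match.group(1)
    if value.startswith("fifo:"):
        return ("fifo", value[len("fifo:"):])
    read_fd, write_fd = (int(part) for part in value.split(","))
    if read_fd < 0 or write_fd < 0:
        # Negative descriptors cannot be valid: treat the jobserver as unavailable.
        return None
    return ("pipe", (read_fd, write_fd))


def try_acquire_tokens(read_fd, wanted):
    """Take up to `wanted` tokens that are available right now."""
    tokens = []
    while len(tokens) < wanted:
        readable, _, _ = select.select([read_fd], [], [], 0)
        if not readable:
            break
        byte = os.read(read_fd, 1)  # one byte is one job slot
        if not byte:
            break
        tokens.append(byte)
    return tokens


def release_tokens(write_fd, tokens):
    """Give every token back by writing the same bytes that were read."""
    for token in tokens:
        os.write(write_fd, token)
```

A process started by make always owns one implicit job slot, so a client could run 1 + len(tokens) worker threads and should return the tokens before exiting, including on error paths.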

Additional context

This is particularly useful when Makefiles (or any other program conforming to the GNU Make jobserver) are used to create and compress multiple tarballs.

gcflymoto commented 1 year ago

@haampie Could this be handled by a generic wrapper around zstd which implements the jobserver protocol and executes zstd appropriately under the hood?

Cyan4973 commented 1 year ago

At the API integration level, libzstd offers the concept of a shared thread pool, which makes it possible to limit the total number of threads used for compression, whatever the number of concurrent requests.

However, if your question is about invoking the zstd CLI, then indeed, it seems the solution should rather come from a wrapper above zstd, which would control the total number of instances active at the same time (like make controls the number of active gcc instances).

That being said, it seems you are looking for something even more complex, which would not only control the total number of instances active at the same time, but also allow each instance to use multiple threads, dynamically resized depending on workload, and yet ensure that the total number of threads across these instances doesn't exceed a threshold.

First, let's note that this is not something GNU Make could achieve. It only controls the total number of instances, with the naive assumption that each instance uses only one thread. That limited version is likely achievable, using an intermediate controller program (or Python script).
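As a rough idea of what such a controller script could look like, the sketch below caps how many child processes run at once and starts each zstd with a single thread. The helper names run_limited and zstd_command are made up for this example. -T#, -q and -f are standard zstd CLI options, but the exact command line is an assumption to adapt.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def zstd_command(path, threads=1):
    """Build a zstd command line that compresses `path` with a fixed thread count."""
    return ["zstd", f"-T{threads}", "-q", "-f", path]


def run_limited(commands, max_instances):
    """Run every command, never more than `max_instances` at the same time.

    Returns the exit codes in the same order as `commands`.
    """
    def run(command):
        return subprocess.run(command).returncode

    # Each pool thread only waits on one child process, so the pool size
    # is the cap on concurrently running instances.
    with ThreadPoolExecutor(max_workers=max_instances) as pool:
        return list(pool.map(run, commands))
```

For example, run_limited([zstd_command(p) for p in tarballs], 4) would compress a list of tarballs with at most four single-threaded zstd processes alive at once. It does not resize thread counts dynamically, which is the harder part discussed here.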

Now, capping the total number of active threads across multiple workloads is something that the shared thread pool model mentioned above could do. But then again, it's a totally different integration model, unsuitable for the command line interface experience.

It seems you are looking for fairly advanced functionality, and while it can be imagined and even done, it still has to be done.

That's a non-trivial amount of work that someone would have to put in.