A pipe often has a single bottleneck (usually a complex CG), so even though the pipe is multi-process, the speedup from that parallelism is limited. Splitting the input into ~4 parts and running one pipe per part can thus take full advantage of the available CPUs.
This should work both per-corpus and across corpora, so runs should internally be changed to a single task list.
(quick'n'dirty per-corpus https://github.com/TinoDidriksen/regtest/commit/c18ed0cb1fcb69ee5e38d81d4886bb0f95c8afed)
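
A rough sketch of the single task list idea, in Python purely for illustration (this is not how regtest is implemented). The names `split_lines`, `build_tasks`, `run_pipe`, and `run_all` are hypothetical, and the sketch assumes the pipe treats each input line independently; real corpora may need to be split on sentence boundaries instead.

```python
from concurrent.futures import ThreadPoolExecutor
import subprocess


def split_lines(lines, parts=4):
    """Split a list of lines into up to `parts` contiguous, near-equal chunks."""
    parts = max(1, min(parts, len(lines)))
    size, extra = divmod(len(lines), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional line each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(lines[start:end])
        start = end
    return chunks


def build_tasks(corpora, parts=4):
    """Flatten all corpora into one list of (corpus name, chunk index, lines)."""
    return [
        (name, index, chunk)
        for name, lines in corpora.items()
        for index, chunk in enumerate(split_lines(lines, parts))
    ]


def run_pipe(cmd, chunk):
    """Feed one chunk to the pipe command on stdin and return its output lines."""
    proc = subprocess.run(
        cmd, input="".join(chunk), capture_output=True, text=True, check=True
    )
    return proc.stdout.splitlines(keepends=True)


def run_all(corpora, cmd, parts=4, workers=4):
    """Run every chunk of every corpus through `cmd`, `workers` at a time."""
    tasks = build_tasks(corpora, parts)
    results = {name: [] for name in corpora}
    # Threads suffice here: the heavy work happens in the child processes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        outputs = pool.map(lambda task: run_pipe(cmd, task[2]), tasks)
        # map() yields results in task order, so chunks are re-joined in order.
        for (name, _, _), out in zip(tasks, outputs):
            results[name].extend(out)
    return results
```

Because all chunks from all corpora sit in one list, a small corpus that finishes early frees its worker for chunks of a larger one, which covers the per-corpus and across-corpora cases with the same scheduler.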