Adds support for multithreaded model inference using rayon. Both `run` and `run_for` can be executed in threaded mode by enabling the `threaded` feature for `cervo-runtime`.
`run_threaded()` runs all models that need to be executed, while `run_for_threaded()` first prepares the models that need to be executed on a single thread and then iterates over them until `duration` has expired.
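The difference between the two modes can be illustrated with a minimal, self-contained sketch. It is not cervo's implementation: it uses `std::thread::scope` instead of rayon so that it builds without extra dependencies, and `Model`, `infer`, `run_batch`, and the `workers` parameter are hypothetical names introduced only for this illustration.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a model with queued inputs (not cervo's type).
struct Model {
    id: usize,
    pending: Vec<f32>,
}

impl Model {
    /// Placeholder "inference": sums the queued inputs.
    fn infer(&self) -> f32 {
        self.pending.iter().sum()
    }
}

/// Runs one batch of models, one scoped thread per model, and returns
/// the results in the same order as the input batch.
fn run_batch(batch: &[&Model]) -> Vec<(usize, f32)> {
    thread::scope(|s| {
        let handles: Vec<_> = batch
            .iter()
            .copied()
            .map(|m| s.spawn(move || (m.id, m.infer())))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}

/// Sketch of the `run_threaded` idea: execute every model that has work.
fn run_threaded(models: &[Model]) -> Vec<(usize, f32)> {
    let ready: Vec<&Model> = models.iter().filter(|m| !m.pending.is_empty()).collect();
    run_batch(&ready)
}

/// Sketch of the `run_for_threaded` idea: collect the ready models on the
/// calling thread, then work through them in batches until the time budget
/// is used up. `workers` (assumed here) is the batch size.
fn run_for_threaded(models: &[Model], budget: Duration, workers: usize) -> Vec<(usize, f32)> {
    let start = Instant::now();
    // Preparation step: single-threaded.
    let ready: Vec<&Model> = models.iter().filter(|m| !m.pending.is_empty()).collect();

    let mut results = Vec::new();
    for batch in ready.chunks(workers.max(1)) {
        // Stop before starting a new batch once the budget is exhausted.
        if start.elapsed() >= budget {
            break;
        }
        results.extend(run_batch(batch));
    }
    results
}
```

In this sketch the budget is only checked between batches, so a batch that has already started always runs to completion; whether the actual implementation behaves the same way is not stated here.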
This PR also adds benchmarks that measure the speedup for different batch sizes, model sizes, and numbers of models, along with accompanying documentation in `/docs/`.