I was reading through the code and trying out a large number of workers with a tiny model. Often I found my cpu and memory getting bottlenecked because of the large number of actors launching Python workers.
Wondering what was the motivation to not use threaded or async actors to achieve higher concurrency?
I was reading through the code and trying out a large number of workers with a tiny model. Often I found my cpu and memory getting bottlenecked because of the large number of actors launching Python workers.
Wondering what was the motivation to not use threaded or async actors to achieve higher concurrency?