This is the second part of the auto-configuration changes. It doesn't change the behavior for any explicitly set value, but the new defaults lead to the following:
Threads default to the value requested by the backend, plus 1 for non-cpu backends.
Each backend requests 1 thread, except for cuda with multi-stream, multiplexing and round-robin.
Task workers default to 0 for cpu backends, otherwise to cpu_cores/(threads-1), capped at 4.
The multiplexing and roundrobin backends suggest a minibatch size equal to the minimum of the values suggested by the backends they use. For the demux backend, this value is multiplied by the number of (backend) threads.
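The rules above can be sketched roughly as follows. This is only an illustration in Python, not the code in this PR: the function and parameter names are hypothetical, integer division is assumed for cpu_cores/(threads-1), and the demux case is assumed to multiply the minimum of the sub-backend values by the thread count.

```python
from typing import Optional, Sequence


def default_threads(backend_threads: int, is_cpu: bool) -> int:
    """Backend-requested thread count, plus 1 for non-cpu backends."""
    return backend_threads if is_cpu else backend_threads + 1


def default_task_workers(cpu_cores: int, threads: int, is_cpu: bool) -> int:
    """0 for cpu backends, otherwise cpu_cores/(threads-1) capped at 4."""
    if is_cpu:
        return 0
    # For non-cpu backends threads is at least 2, so threads - 1 >= 1.
    # Integer division is an assumption of this sketch.
    return min(4, cpu_cores // (threads - 1))


def suggested_minibatch(sub_backend_values: Sequence[int],
                        demux_threads: Optional[int] = None) -> int:
    """Minimum of the sub-backend suggestions; demux scales it by threads."""
    base = min(sub_backend_values)
    if demux_threads is not None:
        return base * demux_threads
    return base
```

For example, under these assumptions a non-cpu backend that requests 1 thread on a 16-core machine ends up with 2 threads and 4 task workers, while a cpu backend ends up with 1 thread and 0 task workers.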
This changes the default for cpu backends to 1 thread and may reduce task-workers (apparently this is generally expected of cpu engines). Multiplexing backends should now get a decent starting configuration.