Currently CPUBackend is not NUMA aware, so Etaler might run slower on a NUMA system (multi-socket servers, AMD 1st gen EPYC, AMD ThreadRipper). Binding threads to NUMA nodes, allocating memory from NUMA-local memory, and running a separate backend on each NUMA node should allow Etaler to run fast on NUMA systems.
However, TBB is not NUMA aware, so we might need to drop TBB to support NUMA.
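
The thread-binding step could look roughly like the sketch below. It is not part of Etaler: `parseCpuList` and `bindCurrentThread` are hypothetical helper names, and the code assumes Linux with glibc, where the CPUs of a node are listed in `/sys/devices/system/node/node<N>/cpulist` and a thread can be pinned with `pthread_setaffinity_np`.

```cpp
#include <pthread.h>
#include <sched.h>

#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: parse a Linux cpulist string such as "0-3,8-11"
// (the format used by /sys/devices/system/node/node<N>/cpulist) into CPU ids.
std::vector<int> parseCpuList(std::string list)
{
    // sysfs files end with a newline; drop trailing whitespace first.
    while (!list.empty() && std::isspace(static_cast<unsigned char>(list.back())))
        list.pop_back();

    std::vector<int> cpus;
    std::stringstream ss(list);
    std::string token;
    while (std::getline(ss, token, ',')) {
        if (token.empty())
            continue;
        const auto dash = token.find('-');
        const int first = std::stoi(token.substr(0, dash));
        const int last = (dash == std::string::npos)
            ? first
            : std::stoi(token.substr(dash + 1));
        for (int cpu = first; cpu <= last; ++cpu)
            cpus.push_back(cpu);
    }
    return cpus;
}

// Hypothetical helper: restrict the calling thread to the given CPUs.
// Returns true on success. Linux/glibc only.
bool bindCurrentThread(const std::vector<int>& cpus)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    for (int cpu : cpus) {
        if (cpu >= 0 && cpu < CPU_SETSIZE)
            CPU_SET(cpu, &set);
    }
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set) == 0;
}
```

A worker thread would read the `cpulist` file of its node, pass the contents through `parseCpuList`, and call `bindCurrentThread` before allocating its buffers. Under Linux's default first-touch policy, memory the thread then initializes is normally placed on its local node, which covers the NUMA-local allocation part without an extra library.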