spdk / spdk.github.io

SPDK organization web pages

NUMA #20

Closed jimharris closed 2 weeks ago

jimharris commented 4 weeks ago

Currently SPDK does not try to do NUMA-local allocations.

We currently have iobuf, which consolidates a lot of the memory pools in SPDK. It seems relatively straightforward to have multiple pools, one for each NUMA node, and then do the allocations based on the NUMA node of the calling spdk_thread. We should be able to free the buffer to the correct pool based on the NUMA node of the calling spdk_thread.

Some complications:

1. Dynamic scheduler: ideally we don't move an spdk_thread from one NUMA node to another, but what if that spdk_thread is completely idle?
2. Validation that buffers are freed to the correct pool: today it is perfectly valid for one spdk_thread to allocate a buffer and another spdk_thread to free it. We want to keep this behavior, but need to make sure that if spdk_thread A on NUMA node 0 allocates a buffer, and spdk_thread B on NUMA node 1 frees it, the buffer goes back into the NUMA node 0 pool.
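To make the second complication concrete, here is a minimal sketch of per-node pools where each buffer remembers the node it came from, so a free from any thread returns it to the original pool. This is not SPDK code and does not use the iobuf API: `buf_hdr`, `node_pool`, `pools_init`, `buf_get`, `buf_put` and `pool_count` are hypothetical names, `NUM_NODES` and `BUF_SIZE` are assumed values, and plain `malloc` stands in for a real node-local allocation. It is also single-threaded; a real cross-thread free would need synchronization.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define NUM_NODES 2     /* assumed number of NUMA nodes */
#define BUF_SIZE  4096  /* assumed buffer payload size */

/* Hypothetical header stored in front of each buffer payload. */
struct buf_hdr {
	struct buf_hdr *next;  /* free-list link */
	uint32_t numa_id;      /* node whose pool owns this buffer */
};

/* Hypothetical per-node pool: a simple free list plus a counter. */
struct node_pool {
	struct buf_hdr *free_list;
	size_t count;
};

static struct node_pool g_pools[NUM_NODES];

/* Fill each node's pool. malloc is only a stand-in for node-local memory. */
int pools_init(size_t bufs_per_node)
{
	for (uint32_t n = 0; n < NUM_NODES; n++) {
		for (size_t i = 0; i < bufs_per_node; i++) {
			struct buf_hdr *h = malloc(sizeof(*h) + BUF_SIZE);
			if (h == NULL) {
				return -1;
			}
			h->numa_id = n;
			h->next = g_pools[n].free_list;
			g_pools[n].free_list = h;
			g_pools[n].count++;
		}
	}
	return 0;
}

/* Allocate from the pool of the caller's NUMA node. */
void *buf_get(uint32_t caller_numa_id)
{
	if (caller_numa_id >= NUM_NODES) {
		return NULL;
	}
	struct node_pool *p = &g_pools[caller_numa_id];
	struct buf_hdr *h = p->free_list;
	if (h == NULL) {
		return NULL;  /* pool for this node is empty */
	}
	p->free_list = h->next;
	p->count--;
	return h + 1;  /* payload starts right after the header */
}

/* Free to the owning pool, regardless of which node the caller runs on. */
void buf_put(void *buf)
{
	struct buf_hdr *h = (struct buf_hdr *)buf - 1;
	struct node_pool *p = &g_pools[h->numa_id];
	h->next = p->free_list;
	p->free_list = h;
	p->count++;
}

/* Number of free buffers currently in a node's pool. */
size_t pool_count(uint32_t numa_id)
{
	return g_pools[numa_id].count;
}
```

Note that `buf_put` takes no node argument: the owning node is read from the buffer itself, so a thread on node 1 can free a buffer that a thread on node 0 allocated and it still lands in the node 0 pool. Whether the owning node is best stored in a header, derived from the buffer address, or tracked some other way is an open design choice.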

tomzawadzki commented 2 weeks ago

[45min]

tomzawadzki commented 2 weeks ago

@jimharris gave a presentation on the proposal to make memory allocation in SPDK NUMA-aware, with gradual changes that could be made to switch over from the current design. Changes to the reactor/dynamic scheduler will be required to make sure that threads are not moved between NUMA nodes.

One item to double-check is how DPDK currently behaves when ANY NUMA node is passed: is the current one used?

Another concern was which NUMA node to select: the one local to the NIC or the one local to the NVMe device?

@jimharris could you please attach the slides?