Open vsbenas opened 5 years ago
Thank you for the pull request!
Would it be enough to allow each Rpc object to choose its own NUMA node, rather than inheriting the Nexus's NUMA node? That could easily be done by adding an optional argument to the Rpc constructor.
The ability to work without hugepages is nice, but I would like to avoid the additional complexity unless we really need it.
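A minimal sketch of the optional-argument idea suggested above. `SketchNexus` and `SketchRpc` are hypothetical stand-ins, not eRPC's real classes or constructor signatures; the sketch only shows how a per-Rpc NUMA node could default to the Nexus's node.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical stand-in for the Nexus: it only carries a default NUMA node.
struct SketchNexus {
  size_t numa_node;
};

// Hypothetical stand-in for an Rpc endpoint.
class SketchRpc {
 public:
  // If no node is passed, inherit the Nexus's NUMA node.
  explicit SketchRpc(const SketchNexus &nexus,
                     std::optional<size_t> numa_node = std::nullopt)
      : numa_node_(numa_node.value_or(nexus.numa_node)) {}

  size_t numa_node() const { return numa_node_; }

 private:
  size_t numa_node_;
};
```

With this shape, existing callers that pass only the Nexus keep their current behavior, and a caller on a machine with an unusual topology can override the node for one endpoint.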
That depends on whether Rpc objects register the node's memory with the NIC. I could not create a Nexus on the second NUMA node: it fails with "Failed to register mr".
That node is not connected to the network.
I don't know if such a scenario is at all common (two nodes, only one on the network), but the performance is much better using regular memory in our case. So "do we need it" really depends on how common such a setup is.
About the complexity, it adds an extra branch inside the `~HugeAlloc()` loop and one branch in `HugeAlloc()`. In terms of performance it should be negligible, but I understand that the code becomes more cluttered.
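To make the "one branch on allocation, one branch in the destructor loop" point concrete, here is a rough sketch. It is not the code from this pull request: `SketchAlloc`, its `reserve()` method, and the constant values are assumptions for illustration, and NUMA binding of the hugepage path is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <limits>
#include <vector>
#include <sys/mman.h>

// Assumed values for illustration; not taken from eRPC's headers.
constexpr size_t kNoNumaNode = std::numeric_limits<size_t>::max();
constexpr size_t kHugePageSize = 2 * 1024 * 1024;

class SketchAlloc {
 public:
  explicit SketchAlloc(size_t numa_node) : numa_node_(numa_node) {}
  SketchAlloc(const SketchAlloc &) = delete;
  SketchAlloc &operator=(const SketchAlloc &) = delete;

  // Reserve `size` bytes (a multiple of kHugePageSize); nullptr on failure.
  void *reserve(size_t size) {
    void *buf = nullptr;
    if (numa_node_ == kNoNumaNode) {
      // Extra branch on the allocation path: plain, aligned heap memory.
      if (posix_memalign(&buf, kHugePageSize, size) != 0) return nullptr;
    } else {
#ifdef MAP_HUGETLB
      // Hugepage-backed path (binding to numa_node_ is omitted here).
      buf = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
      if (buf == MAP_FAILED) return nullptr;
#else
      return nullptr;  // No hugepage support on this platform.
#endif
    }
    regions_.push_back({buf, size});
    return buf;
  }

  ~SketchAlloc() {
    for (const Region &r : regions_) {
      // Extra branch inside the destructor loop: release with the
      // function that matches how the region was obtained.
      if (numa_node_ == kNoNumaNode) {
        free(r.buf);
      } else {
        munmap(r.buf, r.size);
      }
    }
  }

  size_t num_regions() const { return regions_.size(); }

 private:
  struct Region {
    void *buf;
    size_t size;
  };
  size_t numa_node_;
  std::vector<Region> regions_;
};
```

Both branches sit on the slow path (reserving and releasing whole regions), which is why the per-request cost should be negligible.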
Thanks for the details. It's common to have only one CPU connected to the network, so it's important that eRPC works in this setting.
I'm unsure why registration fails with the second NUMA node. I'll look at this over the next few days.
Our machines have 2 NUMA nodes, but only one is connected to the network. Hence, running eRPC on half of the cores is efficient, but the other half experience significant performance issues.
This pull request is inspired by HERD, which uses heap memory when numa_node is set to -1: https://github.com/efficient/rdma_bench/blob/master/libhrd/hrd_conn.c#L117
It is now possible to pass `erpc::kNoNumaNode` as numa_node to the Nexus constructor so that NUMA memory is not used by eRPC. Obviously this ends up being slightly slower when such nodes are available, but in our configuration it achieved a ~~20.6%~~ 23.1% performance increase, so I believe it's a good option to have for eRPC.

TODO:
- Scaling up to kHugePageSize is unnecessary in this configuration, so to save memory it might be useful to avoid it. Update: It is still better to scale up in cases where the buffer size varies between requests, because eRPC reuses buffers. This increased performance by 3%.
- Add a test for this configuration.
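For reference, "scaling up to kHugePageSize" amounts to rounding each reservation up to a whole number of huge pages. The helper below is a hypothetical sketch, not eRPC's code, and the 2 MB value is an assumption.

```cpp
#include <cassert>
#include <cstddef>

// Assumed huge page size for illustration (2 MB).
constexpr size_t kHugePageSize = 2 * 1024 * 1024;

// Round `size` up to the next multiple of kHugePageSize.
// A size that is already a multiple is returned unchanged.
constexpr size_t round_up_hugepage(size_t size) {
  return ((size + kHugePageSize - 1) / kHugePageSize) * kHugePageSize;
}
```

Because every reservation lands on one of a few rounded sizes, memory freed after one request is more likely to fit a later request of a different size, which matches the buffer-reuse effect described in the update above.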