Open bmerry opened 8 years ago
There is also some work that can be done on sending. The biggest change here is simply to batch more packets together, to amortize the various overheads. Asking for only the last packet in each batch to be acknowledged may also help.
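As a rough illustration of the two ideas (chaining packets into batches, and requesting a completion only for the last packet of each batch), here is a minimal sketch. It does not use the real verbs API: `work_request`, `FLAG_SIGNALED`, `chain_batches` and `count_signaled` are hypothetical stand-ins invented for this example. With libibverbs the corresponding pieces would be `ibv_send_wr` (its `next` and `send_flags` fields), the `IBV_SEND_SIGNALED` flag, and a queue pair created with `sq_sig_all` set to 0.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the "signalled" send flag; the numeric value
// here is illustrative only.
constexpr std::uint32_t FLAG_SIGNALED = 1u << 1;

// Hypothetical stand-in for a send work request.
struct work_request
{
    std::uint64_t id = 0;          // identifies the packet when its completion arrives
    std::uint32_t flags = 0;       // FLAG_SIGNALED if a completion is requested
    work_request *next = nullptr;  // links requests so a whole batch is posted in one call
};

// Link `wrs` into chains of at most `batch_size` requests. Only the last
// request of each chain asks for a completion. Returns the head of each
// chain; each head would be passed to a single post call.
std::vector<work_request *> chain_batches(std::vector<work_request> &wrs,
                                          std::size_t batch_size)
{
    assert(batch_size > 0);
    std::vector<work_request *> heads;
    for (std::size_t i = 0; i < wrs.size(); i++)
    {
        const bool first = (i % batch_size == 0);
        const bool last = ((i + 1) % batch_size == 0) || (i + 1 == wrs.size());
        wrs[i].id = i;
        wrs[i].flags = last ? FLAG_SIGNALED : 0;
        wrs[i].next = last ? nullptr : &wrs[i + 1];
        if (first)
            heads.push_back(&wrs[i]);
    }
    return heads;
}

// Number of completions the sender would have to poll for.
std::size_t count_signaled(const std::vector<work_request> &wrs)
{
    std::size_t n = 0;
    for (const auto &wr : wrs)
        if (wr.flags & FLAG_SIGNALED)
            n++;
    return n;
}
```

For example, calling `chain_batches` on 10 requests with a batch size of 4 gives three chains (4, 4 and 2 requests) and three signalled requests instead of ten. One caveat to keep in mind with a real send queue: slots used by unsignalled requests are generally only reclaimed once a later signalled completion has been polled, so the batch size should not exceed the send queue depth.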
The send batching has long since been implemented, and 3.0 will implement the single completion event per batch. Newer versions of rdma-core offer some other opportunities for optimisation:
Things that documentation or common sense suggests might improve performance with MLNX_OFED