Try out further ibverbs optimisations

ska-sa / spead2

Library for the Streaming Protocol for Exchange of Astronomical Data (SPEAD)

http://spead2.readthedocs.io/en/latest/

GNU Lesser General Public License v3.0


Try out further ibverbs optimisations #44

Open bmerry opened 8 years ago

bmerry commented 8 years ago

Things that documentation or common sense suggests might improve performance with MLNX_OFED:

using contiguous pages for MRs
batching up acknowledgement of events
setting environment variables to make QPs and CQs use contiguous pages
posting receives in batches (via a linked list) instead of one-at-a-time

bmerry commented 8 years ago

There is also some work that can be done on sending. The biggest change here is simply to batch more packets together, to amortize the various overheads. Asking for a completion only for the last packet in each batch, rather than for every packet, may also help.

bmerry commented 3 years ago

The send batching has long since been implemented, and 3.0 will implement the single completion event per batch. Newer versions of rdma-core offer some other opportunities for optimisation:

Thread domains
APIs for creating CQs, which allow them to be declared single-threaded
APIs for CQ polling that pull attributes of completions on demand
APIs for submitting send work requests