Closed theojepsen closed 3 years ago
The goal of the headroom is to allocate space for DMA-receiving the InfiniBand "Global Routing Header (GRH)", see Section 2.7 in https://www.mellanox.com/related-docs/prod_software/RDMA_Aware_Programming_user_manual.pdf.
In my experience with small networks, NICs don't actually DMA-write the GRH when receiving a RoCE UD packet, but IIRC they do complain if the RECV buffer doesn't have enough space for the GRH plus payload.
I think my RoCE implementation might be inefficient because it starts the TX buffer from the first pkthdr byte, whereas (I suspect) it should start from pkthdr + 40 bytes.
Why do RoCE eRPC packets have 42 bytes of headroom? I understand how this is used for other transport types (e.g. "raw"), but for RoCE it just seems like a waste of space. For example, when I send a 48-byte RPC, there is a 14-byte eRPC header plus 42 bytes of headroom, totaling 104 bytes.
I see that headroom is added when compiling with RoCE enabled: https://github.com/erpc-io/eRPC/blob/master/CMakeLists.txt#L163
However, it looks like this headroom is never used in the InfiniBand transport. The only reference to the headroom is an assertion here: https://github.com/erpc-io/eRPC/blob/master/src/transport_impl/infiniband/ib_transport.cc#L31
I commented out that assertion and re-compiled with `set(CONFIG_HEADROOM 0)`. It doesn't seem to have affected functionality. In fact, it seems to have dramatically reduced latency, especially for small payloads. This is the latency with 40 bytes of headroom: [latency plot omitted]. And after removing headroom: [latency plot omitted].
For 32-byte payloads, it reduced median latency by 1 µs!
Is this a bug? Should RoCE packets have these extra headroom bytes?