Open Lcc-code opened 4 months ago
Hi, when I try this "post_wq_recvs" function, I can poll_cq successfully! I will keep trying to understand how CQEs are obtained for striding RQs, which may be a bit difficult. Thank you very much.
Hi @Lcc-code,
Thanks for taking the time to use NetBlocks. What you have asked is a very interesting question, and it relates to how MLX5's Striding RQs work. As far as I understand, rdma-core doesn't support Striding RQs. If you actually try to use ibv_poll_cq, it works fine initially. However, when you try to read a burst of packets, it fails. Reading directly from the buffers via the mlx5_cqe64 entries allows you to read a large number of posted packets. With the ibv_poll_cq method, you miss packets if more than 128 are enqueued.
Also, just to add, I use this modified version of rdma-core instead of the default one - https://github.com/AjayBrahmakshatriya/rdma-core/tree/mlx5-fix-mprq-wq-post-recv
This has been retrofitted to support striding RQs.
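To illustrate the idea of reading completions straight from the CQ buffer instead of going through ibv_poll_cq, here is a minimal, self-contained sketch of owner-bit polling over a ring of completion entries. It is not NetBlocks' code and does not touch real hardware: `SketchCqe`, `SketchCq`, `next_cqe`, `drain`, and `simulate_hw_write` are hypothetical names, and `SketchCqe` is a simplified stand-in that only borrows the `op_own` and `wqe_counter` field names from mlx5_cqe64. It assumes the usual mlx5 convention that the opcode sits in the high nibble of `op_own` (0xF meaning invalid) and the ownership bit in bit 0.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified stand-in for a completion entry. This is NOT the real
// mlx5_cqe64 layout; it only keeps the two fields the sketch needs.
struct SketchCqe {
    uint16_t wqe_counter = 0;
    uint8_t op_own = 0xF0;  // high nibble: opcode (0xF = invalid), bit 0: owner
};

constexpr uint8_t kOwnerMask = 0x1;
constexpr uint8_t kOpcodeInvalid = 0xF;
constexpr uint8_t kOpcodeValid = 0x2;  // any opcode other than 0xF works here

struct SketchCq {
    std::vector<SketchCqe> ring;   // size must be a power of two
    uint32_t consumer_index = 0;   // free-running, never masked
    explicit SketchCq(uint32_t entries) : ring(entries) {}
};

// Return the next entry owned by software, or nullptr if none is ready.
// The expected owner bit flips every time the consumer index wraps the ring,
// so stale entries from the previous pass are rejected.
inline const SketchCqe* next_cqe(const SketchCq& cq) {
    const uint32_t n = static_cast<uint32_t>(cq.ring.size());
    const SketchCqe& cqe = cq.ring[cq.consumer_index & (n - 1)];
    const uint8_t opcode = static_cast<uint8_t>(cqe.op_own >> 4);
    const uint8_t expected_owner = (cq.consumer_index & n) ? 1 : 0;
    if (opcode == kOpcodeInvalid) return nullptr;
    if ((cqe.op_own & kOwnerMask) != expected_owner) return nullptr;
    return &cqe;
}

// Drain every ready entry in one call and collect the wqe_counter values.
// Real code would also need a load barrier after the ownership check and
// would have to publish the new consumer index to the device.
inline std::size_t drain(SketchCq& cq, std::vector<uint16_t>& out) {
    std::size_t count = 0;
    while (const SketchCqe* cqe = next_cqe(cq)) {
        out.push_back(cqe->wqe_counter);
        ++cq.consumer_index;
        ++count;
    }
    return count;
}

// Test helper that plays the role of the NIC writing one completion.
inline void simulate_hw_write(SketchCq& cq, uint32_t producer_index,
                              uint16_t wqe_counter) {
    const uint32_t n = static_cast<uint32_t>(cq.ring.size());
    SketchCqe& cqe = cq.ring[producer_index & (n - 1)];
    const uint8_t owner = (producer_index & n) ? 1 : 0;
    cqe.wqe_counter = wqe_counter;
    cqe.op_own = static_cast<uint8_t>((kOpcodeValid << 4) | owner);
}
```

In this sketch a single `drain` call consumes everything that is ready, and the consumer index is plain, unsynchronized state, so each `SketchCq` is meant to be polled by one thread only.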
Hi, thank you for your reply.
I'm doing the post recv using the doorbell method:
std::atomic
However, I ran into a problem: the current method works fine on a single thread, but with two or more threads, cqe->wqe_id and cqe->wqe_counter return incorrect values once a thread has received more than the recv depth. This does not happen when a new process, rather than a thread, is started to receive packets.
Do you see a similar situation when multithreading with the current protocol stack? Do you have any similar experience? How can I track this down?
Thanks again for your reply!
Hello, I'm sorry to bother you.
I am a student from China and recently saw your code. I would like to ask you a question:
I saw that you previously tried to obtain the CQEs of a striding RQ using ibv_poll_cq. Were you able to obtain CQEs normally with ibv_poll_cq? Locally, I keep getting IBV_WC_LOC_PROT_ERR.
Additionally, obtaining CQEs with your current method seems somewhat unusual, and I am very confused :<