axboe / liburing

Library providing helpers for the Linux kernel io_uring support
MIT License

CQE overflow when sendmsg_zc + UDP GSO #1055

Closed. pyhd closed this issue 8 months ago.

pyhd commented 8 months ago

I experimented with different CQ sizes; even when the CQ was large enough, overflows still flooded in. I guess that's why the throughput was lower than with the regular copying method.

Setup:

  1. Ring flags: IORING_SETUP_SINGLE_ISSUER | IORING_SETUP_COOP_TASKRUN | IORING_SETUP_DEFER_TASKRUN
  2. io_uring_register_ring_fd and io_uring_register_files
  3. 4 UDP GSO buffers (64K each)

Bug functions:

  1. __io_cqring_overflow_flush before io_submit_sqes
  2. __io_submit_flush_completions -> io_req_cqe_overflow before io_issue_sqe

axboe commented 8 months ago

Can you please be a bit more specific? What's being run, and what do you mean by "bug functions"? It's easier if we don't have to guess at what you mean, so just be explicit about what you think is wrong (e.g., "expected behavior").

As always, a reproducer is worth a thousand words as it covers pretty much everything.

pyhd commented 8 months ago

Sorry for being unclear. Here is the pseudocode:

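io_uring_prep_recv_multishot(main_ring);      /* armed once on the receive ring */
while (1) {
    io_uring_wait_cqe(main_ring);
    while (io_uring_peek_cqe(main_ring) == 0) {
        /* reap the datagram, then queue a zero-copy send on the send ring */
        io_uring_prep_sendmsg_zc(send_ring);
        io_uring_cqe_seen(main_ring);
    }
    nr = io_uring_sq_ready(send_ring);        /* number of queued sends */
    io_uring_submit_and_wait(send_ring, nr);
    io_uring_cq_advance(send_ring, nr);       /* assumes one CQE per send */
}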

The problem is the lower throughput with sendmsg_zc. Is it expected that __io_cqring_overflow_flush runs before io_submit_sqes, and __io_submit_flush_completions before io_issue_sqe? Is it expected that io_submit_sqes shows up together with __io_cqring_overflow_flush, and io_issue_sqe together with io_req_cqe_overflow? I did not see these functions with plain sendmsg, so I suspect them.

isilence commented 8 months ago

__io_cqring_overflow_flush is indeed about overflowed CQEs, and it's expensive. So if you see it, the solution is to size the CQ appropriately.

In terms of overflows, multishot receives are usually more of a hazard, because sends are more predictable, i.e. 2 CQEs per send. And I have no clue how GSO is at play here, as it should be reducing the total number of CQEs.
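The "2 CQEs per send" rule gives a simple lower bound when sizing the CQ of a ring that only carries zero-copy sends. The following is a rough sketch of that arithmetic, not code from liburing: `round_up_pow2` and `cq_entries_for_zc_sends` are hypothetical helper names. The result is the kind of value one might pass as `cq_entries` in `struct io_uring_params` together with IORING_SETUP_CQSIZE; the rounding is done here only to make the final size explicit, since ring sizes are powers of two.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of liburing): smallest power of two >= v. */
static uint32_t round_up_pow2(uint32_t v)
{
    uint32_t p = 1;

    assert(v > 0 && v <= (1u << 31));
    while (p < v)
        p <<= 1;
    return p;
}

/*
 * Hypothetical helper: CQ entries needed so that max_inflight_sends
 * zero-copy sends cannot overflow the CQ. Each send may post two CQEs:
 * the send result and, later, the buffer-release notification.
 */
static uint32_t cq_entries_for_zc_sends(uint32_t max_inflight_sends)
{
    assert(max_inflight_sends > 0 && max_inflight_sends <= (1u << 30));
    return round_up_pow2(2 * max_inflight_sends);
}
```

For example, a send ring that keeps at most 100 zero-copy sends in flight would need 200 CQ entries, which this sketch rounds up to 256. If the same ring also carried multishot receives, their CQEs would have to be budgeted on top of this.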

pyhd commented 8 months ago

> In terms of overflows multishot receives are usually more of a hazard because sends are more predictable, i.e. 2 CQEs per send. And I have no clue how GSO is at play here, as it should be reducing the total number of CQEs.

I found the culprit:

After io_uring_submit_and_wait, io_uring_cq_ready was larger than the number of submitted SQEs (i.e. the earlier io_uring_sq_ready value). In addition, none of the CQEs had a negative cqe->res.

Edit: As I mentioned in the pseudocode, the sendmsg_zc requests were submitted to a dedicated ring.

pyhd commented 8 months ago

if (!(cqe->flags & IORING_CQE_F_NOTIF)) {
    if (cqe->flags & IORING_CQE_F_MORE)
        nr_cqes++;
}

Fixed. I just missed the man page about IORING_OP_SEND_ZC.

Anyway, I really appreciate your help.
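For readers who hit the same symptom, the accounting behind the fix can be sketched as a standalone model. This is not pyhd's actual code: `struct fake_cqe` and `outstanding_cqes` are hypothetical names, and the two flag constants are assumed to mirror the values of IORING_CQE_F_MORE and IORING_CQE_F_NOTIF in the kernel's io_uring.h so that the block builds without liburing. The idea is that every submitted zero-copy send owes one result CQE, and every result CQE carrying F_MORE owes one additional notification CQE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to match IORING_CQE_F_MORE / IORING_CQE_F_NOTIF in io_uring.h. */
#define CQE_F_MORE  (1U << 1)
#define CQE_F_NOTIF (1U << 3)

/* Hypothetical stand-in for the fields of struct io_uring_cqe used here. */
struct fake_cqe {
    int32_t  res;
    uint32_t flags;
};

/*
 * Hypothetical helper: given `submitted` zero-copy sends and the `n`
 * CQEs reaped so far, return how many CQEs are still to come.
 */
static size_t outstanding_cqes(size_t submitted,
                               const struct fake_cqe *cqes, size_t n)
{
    size_t expected = submitted;   /* one result CQE per send */
    size_t i;

    for (i = 0; i < n; i++) {
        /* A result CQE with F_MORE promises a later notification CQE. */
        if (!(cqes[i].flags & CQE_F_NOTIF) && (cqes[i].flags & CQE_F_MORE))
            expected++;
    }
    assert(n <= expected);
    return expected - n;
}
```

With two sends that both report F_MORE, the model expects four CQEs in total, so advancing the CQ by only the number of submitted SQEs would leave two CQEs behind on every iteration, which is consistent with the CQ gradually filling up and overflowing.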