axboe / liburing

Library providing helpers for the Linux kernel io_uring support
MIT License

Contention in io_buffer_select when IOSQE_BUFFER_SELECT and IOSQE_ASYNC are used together #669

Open jrudolph opened 2 years ago

jrudolph commented 2 years ago

I'm trying out io_uring and testing different ways of submitting requests. My test is a simple webserver-like application that accepts multiple connections and then alternately reads from and writes to each socket. Everything runs on a single application thread with a single ring.

In general, using IOSQE_ASYNC does not seem to make much sense for network reads because it often does strictly more work than the default path. On the other hand, a single-threaded server spends much of its CPU time inside the kernel TCP stack, so IOSQE_ASYNC could help by freeing the application thread for other work while the kernel threads do the heavy lifting.

Looking into the performance on Linux 5.19.11, I noticed that the flamegraph shows a lot of time spent selecting buffers from the provided buffers:

[image: flamegraph]

Zooming in on io_read:

[image: flamegraph zoomed in on io_read]

This is with at most 128 concurrent reads. In that scenario the number of concurrent wqe_workers seems to get quite high (maybe even one per request?), so a mutex in the buffer selection path cannot work well when many or all of the sockets become readable at the same time.

Is this contention expected, and should it be documented?

axboe commented 2 years ago

I would not recommend using provided buffers with IOSQE_ASYNC. As you have noticed, they need to serialize with the ring mutex. This is generally not a concern, but it certainly becomes one if you have a lot of io-wq activity due to marking the SQEs async. You'll be better off setting aside some threads in userspace, each with its own ring, and using provided buffers with those.

In general, IOSQE_ASYNC isn't very efficient and should be avoided for most use cases.

jrudolph commented 2 years ago

Thanks for the quick answer. I agree, there are good alternatives.