axboe / liburing

Library providing helpers for the Linux kernel io_uring support
MIT License

writev(), IORING_OP_SPLICE and TCP_CORK issues #1098

Closed markpapadakis closed 4 months ago

markpapadakis commented 4 months ago

Suppose that a sender:

  1. sets IPPROTO_TCP/TCP_CORK on a socket FD fd1
  2. writev()s 2 bytes to fd1
  3. uses io_uring_prep_splice() with SPLICE_F_MOVE | SPLICE_F_NONBLOCK to copy/move some (say, all but 1) of the bytes in a pipe to fd1 (bytes_out_1)
  4. unsets IPPROTO_TCP/TCP_CORK on fd1
  5. calls io_uring_submit()
  6. uses io_uring_prep_splice() with SPLICE_F_MOVE | SPLICE_F_NONBLOCK to copy/move all remaining bytes in the pipe (say, 1 byte) to fd1 (bytes_out_2)
  7. calls io_uring_submit()

The receiver then read()s:

  1. 2 + bytes_out_1 bytes from read() (read1)
  2. 2 + bytes_out_2 bytes from read() (read2)

Effectively, the 2 extra bytes received in read2 are those that were writev()-ed initially and were already received in read1. Somehow those are duplicated in the kernel (or it looks like they are).

So it only fails if the above scenario plays out.

markpapadakis commented 4 months ago

UPDATE: use of SPLICE_F_MOVE makes no difference; fails either way.

markpapadakis commented 4 months ago

UPDATE: turns out this is not io_uring specific; using writev()/splice() directly results in the same issue. Maybe it's a kernel issue, or maybe I am overlooking some specifics related to the use of corks and splice.

axboe commented 4 months ago

Closing this one, as it's not a liburing/io_uring issue. You may want to write a test case and report it to the linux kernel mailing list. Though I'm unsure of how maintained splice is these days...
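
A minimal sketch of what such a test case might look like, using plain writev()/splice() since the report says the behaviour reproduces without io_uring. Everything here beyond the syscalls themselves is an assumption made for illustration: the helper name cork_splice_repro, the loopback TCP setup, the "hd" header bytes, and the choice to drain the receiver once at the end and compare totals rather than inspect read1 and read2 individually. It has not been checked against the reporter's actual code.

```c
#define _GNU_SOURCE
#include <arpa/inet.h>
#include <assert.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper (not part of liburing or the original report).
 * Runs the cork / writev / splice / uncork / splice sequence once over a
 * loopback TCP connection. Stores the number of bytes the sender pushed in
 * *sent and returns the number of bytes the receiver read, or -1 if the
 * setup failed (file descriptors are not cleaned up on that path). */
long cork_splice_repro(const char *payload, size_t len, long *sent)
{
    int on = 1, off = 0, pfd[2];
    unsigned int flags = SPLICE_F_MOVE | SPLICE_F_NONBLOCK;
    struct sockaddr_in addr = { .sin_family = AF_INET };
    socklen_t alen = sizeof(addr);
    char hdr[2] = { 'h', 'd' };
    char buf[256];
    long total = 0;
    ssize_t n;

    /* Assumption: a small payload, so nothing blocks in this single thread. */
    assert(len >= 2 && len <= 4096);
    *sent = 0;

    /* Loopback listener on an ephemeral port, then connect and accept. */
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0 || bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(lfd, 1) < 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &alen) < 0)
        return -1;
    int fd1 = socket(AF_INET, SOCK_STREAM, 0);
    if (fd1 < 0 || connect(fd1, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        return -1;
    int rfd = accept(lfd, NULL, NULL);
    if (rfd < 0 || pipe(pfd) < 0 ||
        write(pfd[1], payload, len) != (ssize_t)len)
        return -1;

    /* 1. cork, 2. writev() 2 bytes */
    setsockopt(fd1, IPPROTO_TCP, TCP_CORK, &on, sizeof(on));
    struct iovec iov = { .iov_base = hdr, .iov_len = sizeof(hdr) };
    if ((n = writev(fd1, &iov, 1)) > 0)
        *sent += n;
    /* 3. splice all but the last byte from the pipe to the socket */
    if ((n = splice(pfd[0], NULL, fd1, NULL, len - 1, flags)) > 0)
        *sent += n;
    /* 4. uncork */
    setsockopt(fd1, IPPROTO_TCP, TCP_CORK, &off, sizeof(off));
    /* 6. splice the remaining byte */
    if ((n = splice(pfd[0], NULL, fd1, NULL, 1, flags)) > 0)
        *sent += n;
    close(fd1); /* lets the receiver see EOF */

    /* Receiver: drain until EOF and count what arrived. */
    while ((n = read(rfd, buf, sizeof(buf))) > 0)
        total += n;

    close(rfd);
    close(lfd);
    close(pfd[0]);
    close(pfd[1]);
    return total;
}
```

Calling cork_splice_repro("abcdef", 6, &sent) and comparing the return value with sent would show the problem as a received count larger than the sent count, if the duplication described above occurs on the kernel under test.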