axboe / liburing

Library providing helpers for the Linux kernel io_uring support
MIT License

Writes do not advance the file offset #1115

Closed lihaohong6 closed 6 months ago

lihaohong6 commented 6 months ago

I'm writing some data to disk and noticed that the file offset does not increase automatically after a write request completes. If I issue two write requests through liburing, the second one overwrites the first, since both write to the beginning of the file. The man page says that setting offset to -1 uses and advances the current file offset, but that does not seem to be the case. Using the POSIX APIs (write and writev) directly does not cause any issues.

A small example is below. The expected behavior is that 4096 bytes of numbers from 0 to 1023 are written to the file, followed by 4096 bytes of 0s. However, the file only contains 4096 bytes of 0s. If I remove the line write_buffer_to_file(2), it now contains 4096 bytes of numbers, so it seems that the second call is overwriting the content of the first one. The fact that lseek always returns 0 confirms that the file offset never changes.

The code snippet is compiled with g++ and run on RHEL 9.3 with kernel 5.14. Maybe the old kernel version (and Red Hat tinkering with the kernel) is the culprit. I skimmed through older issues and asked on Stack Overflow without success, so perhaps this is a nontrivial issue?

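#include <cstdint>   // intptr_t
#include <cstdlib>   // posix_memalign
#include <cstring>
#include <fcntl.h>   // open, O_CREAT, O_WRONLY, O_DIRECT
#include <iostream>
#include <liburing.h>
#include <unistd.h>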

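// a small macro to check for errors; liburing calls return -errno rather than
// setting errno, so the perror text is only accurate for plain system calls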
#define SYSCALL(expr) if ((expr) < 0) { \
    perror("System call error");        \
}

const int WRITE_SIZE = 4096; // satisfy alignment requirement of O_DIRECT
int fd; // file descriptor
int *buffer; // write buffer
struct io_uring ring;

// write the content of the buffer to fd; the data argument sets user_data in the sqe, which shouldn't affect the result
void write_buffer_to_file(int data) {
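    struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
    if (sqe == nullptr) {
        // the submission queue is full or the ring was not initialized
        std::cerr << "io_uring_get_sqe returned NULL" << std::endl;
        return;
    }
    io_uring_sqe_set_data(sqe, (void*)(intptr_t)data);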
    // according to the documentation, setting offset to -1 will advance the offset
    // neither 0 nor -1 work in my testing
    io_uring_prep_write(sqe, fd, buffer, WRITE_SIZE, -1);
    SYSCALL(io_uring_submit(&ring))
    std::cout << "Submitted " << sqe->user_data << std::endl;

    // now wait for it to complete
    struct io_uring_cqe *cqe;
    SYSCALL(io_uring_wait_cqe(&ring, &cqe));
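    if (cqe->res < 0) {
        // cqe->res holds -errno; errno itself is not set, so perror would mislead
        std::cerr << "cqe res less than 0: " << std::strerror(-cqe->res) << std::endl;
    }
    // read user_data before marking the cqe as seen; the entry may be reused afterwards
    std::cout << "Reaped " << io_uring_cqe_get_data(cqe) << std::endl;
    io_uring_cqe_seen(&ring, cqe);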
    // this line always prints 0 even though it's supposed to increase by 4096 each time
    std::cout << "Current offset: " << lseek(fd, 0, SEEK_CUR) << std::endl;
}

int main() {
    // set up the file and the write buffer
    fd = open("test_file", O_CREAT | O_WRONLY | O_DIRECT, 0744);
    SYSCALL(fd);
    // O_DIRECT has stricter memory alignment requirements
    if (posix_memalign((void**)&buffer, 512, WRITE_SIZE) != 0) {
        std::cerr << "posix_memalign failed" << std::endl;
        return 1;
    }
    for (int i = 0; i < WRITE_SIZE / (int)sizeof(int); i++) {
        buffer[i] = i;
    }

    io_uring_queue_init(5, &ring, 0);
    write_buffer_to_file(1);

    // set everything in the buffer to 0 and then write again
    memset(buffer, 0, WRITE_SIZE);
    write_buffer_to_file(2);

    io_uring_queue_exit(&ring);

    close(fd);
    return 0;
}
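For comparison, the plain POSIX behaviour mentioned above (each write advancing the file offset) can be sketched as follows. This is a minimal illustration, not part of the original report: the helper name `offset_after_two_writes` is hypothetical, and O_DIRECT is deliberately left out so the sketch runs on any filesystem.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>
#include <vector>

// Hypothetical helper (not from the original post): issues two write() calls
// of `chunk` bytes each and returns the file offset reported by lseek
// afterwards, or -1 if opening or writing fails. The file is removed again.
off_t offset_after_two_writes(const char *path, size_t chunk) {
    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd < 0) {
        return -1;
    }
    std::vector<char> buf(chunk, 'a');
    off_t result = -1;
    // write() uses the current file offset and advances it by the bytes written
    if (write(fd, buf.data(), chunk) == (ssize_t)chunk &&
        write(fd, buf.data(), chunk) == (ssize_t)chunk) {
        result = lseek(fd, 0, SEEK_CUR);
    }
    close(fd);
    unlink(path);
    return result;
}
```

With a 4096-byte chunk, the returned offset should be 8192 if both writes complete in full, which is the behaviour the liburing version above was expected to match.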

krisman commented 6 months ago

You should report it to the distro. That's quite an old kernel, and this is fixed upstream. Your code works fine on a newer kernel.
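Until a fixed kernel is available, one possible workaround is to track the offset in the application and pass it explicitly instead of -1. The sketch below only illustrates the bookkeeping, using plain pwrite, which takes an explicit offset in the same position as the final argument of io_uring_prep_write. It assumes explicit offsets are honoured on the affected kernel; the helper name `write_chunks_at_tracked_offsets` is hypothetical, and O_DIRECT is omitted for portability.

```cpp
#include <cassert>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>
#include <vector>

// Hypothetical helper: writes `count` chunks of `chunk` bytes back to back,
// using an offset tracked by the caller instead of the kernel's file offset.
// Returns the resulting file size, or -1 on any failure or short write.
off_t write_chunks_at_tracked_offsets(const char *path, size_t chunk, int count) {
    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd < 0) {
        return -1;
    }
    std::vector<char> buf(chunk, 0);
    off_t next_offset = 0;  // application-side replacement for the file offset
    bool ok = true;
    for (int i = 0; i < count && ok; i++) {
        // The explicit offset here corresponds to the last argument of
        // io_uring_prep_write(sqe, fd, buf, nbytes, offset).
        ssize_t n = pwrite(fd, buf.data(), chunk, next_offset);
        if (n != (ssize_t)chunk) {
            ok = false;  // this sketch treats short writes as failures
        } else {
            next_offset += n;
        }
    }
    struct stat st;
    off_t size = (ok && fstat(fd, &st) == 0) ? st.st_size : -1;
    close(fd);
    unlink(path);
    return size;
}
```

In an io_uring version, the offset would be advanced by cqe->res once each completion is reaped, so that the next submission starts where the previous write ended.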

lihaohong6 commented 6 months ago

Thanks! I guess switching to Ubuntu might be the easier solution then.

Also, in an effort to replicate this issue on a different machine, I tried it on Ubuntu 22.04.3 LTS with kernel version 6.2. Strangely, io_uring_get_sqe always returns NULL, even with a properly initialized ring. If I follow the build instructions for liburing, the programs in the examples directory compile with make and run properly. When I compile io_uring-test.c myself, however, the resulting binary is unable to read anything because, as with my other programs on that machine, io_uring_get_sqe always returns NULL. The command used for compilation is

gcc -o a -g -O2 -Wall -D_GNU_SOURCE io_uring-test.c -luring

I'm sorry this is off-topic for this issue, but I could not figure out what's going on. Surely there's an obvious problem I overlooked.

krisman commented 6 months ago

Any chance you have an older version of liburing also installed on this system? Try running with

LD_LIBRARY_PATH= ./a

lihaohong6 commented 6 months ago

While setting LD_LIBRARY_PATH= does not fix the problem, I installed liburing-dev via the package manager and the program now works: both io_uring_get_sqe and the file offset are functioning as expected on the Ubuntu machine. Thanks for your help!