chriskohlhoff / asio

Asio C++ Library
http://think-async.com/Asio

Combining stream_file::seek() and stream_file::read_some() behaves unexpectedly on Windows #1222

Open cbrl opened 1 year ago

cbrl commented 1 year ago

I'm having a problem with stream_file when trying to read some data, then skip a section of a file. The basic pattern looks like this:

while (...) {
    asio::read(file, asio::buffer(buf));
    file.seek(N, asio::file_base::seek_cur);
}

The problem is that the file position used by read() and the one used by seek() seem to be tracked separately. Each call to seek(N, seek_cur) behaves as if the file position had not changed since the previous call to seek(), regardless of how many read operations happen between the two calls.

For example, this bit of code

auto context = asio::io_context{};
auto file = asio::stream_file{context, "file.bin", asio::file_base::read_only};
auto buf = std::array<char, 4>{};

while (true) {
    std::cout << file.seek(0, asio::file_base::seek_cur) << std::endl;
    asio::read(file, asio::buffer(buf));
    file.seek(100, asio::file_base::seek_cur);
}

would produce this output

0
100
200
300
...

but I would expect it to be this

0
104
208
312
...

If this is unintended, then I suspect the cause might be that the file pointer is not updated by a call to ReadFile for a file opened in overlapped mode, so the call to SetFilePointerEx is still working with the value set from the previous invocation. If this is actually intentional, then I believe it's quite unexpected and could stand to be explicitly mentioned in the documentation.
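To make the difference between the two outputs concrete, here is a toy model of the loop above. It is not Asio or Win32 code and does not claim to mirror the backend's internals; `FilePositionModel` and `run_loop` are hypothetical names invented for this sketch. It only assumes the hypothesis stated above: either reads advance the position that seek(..., seek_cur) works from, or they do not.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model only: reproduces the arithmetic of the two outputs in this report.
struct FilePositionModel {
    std::uint64_t handle_pointer = 0;  // the position that seek_cur() starts from
    bool reads_advance_pointer;        // true: stream semantics, false: suspected overlapped behaviour

    explicit FilePositionModel(bool advance) : reads_advance_pointer(advance) {}

    // A read of n bytes moves the pointer only in the "stream semantics" model.
    void read(std::uint64_t n) {
        if (reads_advance_pointer) handle_pointer += n;
    }

    // A relative seek always moves the pointer and returns the new position.
    std::uint64_t seek_cur(std::uint64_t delta) {
        handle_pointer += delta;
        return handle_pointer;
    }
};

// Runs the loop from the report (print position, read 4 bytes, skip 100)
// and returns the values that would have been printed.
std::vector<std::uint64_t> run_loop(bool reads_advance_pointer, int iterations) {
    assert(iterations >= 0);
    FilePositionModel file(reads_advance_pointer);
    std::vector<std::uint64_t> printed;
    for (int i = 0; i < iterations; ++i) {
        printed.push_back(file.seek_cur(0));
        file.read(4);
        file.seek_cur(100);
    }
    return printed;
}
```

Under this model, run_loop(true, 4) gives the expected sequence 0, 104, 208, 312, and run_loop(false, 4) gives the observed sequence 0, 100, 200, 300.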

vinipsmaker commented 1 year ago

0
104
208
312
...

I just tried this on Linux (io_uring backend), and that is the output I get. I guess that means there is a bug in the Windows IOCP backend.

To be fair, the stream interface is somewhat problematic. In the BSD world, file support can be added, but only for random_access. I'd like to send patches to the main BSDs (FreeBSD, OpenBSD, NetBSD, DragonFly BSD) so that it becomes possible to implement stream for the BSDs as well, but I don't know when I'll have the time.

Unfortunately, random_access is the better bet for portability right now (which is a shame, because the stream interface is incredibly useful and worth the implementation challenge/effort, IMO).
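As a rough sketch of what the random-access style looks like for the read-then-skip pattern from the report: the caller keeps the offset itself and passes it to every read, so nothing depends on a file pointer shared with seek(). The names `read_records` and `MemoryFile` are hypothetical, and the in-memory stand-in is only there so the sketch runs without Asio.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Reads up to max_records records of record_size bytes, skipping gap bytes
// after each one. ReadAt is any callable with the shape
//   std::size_t(std::uint64_t offset, char* dst, std::size_t len)
// that returns the number of bytes read at the given absolute offset.
// Returns the offset of each record read; the record bytes are appended to out.
template <typename ReadAt>
std::vector<std::uint64_t> read_records(ReadAt read_at, std::size_t record_size,
                                        std::uint64_t gap, std::size_t max_records,
                                        std::vector<char>& out) {
    assert(record_size > 0);
    std::vector<std::uint64_t> offsets;
    std::vector<char> buf(record_size);
    std::uint64_t offset = 0;  // tracked by the caller, not by the file object
    for (std::size_t i = 0; i < max_records; ++i) {
        std::size_t n = read_at(offset, buf.data(), buf.size());
        if (n < record_size) break;  // short read: treat as end of data
        offsets.push_back(offset);
        out.insert(out.end(), buf.begin(), buf.end());
        offset += record_size + gap;  // advance past the record and the skipped section
    }
    return offsets;
}

// In-memory stand-in for a random-access file, used only for illustration.
struct MemoryFile {
    std::vector<char> bytes;

    std::size_t operator()(std::uint64_t offset, char* dst, std::size_t len) const {
        if (offset >= bytes.size()) return 0;
        std::size_t start = static_cast<std::size_t>(offset);
        std::size_t n = std::min(len, bytes.size() - start);
        std::memcpy(dst, bytes.data() + start, n);
        return n;
    }
};
```

With a 320-byte MemoryFile, 4-byte records and a 100-byte gap, the recorded offsets are 0, 104, 208, 312. With Asio, the callable would presumably wrap asio::read_at on an asio::random_access_file using the error_code overload; that mapping is an assumption and is not exercised by this sketch.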