tokio-rs / tokio

A runtime for writing reliable asynchronous applications with Rust. Provides I/O, networking, scheduling, timers, ...
https://tokio.rs
MIT License

Feature request: read_at/write_at for tokio::fs::File #1529

Open asomers opened 4 years ago

asomers commented 4 years ago

Please add asynchronous versions of read_at and write_at for tokio_fs::File. They wouldn't be implementable for streams, but they would for regular files. Without these or similar methods, it's not possible for multiple tasks to operate on the same file simultaneously. Plus, those are the basic primitives used by AIO, a possible fully asynchronous backend for tokio_fs.
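For reference, here is a minimal sketch of the blocking primitives this request builds on: the positional methods of `std::os::unix::fs::FileExt` (`read_at`, `write_at`, and the `read_exact_at` / `write_all_at` variants used below). Each call names its own offset and takes `&self`, so the file's cursor is never read or moved. The helper name `positional_roundtrip` is made up for illustration, and this is Unix-only std code, not a Tokio API.

```rust
use std::fs::OpenOptions;
use std::io;
use std::os::unix::fs::FileExt; // Unix-only: provides the positional *_at methods
use std::path::Path;

/// Hypothetical helper: writes two regions out of order, then reads one back.
/// Every call carries its own offset, so no seek is needed in between.
pub fn positional_roundtrip(path: &Path) -> io::Result<Vec<u8>> {
    let file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .truncate(true)
        .open(path)?;

    // Note `file` is not `mut`: positional I/O only needs `&File`.
    file.write_all_at(b"world", 6)?;
    file.write_all_at(b"hello ", 0)?;

    // Read five bytes starting at offset 6, independent of any cursor.
    let mut buf = vec![0u8; 5];
    file.read_exact_at(&mut buf, 6)?;
    Ok(buf)
}
```

Because no call depends on shared cursor state, several callers can work on different regions of the same open file without coordinating seeks.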

Darksonn commented 4 years ago

I would be ok with adding this, but note that it would simply schedule the operation on a thread-pool.

theli-ua commented 4 years ago

I think that would be ok as long as it's called out in the docs.

Darksonn commented 4 years ago

It's the same as all other file operations, so see #2700.

flakusha commented 3 years ago

Hello! It's my first attempt to contribute, so I will need some feedback if I'm trying to do the right thing.

There are already read_at and read_exact_at methods on FileExt in std::os::unix::fs in stable Rust, so the solution for Unix/Linux seems easy:

#[cfg(unix)]
pub async fn read_at(path: impl AsRef<Path>, buf: &mut [u8], offset: u64) -> io::Result<usize> {
    let path = path.as_ref().to_owned();
    asyncify(move || std::os::unix::fs::read_at(path, buf, offset)).await
}

#[cfg(unix)]
pub async fn read_exact_at(path: impl AsRef<Path>, buf: &mut [u8], offset: u64) -> io::Result<()> {
    let path = path.as_ref().to_owned();
    asyncify(move || std::os::unix::fs::read_exact_at(path, buf, offset)).await
}

Please note that these functions are used for their side effect (they fill the caller's buffer), which is a different scenario from the read.rs API.
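Two details would likely stop the functions above from compiling as written. First, `read_at` and `read_exact_at` are methods of the `FileExt` trait on an open `std::fs::File`, not free functions that take a path. Second, a borrowed `&mut [u8]` cannot be moved into a closure that runs on another thread, since such closures must be `'static`. Below is a minimal sketch of one way around both, assuming the caller passes an owned `Vec<u8>` and gets it back. The name `read_at_offloaded` is hypothetical, and `std::thread::spawn` stands in for the runtime's blocking pool so the example needs only std. A real async version would await the task instead of joining it.

```rust
use std::fs::File;
use std::io;
use std::os::unix::fs::FileExt; // Unix-only
use std::sync::Arc;
use std::thread;

/// Hypothetical offloaded version of `read_at`.
/// `std::thread::spawn` is a stand-in for a runtime's blocking pool.
/// The buffer is passed by value and handed back, because the closure
/// must own everything it touches (`'static`).
pub fn read_at_offloaded(
    file: Arc<File>,
    mut buf: Vec<u8>,
    offset: u64,
) -> io::Result<(usize, Vec<u8>)> {
    let handle = thread::spawn(move || {
        // Blocking positional read; does not move the file cursor.
        let n = file.read_at(&mut buf, offset)?;
        Ok::<_, io::Error>((n, buf))
    });
    // Joining blocks the caller; an async version would await here instead.
    handle
        .join()
        .map_err(|_| io::Error::new(io::ErrorKind::Other, "worker thread panicked"))?
}
```

Returning the buffer together with the byte count keeps the "fill a buffer" style of the original proposal while satisfying the ownership rules.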

Other operating systems need a different approach. The closest OS-agnostic equivalent is read_exact on std::fs::File (from the std::io::Read trait), but as far as I can see it offers no way to select an offset unless the file cursor is first moved to the desired position.
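On Windows, std has a rough counterpart in `std::os::windows::fs::FileExt::seek_read`, so a portable wrapper can be sketched with `cfg` attributes. The function name `read_at_portable` is hypothetical, and the two branches are not fully equivalent: `seek_read` moves the file cursor, while Unix `read_at` leaves it alone.

```rust
use std::fs::File;
use std::io;

/// Hypothetical portable wrapper over the per-OS positional read.
#[cfg(unix)]
pub fn read_at_portable(file: &File, buf: &mut [u8], offset: u64) -> io::Result<usize> {
    use std::os::unix::fs::FileExt;
    // pread-style call: the file cursor is left untouched.
    file.read_at(buf, offset)
}

#[cfg(windows)]
pub fn read_at_portable(file: &File, buf: &mut [u8], offset: u64) -> io::Result<usize> {
    use std::os::windows::fs::FileExt;
    // Unlike Unix `read_at`, `seek_read` also moves the file cursor.
    file.seek_read(buf, offset)
}
```

Both branches may return fewer bytes than requested, so a `read_exact_at` equivalent would need a retry loop on top of this.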

Waiting for your response to continue experimenting with this feature.

Best regards.

oronsh commented 2 years ago

Hey, any news on this issue please? :)

asomers commented 2 years ago

@oronsh the POSIX AIO pathway all works with Tokio 0.2. But I haven't worked out an interface that the maintainers find acceptable to use with Tokio 1.0. @Darksonn can we schedule a meeting to talk about this on Discord sometime? I am in UTC-6, and I'm usually busy most mornings until 16:00 UTC, but I'm usually available after that.

flakusha commented 2 years ago

Hello! I've been busy all these months and couldn't get one step of this feature to work locally on my machine... I've only got some spare time recently

oronsh commented 2 years ago

Thank you @flakusha @asomers . I guess the posix aio doesn't work with linux right? so it's only on mac? I looked into tokio-file crate.

asomers commented 2 years ago

> Thank you @flakusha @asomers . I guess the posix aio doesn't work with linux right? so it's only on mac? I looked into tokio-file crate.

It's only for FreeBSD ATM, because only on FreeBSD does POSIX AIO work with kqueue. Mac and NetBSD implement POSIX AIO, but they can only deliver completion notification via signals. So they could do it, but it probably wouldn't be as fast as on FreeBSD, and it would monopolize a signal in the tokio-file library, preventing applications from using it.

oronsh commented 2 years ago

I see, thank you. I guess my only chance of getting it done is to use tokio-uring, which implements it, or to do something like this:

let mut std_file = tokio_file.into_std().await;
std_file.write_at(..);
let tokio_file = File::from_std(std_file);

Do you think something like this would work? Thank you so much for your help!

asomers commented 2 years ago

At that point, there wouldn't be any reason to use tokio-file. The ultimate goal, @oronsh, should be for Tokio to include vectored methods that use std's blocking implementations by default but can switch to POSIX AIO, io-uring, or other primitives underneath. Here's an early attempt I made. It defines a new trait called FileExt that includes the _at methods, with multiple OS-specific implementations. https://github.com/tokio-rs/tokio/compare/master...asomers:aio3?expand=1

oronsh commented 2 years ago

This looks great, @asomers. Why didn't they accept it?

asomers commented 2 years ago

That branch is incomplete, @oronsh. Notice that it doesn't even mention read_at. A complete PR would have many more SLOC. My preferred strategy is to implement this stuff bit by bit, in stages:

1) Add hooks so tokio-file can work with Tokio 1.0. See https://github.com/tokio-rs/tokio/pull/3841 .
2) Merge mio-aio into tokio, and rebase tokio-file on top of that.
3) Add the generic thread-based _at methods to tokio.
4) Specialize the _at methods using POSIX AIO.
5) Add readv_at and writev_at methods.

But doing it piecemeal creates many temporary APIs, and Tokio's maintainers are very concerned about API stability. So they're reluctant to accept a piecemeal approach.

Darksonn commented 2 years ago

A version of this implemented using spawn_blocking, similar to the existing file methods, would be fine with me. Implementations using io_uring or AIO would be a second step.

SteveLauC commented 4 months ago

Hi! I am curious: is there anything blocking us from implementing these calls using spawn_blocking() as the first step?

> A version of this implemented using spawn_blocking, similar to the existing file methods, would be fine with me. Implementations using io_uring or AIO would be a second step.

Darksonn commented 4 months ago

No, that would be fine with me.

SteveLauC commented 4 months ago

> No, that would be fine with me.

Got it, I will give it a try.

jumpnbrownweasel commented 1 month ago

When read_at and write_at are added, please be sure to use a &self param rather than &mut self. Among other benefits, this would allow me to create a cache of open Files that are used by multiple tasks, which is something I need for my database implementation. Using &self will also match the signatures for these methods in the std library's FileExt trait.
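To illustrate the use case, here is a minimal sketch of such a cache built on std's blocking `FileExt`, assuming a Unix target. `FileCache` and `read_chunks_concurrently` are hypothetical names, and plain threads stand in for tasks. Because `read_exact_at` takes `&self`, each worker only needs a shared `Arc<File>`; the mutex guards the map of open files, not the I/O.

```rust
use std::collections::HashMap;
use std::fs::File;
use std::io;
use std::os::unix::fs::FileExt; // Unix-only
use std::path::{Path, PathBuf};
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical cache of open files. The lock guards only the map;
/// reads run on a shared `Arc<File>` without any lock held.
#[derive(Default)]
pub struct FileCache {
    open: Mutex<HashMap<PathBuf, Arc<File>>>,
}

impl FileCache {
    /// Returns the cached handle for `path`, opening the file on first use.
    pub fn get(&self, path: &Path) -> io::Result<Arc<File>> {
        let mut open = self.open.lock().unwrap();
        if let Some(file) = open.get(path) {
            return Ok(Arc::clone(file));
        }
        let file = Arc::new(File::open(path)?);
        open.insert(path.to_path_buf(), Arc::clone(&file));
        Ok(file)
    }
}

/// Reads `count` pieces of `chunk` bytes from one cached file, one thread per piece.
pub fn read_chunks_concurrently(
    cache: &FileCache,
    path: &Path,
    chunk: usize,
    count: usize,
) -> io::Result<Vec<Vec<u8>>> {
    let file = cache.get(path)?;
    let handles: Vec<_> = (0..count)
        .map(|i| {
            let file = Arc::clone(&file);
            thread::spawn(move || {
                let mut buf = vec![0u8; chunk];
                // `&self` receiver: no exclusive access to the file is needed.
                file.read_exact_at(&mut buf, (i * chunk) as u64)?;
                Ok::<_, io::Error>(buf)
            })
        })
        .collect();
    // Collect results in chunk order; the first I/O error aborts the collection.
    handles
        .into_iter()
        .map(|h| h.join().expect("reader thread panicked"))
        .collect()
}
```

With a `&mut self` signature, the same design would need a lock around every read, which would serialize the workers.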

Darksonn commented 1 month ago

The current blocker for read_at/write_at is finding a good solution to buffer management. Tokio's File currently has a single buffer, since you don't make more than one operation at a time, but with read_at/write_at that is no longer the case.