seanmonstar / futures-fs

Access File System operations off-thread, using Futures.
https://docs.rs/futures-fs
Apache License 2.0

Impl Asynchronous Read/Write Functions #1

Open ghost opened 7 years ago

ghost commented 7 years ago

This library just uses std::fs::File.

However, the underlying read function is synchronous on both Unix and Windows, not asynchronous. I understand that the library's current model is to have one thread wait for each small block of I/O (4096 bytes) to finish, but idling that way still costs some CPU resources.

Here are the implementations of the read function on Unix and Windows:

// Unix
pub fn read(&self, buf: &mut [u8]) -> io::Result<usize> {
    let ret = cvt(unsafe {
        libc::read(self.fd,
                   buf.as_mut_ptr() as *mut c_void,
                   cmp::min(buf.len(), max_len()))
    })?;
    Ok(ret as usize)
}

// Windows
pub fn read(&self, buf: &mut [u8]) -> io::Result<usize> {
    let mut read = 0;
    let len = cmp::min(buf.len(), <c::DWORD>::max_value() as usize) as c::DWORD;
    let res = cvt(unsafe {
        c::ReadFile(self.0, buf.as_mut_ptr() as c::LPVOID,
                    len, &mut read, ptr::null_mut())
    });

    match res {
        Ok(_) => Ok(read as usize),

        // The special treatment of BrokenPipe is to deal with Windows
        // pipe semantics, which yields this error when *reading* from
        // a pipe after the other end has closed; we interpret that as
        // EOF on the pipe.
        Err(ref e) if e.kind() == ErrorKind::BrokenPipe => Ok(0),

        Err(e) => Err(e)
    }
}

Happy to hear from you. :-)

seanmonstar commented 7 years ago

I'm not quite sure what you wanted to say with this issue. File IO pretty much only exists as synchronous operations, and so it's usually done in a thread pool, as this library does.

As for idle threads, in the CpuPool they are actually parked in the OS. They do absolutely nothing until a signal from another thread alerts the OS to wake it back up.
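To make the thread-pool model concrete, here is a minimal sketch using only the standard library. It is not how this crate or CpuPool is implemented; `Job` and `spawn_worker` are hypothetical names chosen for illustration. The worker sleeps inside `recv` while its queue is empty, and only wakes when another thread sends it a job.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::PathBuf;
use std::sync::mpsc;
use std::thread;

/// A job for the worker: the file to read and a channel for the result.
/// (Hypothetical type, for illustration only.)
type Job = (PathBuf, mpsc::Sender<io::Result<Vec<u8>>>);

/// Spawns a single worker thread and returns the sending half of its job queue.
fn spawn_worker() -> (mpsc::Sender<Job>, thread::JoinHandle<()>) {
    let (tx, rx) = mpsc::channel::<Job>();
    let handle = thread::spawn(move || {
        // `recv` blocks while the queue is empty, so an idle worker is asleep.
        // The loop ends once every sender has been dropped.
        while let Ok((path, reply)) = rx.recv() {
            // The blocking file read happens here, on the worker thread.
            let result = File::open(&path).and_then(|mut file| {
                let mut buf = Vec::new();
                file.read_to_end(&mut buf)?;
                Ok(buf)
            });
            // The requester may have gone away; ignore a failed send.
            let _ = reply.send(result);
        }
    });
    (tx, handle)
}
```

A caller sends a `(path, reply_sender)` pair on the job queue and later picks the result up from its own receiver. Dropping the job sender lets the worker exit.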

ghost commented 7 years ago

Maybe my description was unclear.

> File IO pretty much only exists as synchronous operations, and so it's usually done in a thread pool, as this library does.

Yes, file I/O is a synchronous operation in Rust. However, the operating system supports native asynchronous solutions. Since tokio-core (AKA futures-io) implements sockets with a non-blocking model (well, still not asynchronous), I think this project may need to implement at least non-blocking I/O. For example, it could use ReadFileEx with a callback function instead of calling ReadFile and waiting for the operation to finish.

> As for idle threads, in the CpuPool they are actually parked in the OS. They do absolutely nothing until a signal from another thread alerts the OS to wake it back up.

Yes, in most situations that won't be a problem, and your library works well. When I say "idle", I mean "the CPU is waiting for the I/O". When we call std::fs::File::read, the current thread is blocked and can't do anything else until the I/O completes. Under a heavy I/O load, performance will be much worse than with non-blocking or asynchronous I/O.
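As a rough illustration of where the blocking happens in the thread-based model, here is a standard-library-only sketch. It is not this crate's actual code; `read_chunks`, `collect_all`, and `CHUNK_SIZE` are hypothetical names. The caller gets a receiver back immediately, but the worker thread is still stuck inside each `read` call until the OS returns data, which is the cost being discussed here.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::PathBuf;
use std::sync::mpsc::{self, Receiver};
use std::thread;

/// Block size used for each read (matches the 4096 bytes mentioned above).
const CHUNK_SIZE: usize = 4096;

/// Reads `path` on a background thread and sends each chunk as it arrives.
/// The channel closes at end of file or after the first error.
fn read_chunks(path: PathBuf) -> Receiver<io::Result<Vec<u8>>> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let mut file = match File::open(&path) {
            Ok(file) => file,
            Err(e) => {
                let _ = tx.send(Err(e));
                return;
            }
        };
        let mut buf = [0u8; CHUNK_SIZE];
        loop {
            // This call blocks the worker thread until the read completes.
            match file.read(&mut buf) {
                Ok(0) => break, // end of file
                Ok(n) => {
                    // Stop early if the receiver has been dropped.
                    if tx.send(Ok(buf[..n].to_vec())).is_err() {
                        break;
                    }
                }
                Err(ref e) if e.kind() == io::ErrorKind::Interrupted => continue,
                Err(e) => {
                    let _ = tx.send(Err(e));
                    break;
                }
            }
        }
    });
    rx
}

/// Drains the receiver, concatenating chunks and stopping at the first error.
fn collect_all(rx: Receiver<io::Result<Vec<u8>>>) -> io::Result<Vec<u8>> {
    let mut out = Vec::new();
    for chunk in rx {
        out.extend_from_slice(&chunk?);
    }
    Ok(out)
}
```

The calling thread can use `try_recv` on the returned receiver to keep doing other work between chunks. A truly asynchronous implementation would avoid dedicating a blocked thread to each outstanding read.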

Because this library's goal is to provide a futures-like API, I think we could do better by providing a non-blocking or asynchronous file I/O implementation.

I am willing to contribute, too. :-D

seanmonstar commented 7 years ago

Oh OK, I see what you mean. You mean on Windows, make use of IOCP for asynchronous file operations.

I've actually never done any Windows-specific programming, so my experience is limited to reading docs. The docs for IOCP make it sound like it's basically an OS-managed thread pool. From reading, it sounds like the internal threads would still end up blocked on the file IO. Maybe an IOCP implementation does something more that I don't know of. Maybe since it's in the OS/kernel, it can do it faster than in userland. If it is noticeably faster, it could make sense to use it in this crate, but I can't say from personal experience.

abonander commented 6 years ago

IOCP seems to be more of a specialized message queue for I/O results that supports a large number of readers. Overlapped I/O can instead signal completion through a Windows event handle supplied in the OVERLAPPED struct passed to any supporting file operation, which better fits the notification/event-loop architecture of futures.