ponylang / rfcs

RFCs for changes to Pony
https://ponylang.io/

Add asynchronous File IO #128

Open mfelsche opened 6 years ago

mfelsche commented 6 years ago

This issue tries to spark a discussion around 1. the need for asynchronous file IO, 2. the possible implementations thereof, and 3. the look and feel of such an asynchronous file API for Pony. The new asynchronous file IO could be added alongside the existing blocking file IO APIs.

  1. Current File operations in Pony use standard POSIX file operations like write/writev, read, etc., all of which can block. This means that while such an operation is performed on a file, one scheduler thread is blocked for its duration. This can be a significant performance problem, which is why I am bringing this up.

  2. This is actually the tricky part. Afaik ASIO, which is used for all other networking, pipe, and stdstream IO, will not work on regular files. Windows has some kind of asynchronous file IO which I know nothing about; if anyone could shed some light on this, that would be great. POSIX offers the aio_* APIs, which basically offload file IO to a separate threadpool in userland. This API, I think, is a good candidate due to cross-platform compatibility (see the sketch below). Another option would be libuv, which is completely cross-platform and offers async name resolution as well. It does file IO in a conceptually similar manner to the aio API, in that it uses blocking file APIs but executes them on a separate threadpool. It seems a bit of an overkill for the problem at hand, and it possibly makes most sense to completely move all IO operations to libuv instead of adding it alongside ASIO.
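For concreteness, here is a minimal sketch of the POSIX aio_* interface mentioned above (my own illustration, not anything from the Pony runtime): an asynchronous read is submitted and then polled for completion. The file name is arbitrary, error handling is abbreviated, and on glibc this links against -lrt.

```c
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void) {
  int fd = open("data.bin", O_RDONLY);   /* arbitrary example file */
  if (fd < 0) return 1;

  char buf[4096];
  struct aiocb cb;
  memset(&cb, 0, sizeof(cb));
  cb.aio_fildes = fd;
  cb.aio_buf    = buf;
  cb.aio_nbytes = sizeof(buf);
  cb.aio_offset = 0;

  if (aio_read(&cb) != 0) return 1;       /* queued; glibc runs it on a userland threadpool */

  while (aio_error(&cb) == EINPROGRESS)   /* busy-poll only for brevity; real code would use
                                             sigevent notification or aio_suspend */
    usleep(1000);

  ssize_t n = aio_return(&cb);            /* bytes read, or -1 on error */
  printf("read %zd bytes\n", n);
  close(fd);
  return 0;
}
```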

mfelsche commented 6 years ago
  1. Would it make sense to roll our own threadpool for blocking IO operations and integrate it into the existing ASIO implementation? That would e.g. mean we register an ASIO event, read from a file on the threadpool, and when the data is there we send it via an ASIO event from the threadpool to the Pony schedulers (see the sketch below). The reason I am suggesting this is that we most likely have hard performance constraints that other libs might not satisfy. And it might be the quickest to do, given we get the threadpool right. (What could go wrong? ;-))
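To make the shape of that idea concrete, here is a rough, hypothetical C sketch (it does not use the Pony runtime's actual ASIO internals): a worker thread performs the blocking read, then signals an eventfd that the existing event loop could already be watching, so the completion is delivered like any other event. eventfd is Linux-specific; the file name and request struct are purely illustrative.

```c
#include <fcntl.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/eventfd.h>
#include <unistd.h>

typedef struct {
  int file_fd;     /* file to read (blocking) */
  int event_fd;    /* eventfd the event loop would be watching */
  char buf[4096];
  ssize_t nread;
} read_req_t;

static void* worker(void* arg) {
  read_req_t* req = arg;
  req->nread = read(req->file_fd, req->buf, sizeof(req->buf)); /* may block here */
  uint64_t one = 1;
  write(req->event_fd, &one, sizeof(one));  /* wake the event loop when data is ready */
  return NULL;
}

int main(void) {
  read_req_t req = { .file_fd = open("data.bin", O_RDONLY),
                     .event_fd = eventfd(0, 0) };
  pthread_t t;
  pthread_create(&t, NULL, worker, &req);

  /* Stand-in for the event loop: block until the worker signals completion. */
  uint64_t count;
  read(req.event_fd, &count, sizeof(count));
  printf("read %zd bytes off the worker thread\n", req.nread);

  pthread_join(t, NULL);
  return 0;
}
```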
jemc commented 6 years ago

We've talked about libuv in the past - the consensus at the time was that adapting libuv to our purposes would be more hassle than help. Maybe we can discuss it again though, if it would be helpful.

SeanTAllen commented 6 years ago

I do not think libuv is the right approach for us. I think something along the lines of Erlang's "dirty schedulers" would be the correct approach. I reserve the right to change my opinion later.

mfelsche commented 6 years ago

@SeanTAllen or @slfritchie could you elaborate on the concept of dirty schedulers? Would that basically mean we flag behaviours based on whether they do blocking IO and, depending on that, schedule them on a special scheduler pool? The advantage here would be that we could keep the blocking APIs synchronous, and thus simple (e.g. like the current files API). Would that actually be the case?

slfritchie commented 6 years ago

The Erlang BEAM VM scheduler differs from Pony's in a couple of significant ways: BEAM's is preemptive and BEAM's avoids using wall clock time (or any other traditional notion of time) when making preemption decisions.

Preemption can be triggered by: a) reduction count (roughly equivalent to function call count), b) a VM internal trap, or c) a blocked message receive (the mailbox is empty or the selective receive pattern match fails on all queued messages).

The addition of NIFs (native implemented functions), which are written in C but appear to the Erlang programmer to be Erlang, can cause a big problem with the reduction count method. Steve Vinoski was a primary author of the NIF scheme. In https://github.com/vinoski/bitwise/blob/master/vinoski-schedulers.pdf he notes a problem with a NIF that implements an XOR function:

That causes all kinds of havoc with the schedulers. It's more "hilarious"(*) when schedulers start going to sleep due to mis-counting of reductions and then never bother waking up, despite huge demand to schedule runnable processes. Note also that performing I/O isn't necessary: anything that blocks a return of control to the scheduler is fair game, including XOR calculations on GBytes of data or simply calling sleep(3).

Nowadays, a NIF can have metadata associated with it to mark it as "dirty". Execution of dirty NIFs is transferred over to a dedicated set of Pthreads, the dirty thread pool. There's a non-zero overhead for switching threads, naturally, but it's far better than upsetting the regular schedulers' way of doing things.
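For reference, this is roughly what marking a NIF as dirty looks like on the C side, using the erl_nif.h API (the module name and function body here are placeholders):

```c
#include <erl_nif.h>

/* Placeholder NIF body: imagine blocking file I/O happening here. */
static ERL_NIF_TERM slow_read(ErlNifEnv* env, int argc, const ERL_NIF_TERM argv[]) {
  (void)argc; (void)argv;
  return enif_make_atom(env, "ok");
}

static ErlNifFunc nif_funcs[] = {
  /* name, arity, fptr, flags: the flag routes the call to the dirty I/O thread pool */
  {"slow_read", 0, slow_read, ERL_NIF_DIRTY_JOB_IO_BOUND}
};

ERL_NIF_INIT(my_module, nif_funcs, NULL, NULL, NULL, NULL)
```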

With the Pony runtime's cooperative scheduling approach, I'm not aware of too many choices. One would be to always run an actor that might block its Pthread on a separate Pthread pool. Another is a message-passing approach: send a message to a dedicated thread or thread pool that executes the desired operation and then sends the result back. The latter is how Erlang's original file I/O subsystem operated, but I see no easy way to fit that scheme into Pony's runtime today without lots of other side-effects and consequences.

@mfelsche's idea is to use the separate pool only for behaviors that are "known" to do blocking stuff. I hadn't thought of that, silly me. It's a nifty idea and probably deserves a lot more pondering.

BEAM references for the curious:

(*) Where "hilarious" means "terrible things happen at weird times or the worst possible high-demand times".

SeanTAllen commented 6 years ago

Leaving aside the question of "how do we know something will block", I think what we would want is...

Svenskunganka commented 5 years ago

I'm sure some of you have heard about the new asynchronous I/O interface in Linux 5.1, io_uring, but I thought I'd leave a note about it here nonetheless.

Here's a document that goes into detail about the new interface: http://kernel.dk/io_uring.pdf
And here's a good LWN article about it: https://lwn.net/Articles/776703/

Under section 3.0 - New interface design goals in the document:

  • Extendable. While my background is mostly storage related, I wanted the interface to be usable for more than just block oriented IO. That meant networking and non-block storage interfaces that may be coming down the line. [...].

It sounds like in the future, the interface may support asynchronous network I/O as well.
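For a feel of the interface, here is a minimal sketch using the liburing helper library (assuming liburing is installed, a 5.1+ kernel, and linking with -luring; the file name and buffer size are arbitrary): one read is placed on the submission queue and its completion is awaited.

```c
#include <fcntl.h>
#include <liburing.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
  struct io_uring ring;
  if (io_uring_queue_init(8, &ring, 0) < 0) return 1;   /* ring with 8 entries */

  int fd = open("data.bin", O_RDONLY);                   /* arbitrary example file */
  if (fd < 0) return 1;
  char buf[4096];

  struct io_uring_sqe* sqe = io_uring_get_sqe(&ring);    /* grab a submission queue entry */
  io_uring_prep_read(sqe, fd, buf, sizeof(buf), 0);      /* read at offset 0 */
  io_uring_submit(&ring);

  struct io_uring_cqe* cqe;
  io_uring_wait_cqe(&ring, &cqe);                        /* block until the completion arrives */
  printf("read %d bytes\n", cqe->res);
  io_uring_cqe_seen(&ring, cqe);

  io_uring_queue_exit(&ring);
  close(fd);
  return 0;
}
```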

On Windows, IOCP exists for asynchronous I/O.