koka-lang / koka

Koka language compiler and interpreter
http://koka-lang.org
Other
3.25k stars 160 forks source link

What's the plan for thread model, async runtime and data containers? #98

Open SchrodingerZhu opened 3 years ago

SchrodingerZhu commented 3 years ago

I am interested in the implementation plan for thread model, async runtime and some data containers (like map).

Is the plan is to implement one in the backend and then port the interfaces to the language or just to write these in pure koka?

Some of my considerations:

daanx commented 3 years ago

Hi @SchrodingerZhu -- thanks for your interest :-)

Now with v2 coming to fruition, these are indeed high on the list. For async, we need:

  1. First we need a better way to integrate external C libraries into Koka. I am currently working on this for the std/text/regex library that I plan to use the PCRE library for. This should be done soon.

  2. With that in place, the plan is to integrate libuv (as I am very familiar with it) as the driver for std/async -- I plan to use the v1 std/async API basically unchanged but with a C primitives that use libuv.

  3. I think 1&2 should probably be done by me as it is quite tricky I guess. But once that is in place, it would be great to have help to cover the libuv API surface in Koka, starting with async timers, std/async/file, and a way to write http servers. :-)

For a map we need a bit more design, as specialized intmap / stringmap may perform better. When I wrote the Haskell standard Data.Map/Data.IntMap I used size balanced trees for regular maps but integer Patricia trees for intmaps. Both of these are very good for functional style datastructures that support persistence (ie. shared trees) -- this is where hash tables usually do not work well at all.

We should definitely implement these purely in Koka -- please read the recent Perceus paper, it has a really interesting section on FBIP, functional but in-place . Given the performance of the red-black tree balanced insertion example I am sure we can get very close (or better!) then trying to do this in C. For a BTree we may need to do some work to make vector update use in-place update when it is unique but otherwise this should work also really well in Koka.

Finally, for threads: the C support library is already supporting multiple threads of control (and Koka runs as if there could be multiple threads) but there are no primitives exposed for using threads. However, already, a ref<h,a>, a vector, array, and var all will use atomic instructions if needed, since each datatype in the Koka heap will have its thread_shared bit set if it is shared between threads. So, the plan is to not have atomics exposed in the language; moreover there would be no need for a memory model (except for sequential consistency between refs).

Personally, I envision std/async for anything concurrent that is I/O bound. For CPU parallelism I would like to see a safe API for running pure functions in parallel mapping over data structures. For that we only need light weight "futures". The plan is to still expose raw threads and mutexes, but really only use that to implement a small task/future scheduler in Koka, and use those to create nice abtractions like par(e1,e2), and parmap(xs,f) etc.

cimtrae commented 3 years ago

Hi Dan, On the last point - can the future scheduler be preemptive as well (i.e. yield not only on io effects but also on time slice) ? This will be similar to what Erlang does and is very powerful where one busy coroutine doesn't completely take over one core. Doesn't seem to be possible in may program languages including Rust.

exebook commented 3 years ago

have help to cover the libuv API surface in Koka, starting with async timers

If you still need cannon fodder for this I'd love help.