Max-Meldrum opened this issue 2 years ago
The other approach is to adopt tokio and follow the actix runtime model: spawn a runtime per thread and combine it with a `LocalSet` to support `!Send` futures.
Going with tokio as the default kernel runtime is a bit safer and makes it easier to contribute to Arcon and to play around with it. I believe it should be possible to support glommio later on behind a `glommio_rt` feature flag.
I wonder if we could combine kompact and tokio to get "the best of both worlds". Tokio integrates us with the async ecosystem which I think is crucial for streaming applications. Kompact on the other hand gives us networking (while tokio only gives us sockets).
If it's possible, I think a nice solution could be to implement some type of network channel using kompact which tokio tasks on different machines can use for communication.
@Max-Meldrum although distributed Arcon is not currently planned, what do you think?
> I wonder if we could combine kompact and tokio to get "the best of both worlds". Tokio integrates us with the async ecosystem which I think is crucial for streaming applications. Kompact on the other hand gives us networking (while tokio only gives us sockets).
> If it's possible, I think a nice solution could be to implement some type of network channel using kompact which tokio tasks on different machines can use for communication.
> @Max-Meldrum although distributed Arcon is not currently planned, what do you think?
Yeah, it's worth seeing if it's possible to combine them, or perhaps reuse parts of the networking implementation.
Moving forward we will need a stream kernel for the data processing layer that is specifically designed for Arcon. This issue serves as a direction (i.e., it is not prioritised in the short term).

Related issues:
https://github.com/cda-group/arcon/issues/277 https://github.com/cda-group/arcon/issues/246 https://github.com/cda-group/arcon/issues/214
Kernel
The Kernel represents an application-level OS that manages Task scheduling, memory management, and I/O. The idea is to cooperatively schedule a set of tasks on a single core in order to get better CPU utilisation, better locality between tasks, and fewer context switches. As noted here, storage and networking are no longer the bottlenecks in a modern data center, but CPUs are.
Running a thread-per-core model with cooperative scheduling is not unique in itself. Below are some data-parallel systems that use it with success.
Rough Overview Sketch
The Kernel has the following context that is shared between tasks that it executes:
Task
A cooperative, long-running async future that drives the execution of a dataflow node. A Task must periodically check whether it should yield control back to other tasks so that it does not block their progress.
Tasks may send their output in 3 different ways:
- `Rc<RefCell<Vec<T>>>`
Application-level Task Scheduling
API levels
Suggested by @segeljakt
- High-level API: builtin operators (map, filter, window, keyby)
- Mid-level API: operator constructors + event handlers
- Low-level API: tasks/async functions + channels
Async-friendly Runtime
Currently, it is hard to support `async` interfaces/crates. Two prime examples are source implementations and support for state backends that are async.

Glommio
Glommio is a Seastar-inspired thread-per-core (TPC) cooperative threading framework built in Rust. It relies on Linux and its `io_uring` functionality. One notable downside of adopting Glommio for Arcon is that it would make Arcon a Linux-only system. But then again, data-intensive systems such as Arcon are supposed to run on Linux anyway. Another downside is that Glommio assumes the machine has NVMe storage. Specific OS + hardware requirements will make it harder to run or to contribute to Arcon.
Pros:

- `io_uring` and Direct I/O support (future backend)

Cons:
tokio
An article about Glommio may be found here.
Other Async runtime candidates