Closed jon-chuang closed 2 years ago
Hi @jon-chuang, great to see this effort! I think the community will be interested in the Ray Rust API proposal, and pros and cons for building on top of Ray C++ API vs core worker API.
Implementing the Ray Java and C++ APIs has been a large undertaking, so please don't shy away from reaching out to the Ray team. The Ray team is active in the project slack channel to discuss Ray internals, if you have not joined already!
Glad to see this proposal! I'm a newbie to Rust, but based on my experience with the Java and C++ APIs, here are some thoughts that might be useful.
Btw, we started a thread here: https://ray-distributed.slack.com/archives/C01GJCCA2NT/p1637559584012400
Also, feel free to ping Mingwei Tian from the public slack channel! We'd love to help make the integration successful :)!
Hi everyone, thank you for the enthusiasm and support!
I apologize for not replying sooner, I've been busy figuring out more details of the problem.
There are indeed many things to discuss.
Building

I investigated several ways of building:

1. Building by manually including include dirs into a Cargo-driven build process via the `cc` crate and `http_archive`. Even if it could be made to work, a manual build process is fragile.
2. Using `rust-bindgen`/`autocxx`. bindgen failed on `msgpack` and `absl`, very likely because these libraries have their own build rules, which are opaque to bindgen. `autocxx` is like bindgen but with safe wrapper types (see `cxx`). Since `autocxx` relies internally on a fork of `bindgen`, I decided not to investigate further, since it would likely run into similar issues.
3. The option that "just works": `cxx` and Bazel. `cxx` allows one to construct a C++/Rust interop bridge, defining interfaces to classes and methods with safe wrapper types. However, it is not magic and the process is still somewhat manual, similar to JNI or pxd.
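To make the manual-vs-generated tradeoff concrete, here is a minimal, self-contained sketch of the two layers involved. It uses libc's `strlen` as a stand-in for a Ray symbol (the real `CoreWorker` symbols would require linking the Bazel-built library): the raw `extern` declaration is the kind of thing bindgen emits, and the safe wrapper is the kind of code `cxx`/`autocxx` aim to generate for you.

```rust
use std::ffi::CString;
use std::os::raw::c_char;

// The raw, unsafe binding layer: what hand-written FFI or bindgen output
// looks like. libc's `strlen` stands in for a Ray core worker symbol.
extern "C" {
    fn strlen(s: *const c_char) -> usize;
}

// The safe wrapper layer that `cxx`/`autocxx` aim to generate automatically:
// it owns the string conversion and confines `unsafe` to one place.
fn safe_strlen(s: &str) -> usize {
    let c = CString::new(s).expect("no interior NUL bytes");
    unsafe { strlen(c.as_ptr()) }
}

fn main() {
    println!("{}", safe_strlen("ray")); // prints 3
}
```

The point of `cxx` over raw bindgen output is exactly this second layer: callers never touch raw pointers, so the unsafety stays at the bridge.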
Packaging

Packaging will be handled by a `build.rs` file, which would error if the user has not installed Bazel on their system. The `build.rs` script will then have to link to Cargo the shared object files produced.

`CoreWorker`: we will indeed make bindings to `CoreWorker` and related types rather than to the C++ API directly.

The following is a proposed roadmap for a first phase of work. The order of items is flexible, but the one below is more or less expected.
- `CoreWorker`/`CoreWorkerProcess`, `WorkerContext`, `gcs::GlobalStateAccessor` bindings
- `CoreWorkerProcess` with config
- `CoreWorkerProcess`/`CoreWorker` functionality
- `process_helper.cc`
- `Language::RUST` in `common.proto` and `process_helper`. Edit `service.py`
- `CoreWorkerContext` bindings (`ObjectID`)
- Task invocation (`Invoker`) via proc macros (in a similar fashion to `accel`)

Notes:
To have a guide on what it takes to implement a Ray API, we study the C++ API.

As for the rest of `cpp/src/ray`:

- `RayStart` and `RayStop`
- Loading `.dll`s from a local path
- `WorkerContext`, `TaskSubmitter`, `ObjectStore`, `TaskExecutor`, `ray::gcs::GlobalStateAccessor`
- Wrapping `ray::core::WorkerContext`, `ray::core::CoreWorker`, `ray::gcs::GlobalStateAccessor` and class methods correctly

`CoreWorker`

The `CoreWorker` manages memory stores, cluster and service connections, and holds the `WorkerContext`. It sends execution requests to a task execution service, which is an instrumented version of Boost's async executor. It also handles gRPC calls made by other processes in the Ray runtime, such as task submission.
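As a rough analogue of that submission path (this is not Ray's actual implementation, just a sketch of the pattern), a dedicated thread draining a channel can stand in for the instrumented executor, with the caller submitting boxed closures as "execution requests":

```rust
use std::sync::mpsc;
use std::thread;

// An "execution request": a boxed task, standing in for a Ray task spec.
type Task = Box<dyn FnOnce() + Send>;

// Submit `n` tasks to the "execution service" and collect their results.
fn run_tasks(n: i32) -> Vec<i32> {
    // The "task execution service": a queue drained by a dedicated thread,
    // standing in for the instrumented Boost executor described above.
    let (tx, rx) = mpsc::channel::<Task>();
    let executor = thread::spawn(move || {
        for task in rx {
            task();
        }
    });

    // The "CoreWorker" side: submit execution requests to the service.
    let (done_tx, done_rx) = mpsc::channel();
    for i in 0..n {
        let done = done_tx.clone();
        tx.send(Box::new(move || {
            done.send(i * 10).unwrap();
        }))
        .unwrap();
    }
    drop(tx); // close the queue so the executor thread exits
    drop(done_tx);

    executor.join().unwrap();
    let mut results: Vec<i32> = done_rx.iter().collect();
    results.sort();
    results
}

fn main() {
    println!("{:?}", run_tasks(3)); // prints [0, 10, 20]
}
```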
The `WorkerContext` holds various metadata about the state of the given worker, including info about IDs, placement groups, etc. It also includes the `WorkerThreadContext`, which manages thread-specific state, such as tasks and placement groups.
The `CoreWorkerProcess` is essentially the runtime context for a `CoreWorker` that exposes global accessors for a given process. It can even spin up multiple workers.
`RayFunction` is an attempt at general function metadata, in anticipation of language interop on Ray.
Hi, I'm a bot from the Ray team :)
To help human contributors to focus on more relevant issues, I will automatically add the stale label to issues that have had no activity for more than 4 months.
If there is no further activity in the 14 days, the issue will be closed!
You can always ask for help on our discussion forum or Ray's public slack channel.
Hi again! The issue will be closed because there has been no more activity in the 14 days since the last message.
Please feel free to reopen or open a new issue if you'd still like it to be addressed.
Again, you can always ask for help on our discussion forum or Ray's public slack channel.
Thanks again for opening the issue!
Reopening because I still really want to see this. 😄
I would be interested in helping out here!
@jon-chuang is https://github.com/ray-project/ray/pull/21572 the most recent work that was done on this?
What's the state of this PR? Will Ray have Rust bindings or support in the future? I see a large PR https://github.com/ray-project/ray/pull/21572, but it was closed due to inactivity. I would love to see Rust bindings and Rust workers in the future.
Bump, Rust support would be awesome.
It was determined that multi-language support for Ray is not a top priority, as the Python interface is the most used, and ML applications, which largely have Python bindings, are the top priority.
Generally, many applications can be handled by using a Python worker as a wrapper over the application code (either via bindings or by acting as the parent process over the application).
I do agree that this can introduce overhead; a C worker seems like the most natural solution in this case, which the user can bind to their application.
So at the least, exposing a worker shared library with a C ABI for the user to bind to is a good step that avoids introducing a maintenance burden on the Ray team.
What are your use-cases @crazyboycjr @mklifo ?
@jon-chuang jjyao has replied to me under #21572 and I'm totally fine with that answer.
I came here to search around and ask related questions mainly because one day I felt it would be great to have a runtime that can conveniently dispatch my CPU-intensive jobs written in Rust across my cluster. Ray is apparently the most suitable backend for this demand in my mind. I'm also asked by others quite a few times whether there is any crate that can do this in the Rust ecosystem, and the answer seems to be a "no" for now.
I am open to any further discussion toward enabling Ray for the Rust ecosystem.
The work in https://github.com/ray-project/ray/pull/21572 can be continued if you are interested and I can provide some guidance. Here is a previous plan to split up the work:
The most important part is producing the .so artifact. Here is the implementation for exposing C-FFI to the core_worker: https://github.com/jon-chuang/ray/blob/ce9b206dca518e740852d33630e502eaeec93fca/src/ray/core_worker/lib/c/c_worker.cc
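Consuming such a C-ABI `.so` from Rust would then mean wrapping the raw exports in safe functions. In this self-contained sketch the exported symbol (`c_worker_initialize`, a name invented for illustration; the real exports are whatever `c_worker.cc` defines) is stubbed out in Rust itself so the example runs; against the real library you would declare it in an `extern "C"` block and link the Bazel-built artifact instead:

```rust
use std::ffi::{CStr, CString};
use std::os::raw::{c_char, c_int};

// Stub standing in for a symbol the real c_worker .so would export.
// With the real library, this would be an `extern "C"` declaration.
#[no_mangle]
pub extern "C" fn c_worker_initialize(node_ip: *const c_char) -> c_int {
    if node_ip.is_null() {
        return -1; // error: no address given
    }
    let ip = unsafe { CStr::from_ptr(node_ip) };
    // A real implementation would start the core worker here.
    if ip.to_str().is_ok() { 0 } else { -1 }
}

// The safe wrapper a `ray-rs` crate would expose, so `unsafe` and C error
// codes never leak into application code.
fn initialize(node_ip: &str) -> Result<(), i32> {
    let c = CString::new(node_ip).map_err(|_| -1)?;
    match c_worker_initialize(c.as_ptr()) {
        0 => Ok(()),
        e => Err(e as i32),
    }
}

fn main() {
    println!("{:?}", initialize("127.0.0.1")); // prints Ok(())
}
```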
I really appreciate your effort in this direction @jon-chuang! Unfortunately, I personally won't have spare bandwidth in the foreseeable future (a few weeks to maybe months). But I will take a deep look at it someday.
Question: do you mean that the 2350 loc + 1100 loc of tests + ~40K loc in #21572 will be reviewed by relevant Ray developers? That sounds somewhat awful to me... I'm a little worried whether upstream would be happy to accept such a huge PR.
- ~40K loc
This is only due to generated protobuf code. The only code to review is 2350 loc + 1100 loc test
I’m also like to see this happening. I share the same need, to parallelize my rust code using Ray. Not sure where to start helping.
Any update on this?
There's no active progress/plan to support this feature now
Thanks.
It’s awesome if we can use rust ray together with huggingface/candle
Too bad this isn't progressing.
This is a non-urgent but important feature that we hope the Ray team will consider.
Search before asking
Description
Problem Description
Introduction
Ray currently allows for a very attractive distributed programming model that can be embedded within existing programming languages, offering low latency and fine-grained control of tasks and shared objects.
The most prominent example of this embedding is Python, the dominant general purpose language for ML/data science. There are also embeddings for Java and C++ which are commonly used in enterprise systems.
A natural next step, and the subject of this proposal, is Rust, which is often touted as the successor to C++ in modern systems. Its popularity is due to its memory and thread safety, its user-friendly, zero-cost functional programming idioms, and its ergonomic packaging and build system, while retaining C/C++-like performance, avoiding GC unpredictability, and keeping a small memory footprint.
There are many new projects in the data/distributed compute industry building on Rust, including InfluxDB-IOx (time series DB), Materialize (streaming SQL view maintenance) built on Timely/Differential Dataflow, Datafusion/Ballista/Arrow-rs (SQL query engine), Databend.rs (realtime analytics), Fluvio (distributed streaming), which can all run distributed over a cluster, constellation-rs/the Hadean SDK (distributed compute platform), polar-rs (dataframe lib), delta-rs (Apache Delta table interface).
We expect the number of such systems to grow going forward, possibly including next-gen distributed simulation/real-time engines (e.g. games, self-driving, traffic, large-scale physics), distributed computing (graphs), databases and query engines, and other forms of distributed execution.
Exposing a Rust API would allow the growing Rust community to leverage Ray's programming model and possibly drive improvements in the underlying Ray system.
Considerations
The Rust community may not like the thought of using a C++ library (which is memory- and thread-unsafe) under the hood, as opposed to a pure Rust library. But as these things go, the benefits may outweigh the reservations.
Alternative libraries for distributed computation also exist in Rust, such as `timely-dataflow` and `constellation-rs`. The former is dataflow-based with automatic pipelined communication, focusing on data-parallel workloads; the latter is process-based with explicit communication and (I believe) no built-in fault tolerance, with a spinoff library `amadeus` doing map-reduce-style data-parallel stream computation.

However, as is the case for many of Ray's workloads, this style of distributed computation may not be suited to the types of tasks being run, which may demand more fine-grained control, while programming with explicit communication can have high cognitive overhead.
Requirements of a Worker

A worker must be able to interface with the `core_worker`.

Objective
The Current Structure of the C++ API
The C++ API exposes a minimal runtime interface (native and local mode):
Here is the main runtime API.
Local mode runs on a single node, in a single process, and without RPC, mainly for testing. We will begin with developing the local mode API for approach validation and fast iteration.
It also exposes the include files which go beyond the basic Ray API, including:
Finally, the C++ API has the following utils:
Approach

The approach is to use either the `autocxx` crate or the `cxx` crate directly to generate a set of workable bindings: either directly to the C++ API if this is feasible, to the `core_worker` directly, or a hybrid of both if deemed necessary, whichever is the happier path. Tests will be created on the Rust side to exercise all of the functionality, including more expensive cluster-mode integration tests.

Using these tests or otherwise, we will try to find and fix last-mile issues, such as functionality that may not play well across language boundaries (e.g. reference counting).
We will use Rust's procedural macros to instantiate tasks and actors which can provide a similarly pleasant API to Python's decorators. We may, in addition, provide idiomatic instantiations adding options as mutating methods to tasks, as those seen in the C++/Java APIs.
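To illustrate the kind of surface such proc macros could provide, here is a plain-Rust mock of the programming model. The names `remote` and `ObjectRef` are hypothetical (not an existing Ray Rust API), and a thread per task stands in for Ray's real scheduler and object store:

```rust
use std::thread::{self, JoinHandle};

// Hypothetical stand-in for Ray's ObjectRef: a handle to a pending result.
// A real binding would wrap an object-store ID, not a thread join handle.
struct ObjectRef<T>(JoinHandle<T>);

impl<T> ObjectRef<T> {
    // Analogue of ray.get(): block until the task result is available.
    fn get(self) -> T {
        self.0.join().expect("task panicked")
    }
}

// Analogue of what a `#[ray::remote]`-style proc macro could generate:
// submit the closure for execution and return a handle immediately.
fn remote<T: Send + 'static>(f: impl FnOnce() -> T + Send + 'static) -> ObjectRef<T> {
    ObjectRef(thread::spawn(f))
}

fn square(x: i64) -> i64 {
    x * x
}

fn main() {
    // Submit tasks "remotely", then gather the results, Ray-style.
    let refs: Vec<_> = (0..4).map(|i| remote(move || square(i))).collect();
    let results: Vec<i64> = refs.into_iter().map(|r| r.get()).collect();
    println!("{:?}", results); // prints [0, 1, 4, 9]
}
```

A proc macro would mainly remove the boilerplate of writing the `remote(...)` wrapper per function and attach task options, much like Python's decorator does.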
Roadmap
Test Cases
As a test case, I'd like to try implementing one option for distributed job scheduling for the Ballista distributed SQL engine (which differs from Spark SQL in having a native runtime with a smaller memory footprint). The current state of the job scheduling there is rather primitive. Possibly, Ray could help with query execution that exploits data locality rather than building such scheduling logic from scratch.
As a second test case, I would like to try to implement timely dataflow on top of ray. Perhaps this could allow for streaming SQL queries on top of Ballista/DataFusion. Although I worry about memory usage.
Future Directions
- Cross-language
- Async actors
- Multi-threaded actors/tasks
- User-specified compression scheme
- `RaySerialization`
- Direct buffer writing
Use case
No response
Related issues
No response
Are you willing to submit a PR?