Point72 / raydar

A perspective powered, user editable ray dashboard via ray serve.
https://github.com/Point72/raydar/wiki
Apache License 2.0

raydar


Ray offers powerful metrics visualizations powered by Grafana and Prometheus. Although useful, the setup can take time, and customizations can be challenging.

Raydar enables out-of-the-box live cluster metrics and user visualizations for Ray workflows with a simple pip install. It helps unlock distributed machine learning visualizations on Anyscale clusters, runs live and at scale, is easily customizable, and enables all the in-browser aggregations that Perspective has to offer.


Features

More information is available in our wiki

Installation

raydar can be installed via pip or conda, the two primary package managers for the Python ecosystem.

To install raydar via pip, run this command in your terminal:

pip install raydar

To install raydar via conda, run this command in your terminal:

conda install raydar -c conda-forge

Launching The UI, Tracking Tasks, Creating/Updating Custom Tables

The raydar module provides an actor which can process collections of ray object references on your behalf, and can serve a perspective dashboard in which to visualize that data.

from raydar import RayTaskTracker
task_tracker = RayTaskTracker(enable_perspective_dashboard=True)

Passing collections of object references to this actor's process method causes those references to be tracked in an internal polars dataframe, as they finish running.

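import ray

@ray.remote
def example_remote_function():
    import time
    import random
    # Simulate one second of work, then fail about 10% of the time.
    time.sleep(1)
    if random.randint(1, 100) > 90:
        raise Exception("This task should sometimes fail!")
    return True

# Submit 100 tasks and hand their object references to the tracker.
refs = [example_remote_function.remote() for _ in range(100)]
task_tracker.process(refs)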
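
As a rough guide to what the dashboard should show for this example: `random.randint(1, 100)` is inclusive on both ends, so the condition `> 90` holds for 10 of the 100 possible values. The sketch below checks that arithmetic locally, without Ray. The helper name `failure_probability` is made up for illustration and is not part of raydar.

```python
def failure_probability(threshold: int = 90, low: int = 1, high: int = 100) -> float:
    """Exact probability that a uniform integer in [low, high] exceeds threshold."""
    outcomes = range(low, high + 1)  # randint is inclusive on both ends
    failing = sum(1 for value in outcomes if value > threshold)
    return failing / len(outcomes)


p_fail = failure_probability()
expected_failures = p_fail * 100  # 100 tasks are submitted in the example above
print(f"P(fail) = {p_fail:.2f}, expected failures out of 100: {expected_failures:.0f}")
```

Any single run will vary around that expectation, so a failure count somewhat above or below 10 is not a sign that tracking is broken.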

The perspective UI is served on port 8000 by default.
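
If the page does not load, it can help to confirm that something is listening on that port at all. The following is a minimal sketch using only the Python standard library. The host `127.0.0.1` is an assumption (on a remote cluster, substitute the address of the node serving the dashboard), and `is_port_open` is a hypothetical helper, not a raydar function.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumption: the dashboard runs on this machine and uses the default port.
    print(is_port_open("127.0.0.1", 8000))
```

A successful connection only shows that a server is bound to the port; it does not verify that the server is the raydar dashboard.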


Passing name and namespace arguments allows the RayTaskTracker to skip construction when an actor with that name already exists. This also means the same ray actor handle can be obtained from arbitrary ray code, provided the same name and namespace are given.

from raydar import RayTaskTracker

task_tracker = RayTaskTracker(
    enable_perspective_dashboard=True,
    name="my_actor_name",
    namespace="my_actor_namespace",
)

task_tracker.create_table(
    table_name="demo_table",
    table_schema=dict(
        worker_id="str",
        metric_value="int",
        other_metric_value="float",
        timestamp="datetime",
    )
)

Now, from an arbitrary remote function:

import ray
from raydar import RayTaskTracker

@ray.remote
def add_data_to_demo_table(i):
    # Reuse the existing actor by passing the same name and namespace.
    task_tracker = RayTaskTracker(name="my_actor_name", namespace="my_actor_namespace")

    import datetime
    import random
    # One row whose keys match the demo_table schema.
    data = dict(
        worker_id="worker_1",
        metric_value=i,
        other_metric_value=i * random.uniform(1.5, 1.8),
        timestamp=datetime.datetime.now(),
    )
    task_tracker.update_table("demo_table", [data])
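
Rows passed to update_table are plain dicts whose keys mirror the schema given to create_table. One way to catch mismatches before they reach the actor is to check each row locally. The sketch below is an illustration only: `PYTHON_TYPES`, `DEMO_SCHEMA`, and `validate_row` are hypothetical names, not part of raydar. The mapping from type-name strings to Python types is an assumption covering just the four names used above, and raydar's own handling of mismatched rows may differ.

```python
import datetime

# Assumed mapping from the schema's type-name strings to Python types.
PYTHON_TYPES = {
    "str": str,
    "int": int,
    "float": float,
    "datetime": datetime.datetime,
}

# Same schema as the demo_table created above.
DEMO_SCHEMA = dict(
    worker_id="str",
    metric_value="int",
    other_metric_value="float",
    timestamp="datetime",
)


def validate_row(row: dict, schema: dict) -> list:
    """Return a sorted list of problems; an empty list means the row fits the schema."""
    problems = []
    for column in schema.keys() - row.keys():
        problems.append(f"missing column: {column}")
    for column in row.keys() - schema.keys():
        problems.append(f"unexpected column: {column}")
    for column in schema.keys() & row.keys():
        expected = PYTHON_TYPES[schema[column]]
        value = row[column]
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(
                f"{column}: expected {schema[column]}, got {type(value).__name__}"
            )
    return sorted(problems)


good = dict(
    worker_id="worker_1",
    metric_value=3,
    other_metric_value=4.8,
    timestamp=datetime.datetime.now(),
)
bad = dict(worker_id="worker_1", metric_value="3")
print(validate_row(good, DEMO_SCHEMA))
print(validate_row(bad, DEMO_SCHEMA))
```

The check is deliberately strict: an int supplied for a "float" column is reported as a problem. Loosen that rule if implicit numeric conversion is acceptable for your tables.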


FAQ

Data is currently stored in memory on the ray head node. There are plans to support alternatives to this configuration.

The Save Layout button saves a JSON file containing layout information. Dragging and dropping this file into the UI browser window restores that layout.


License

This software is licensed under the Apache 2.0 license. See the LICENSE file for details.