ray-project / ray

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0
33.75k stars 5.74k forks

[BUG] Ray Dashboard: GPU stats per actor is empty #48312

Open aviadshimoni opened 6 days ago

aviadshimoni commented 6 days ago

What happened + What you expected to happen

Ray deployments don’t have GPU/GRAM tracking in the “Actors” section of the Ray dashboard, even though the stats are shown in the Cluster tab (images attached).

image (24): https://ray.slack.com/files/U05D35JHGUV/F07TJAFKMTR/image.png?origin_team=TN4768NRM&origin_channel=CMVUQ1KMX

Is this intended? Or is there an issue with aggregating GPU stats per actor?

Versions / Dependencies

KubeRay: 1.1.1
Ray Version: 2.34.0
CRD: v1

Docker image: rayproject/ray:2.34.0-py310-gpu

Reproduction script

Deploy any Ray Service, open its dashboard, and go to the 'Actors' tab in the top navigation bar.
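For anyone who needs a concrete deployment to look at, here is a minimal sketch of a Ray Serve deployment that reserves one GPU per replica. It is not the setup from this report, which used a RayService deployed through KubeRay. It assumes Ray with Serve is installed (for example the rayproject/ray:2.34.0-py310-gpu image) and that the cluster has at least one GPU. The names `gpu_actor_options`, `Echo`, and `main` are made up for illustration.

```python
# Hypothetical reproduction sketch, not the reporter's actual deployment.
# Assumes Ray Serve is installed and the cluster has at least one GPU.


def gpu_actor_options(num_gpus: float = 1.0) -> dict:
    """Build the resource request applied to each replica actor."""
    return {"num_gpus": num_gpus}


def main() -> None:
    # Imported lazily so the helper above can be used without Ray installed.
    from ray import serve

    @serve.deployment(ray_actor_options=gpu_actor_options())
    class Echo:  # hypothetical deployment name
        def __call__(self, request) -> str:
            return "ok"

    # Starts Serve (if needed) and deploys the application on the cluster.
    serve.run(Echo.bind())
    input("Deployment is up; check the dashboard 'Actors' tab, then press Enter to exit.")


# main()  # uncomment to run on a node that has Ray installed and can reach the cluster
```

The replica only reserves a GPU here and does not allocate any GPU memory, so a real model would be needed before non-zero GRAM usage could be expected anywhere in the dashboard.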

Issue Severity

Medium: It is a significant difficulty but I can work around it.

aviadshimoni commented 6 days ago

Ray Slack: https://ray.slack.com/archives/CNCKBBRJL/p1730113922799889