Open KeyOfSpectator opened 8 months ago
@KeyOfSpectator we actually offer a managed Ray Dashboard as part of the Anyscale Platform. See https://www.anyscale.com/platform for more details.
We also offer managed options specific to LLM serving: Anyscale Endpoints and Anyscale Private Endpoints.
This is not something we can commit to for now because of our large backlog. Contributions are welcome.
In the meantime, as @anyscalesam mentioned, you can try the managed Ray product so that you don't need to manage Grafana/Prometheus yourself.
OK, thanks. I will try managed Ray first, though our infrastructure may already be settled. If a contribution is needed, I may be able to submit a patch.
Description
When we deploy Ray / KubeRay on a large cluster with a large volume of data, we need better performance and higher availability from Prometheus + Grafana than a self-hosted setup provides. We would like to use managed services instead,
such as Alibaba Cloud managed Prometheus and managed Grafana: https://www.alibabacloud.com/product/prometheus
or AWS managed Prometheus and managed Grafana: https://aws.amazon.com/cn/prometheus/
Looking at the implementation in the Ray project, this behavior is hard-coded: the dashboard first checks whether Prometheus/Grafana is healthy, and only then shows our Grafana host and iframe address in the Ray dashboard. https://github.com/ray-project/ray/blob/master/dashboard/modules/metrics/metrics_head.py#L119C39-L119C39
Would it be possible to add a config option to enable or disable the health check of Grafana/Prometheus?
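To make the request concrete, here is a rough sketch of the kind of switch I have in mind. It is not Ray's actual code: the flag name `RAY_DASHBOARD_SKIP_GRAFANA_HEALTH_CHECK` and the helper names are hypothetical, and I am assuming the check is a plain HTTP GET against Grafana's `/api/health` endpoint.

```python
import os
import urllib.error
import urllib.request

# Hypothetical flag name for illustration only; not an existing Ray setting.
SKIP_FLAG = "RAY_DASHBOARD_SKIP_GRAFANA_HEALTH_CHECK"


def health_check_enabled(env=None):
    """Return False when the (hypothetical) skip flag is set to a truthy value."""
    env = os.environ if env is None else env
    return env.get(SKIP_FLAG, "0").strip().lower() not in ("1", "true", "yes")


def _http_probe(url, timeout):
    """Return True if the URL answers with HTTP 200, False on any failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        return False


def grafana_is_usable(grafana_host, env=None, probe=None, timeout=5.0):
    """Decide whether the dashboard should show the Grafana iframe.

    With the skip flag set, the host is trusted without probing, which is
    useful when a managed Grafana is not reachable from the head node.
    """
    if not health_check_enabled(env):
        return True
    probe = probe or _http_probe
    return probe(f"{grafana_host.rstrip('/')}/api/health", timeout)
```

With the flag unset, behavior stays the same as today (probe first, then show the iframe); with it set, the configured Grafana host is shown without probing.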
Use case
We run Ray / KubeRay on a large cluster with a large volume of data, so Prometheus + Grafana need to be fast and highly available. Managed services fit this need,
for example Alibaba Cloud managed Prometheus and managed Grafana: https://www.alibabacloud.com/product/prometheus
and AWS managed Prometheus and managed Grafana: https://aws.amazon.com/cn/prometheus/