NVIDIA / aistore

AIStore: scalable storage for AI applications
https://aistore.nvidia.com
MIT License

Question on AIStore cluster metrics #109

Closed: KavyaPuranik closed this issue 2 years ago

KavyaPuranik commented 2 years ago

Hi,

How do I collect metrics from the ais-node application when it is deployed with ais-operator? Is it similar to the Helm chart approach? If not, could we have a doc on that, please?

alex-aizman commented 2 years ago

These were written at different times (the "prometheus" one more recently), so there's a bit of overlap.

The same is also visible via https://aiatscale.org/docs (see "Observability").
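For readers who want to inspect node metrics by hand, here is a minimal sketch, not AIStore's own tooling. It assumes the node serves metrics in the Prometheus text exposition format over HTTP. The URL in `fetch_metrics` and the metric names in the sample are placeholders, so check the "Observability" docs for the real endpoint and variable names.

```python
from typing import Dict
from urllib.request import urlopen


def parse_prometheus_text(text: str) -> Dict[str, float]:
    """Parse Prometheus text exposition lines into {"name{labels}": value}."""
    samples: Dict[str, float] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            # Labels may contain spaces, so split after the closing brace.
            idx = line.rfind("}")
            key, rest = line[: idx + 1], line[idx + 1 :].split()
        else:
            parts = line.split()
            key, rest = parts[0], parts[1:]
        # rest[0] is the value; an optional timestamp may follow and is ignored.
        samples[key] = float(rest[0])
    return samples


def fetch_metrics(url: str = "http://localhost:8080/metrics") -> Dict[str, float]:
    """Fetch and parse metrics. The default URL is an assumption, not a documented address."""
    with urlopen(url, timeout=5) as resp:
        return parse_prometheus_text(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical payload; real metric names will differ.
    sample = (
        "# HELP example_get_count Total GET requests\n"
        "# TYPE example_get_count counter\n"
        'example_get_count{node_id="t1"} 42\n'
        "example_put_bytes 1024 1700000000000\n"
    )
    for name, value in parse_prometheus_text(sample).items():
        print(name, value)
```

In a Kubernetes deployment one would normally let Prometheus scrape the pods rather than poll them manually. The parser above is only meant for a quick check that a node is exporting what you expect.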

In the sources, most of the variables are statically named and listed in one of these:

KavyaPuranik commented 2 years ago

Thank you!