kubeflow / common

Common APIs and libraries shared by other Kubeflow operator repositories.
Apache License 2.0

Add proposal for Prometheus metrics coverage #77

Closed terrytangyuan closed 4 years ago

terrytangyuan commented 4 years ago

This provides a detailed outline of the Prometheus metrics we plan to cover in the common operator. Related issue: https://github.com/kubeflow/common/issues/22.

Signed-off-by: terrytangyuan <terrytangyuan@gmail.com>

terrytangyuan commented 4 years ago

/cc @ywskycn @Jeffwan @gaocegege @richardsliu @johnugeorge @merlintang @jian-he @carmark

terrytangyuan commented 4 years ago

Thanks everyone for the comments! I've converted the lists to tables which include the metric name, type, and description. I also added a few additional metrics as suggested. Hopefully it's much clearer now. Please take another look.

Jeffwan commented 4 years ago

Besides the minor comments above, it looks good to me. Let's wait to see if anyone else has feedback.

terrytangyuan commented 4 years ago

/assign @gaocegege @johnugeorge

terrytangyuan commented 4 years ago

@yeya24 Thanks! Great suggestions. I have updated the metric types in the doc.

PTAL @gaocegege @johnugeorge @Jeffwan

Jeffwan commented 4 years ago

/lgtm

terrytangyuan commented 4 years ago

/approve

k8s-ci-robot commented 4 years ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: terrytangyuan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
- ~~[OWNERS](https://github.com/kubeflow/common/blob/master/OWNERS)~~ [terrytangyuan]

Approvers can indicate their approval by writing `/approve` in a comment
Approvers can cancel approval by writing `/approve cancel` in a comment