canonical / kubeflow-tensorboards-operator

Tensorboards Operator
Apache License 2.0

Add relation and expose metrics from tensorboard-controller #122

Closed. rgildein closed this issue 1 week ago

rgildein commented 1 month ago

Context

The tensorboard-controller deployment already provides a metrics endpoint; however, the charm does not use it and does not provide the metrics-endpoint and grafana-dashboard endpoints.

We need to add relations using the prometheus_scrape and grafana_dashboard interfaces.

Note: Only maintained alert rules and Grafana dashboards should be selected; others should be avoided.

What needs to get done

  1. Expose port for metrics as ServicePort
  2. Implement metrics-endpoint endpoint
  3. Implement grafana-dashboard endpoint
  4. [Optional] Find and add alert rules for metrics from upstream
  5. [Optional] Find and add Grafana dashboards for metrics from upstream
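
The first three steps can be sketched as plain data, without depending on the charm itself. This is only an illustration under stated assumptions, not the change that was merged: the port `8080` and the path `/metrics` are placeholders that should be checked against the upstream tensorboard-controller, and the helper names (`metrics_service_port`, `scrape_jobs`, `relation_endpoints`) are hypothetical. In a real charm the job list would typically be passed as the `jobs` argument of `MetricsEndpointProvider` from the `prometheus_scrape` charm library, and the dashboards would be served through `GrafanaDashboardProvider` from the `grafana_dashboard` charm library.

```python
"""Sketch of the data a charm would need for the COS relations."""

# Assumption: the controller serves Prometheus metrics on this port and path.
# Verify both values against the upstream tensorboard-controller manifests.
METRICS_PORT = 8080
METRICS_PATH = "/metrics"


def metrics_service_port(port: int = METRICS_PORT) -> dict:
    """Return a Kubernetes ServicePort entry (as a plain dict) for the metrics port."""
    return {"name": "metrics", "port": port, "targetPort": port, "protocol": "TCP"}


def scrape_jobs(port: int = METRICS_PORT, path: str = METRICS_PATH) -> list:
    """Return scrape jobs in the shape accepted by a `jobs` argument.

    The "*" host is a wildcard that the prometheus_scrape library expands
    to the address of each unit.
    """
    return [{"metrics_path": path, "static_configs": [{"targets": [f"*:{port}"]}]}]


def relation_endpoints() -> dict:
    """Return the `provides` entries that metadata.yaml would need."""
    return {
        "metrics-endpoint": {"interface": "prometheus_scrape"},
        "grafana-dashboard": {"interface": "grafana_dashboard"},
    }


if __name__ == "__main__":
    print(metrics_service_port())
    print(scrape_jobs())
    print(relation_endpoints())
```

Keeping the port and path in one place means the ServicePort and the scrape job cannot drift apart if the upstream value changes.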

Definition of Done

  1. tensorboard-controller has metrics-endpoint and grafana-dashboard endpoints
  2. Integration with COS is tested

syncronize-issues-to-jira[bot] commented 1 month ago

Thank you for reporting your feedback to us!

The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-6007.

This message was autogenerated

rgildein commented 1 week ago

Fixed by #130