canonical / kubeflow-tensorboards-operator

Tensorboards Operator
Apache License 2.0

Add relation and expose metrics from tensorboard-controller #131

Open rgildein opened 3 months ago

rgildein commented 3 months ago

Context

Implement the metrics-endpoint for the tensorboard-controller once #7633 is fixed upstream and the changes are included in a release (e.g. 1.10).

We need to add a relation using the prometheus_scrape interface and, if any dashboard is provided, another using the grafana_dashboard interface.

What needs to get done

  1. Expose the metrics port as a ServicePort
  2. Implement the metrics-endpoint endpoint
  3. [Optional] Implement grafana-dashboard endpoint
  4. [Optional] Find and add alert rules for metrics from upstream
  5. [Optional] Find and add Grafana dashboards for metrics from upstream
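
As a rough illustration of steps 1 to 3, here is a minimal sketch of the data the charm would need to declare. It uses plain Python dicts only, so it runs without any charm libraries. The port `8080`, the path `/metrics`, and the helper names are assumptions for illustration; the actual values depend on what upstream exposes once #7633 lands. In a real charm the scrape jobs would typically be passed to `MetricsEndpointProvider` from the prometheus_scrape charm library, and the relation entries would live in the charm metadata.

```python
# Hypothetical values: the issue does not state the metrics port or path.
METRICS_PORT = 8080
METRICS_PATH = "/metrics"


def build_relations(with_dashboard: bool = False) -> dict:
    """Return the `provides` entries the charm metadata would declare."""
    provides = {"metrics-endpoint": {"interface": "prometheus_scrape"}}
    if with_dashboard:
        # Optional: only needed when a Grafana dashboard is shipped.
        provides["grafana-dashboard"] = {"interface": "grafana_dashboard"}
    return provides


def build_service_port(port: int = METRICS_PORT) -> dict:
    """Describe the metrics port in the shape of a Kubernetes ServicePort."""
    return {"name": "metrics", "port": port, "targetPort": port, "protocol": "TCP"}


def build_scrape_jobs(port: int = METRICS_PORT, path: str = METRICS_PATH) -> list:
    """Build a scrape job list; "*" stands for every unit of the application."""
    return [
        {
            "metrics_path": path,
            "static_configs": [{"targets": [f"*:{port}"]}],
        }
    ]


if __name__ == "__main__":
    print(build_relations(with_dashboard=True))
    print(build_service_port())
    print(build_scrape_jobs())
```

Keeping these as small builder functions makes it easy to change the port in one place if the upstream release uses a different one.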

Definition of Done

  1. tensorboard-controller has metrics-endpoint and grafana-dashboard endpoints
  2. Integration with COS is tested

syncronize-issues-to-jira[bot] commented 3 months ago

Thank you for reporting your feedback to us!

The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-6107.

This message was autogenerated