kubeflow / katib

Automated Machine Learning on Kubernetes
https://www.kubeflow.org/docs/components/katib
Apache License 2.0

Use Kubeflow metadata for metrics collection #862

Open jlewi opened 5 years ago

jlewi commented 5 years ago

/kind feature

Describe the solution you'd like

Right now Katib depends on logging the metrics to stdout (see #685).

It would be nice if instead Katib could be configured to use Kubeflow metadata to obtain the metrics.

Here's a strawman for how this might work

  1. The user adds logging statements to their code to log metrics to metadata with an appropriate set of labels (e.g. experiment and trial)
  2. Katib uses a selector to match trials to metrics in metadata
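The two steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: `InMemoryMetadataStore`, `log_metric`, and `select` are hypothetical names invented for this sketch and are not the Kubeflow metadata SDK or any Katib API. The label keys `experiment` and `trial` follow the example in step 1.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MetricRecord:
    """One logged metric value plus the labels attached to it."""
    name: str
    value: float
    labels: Dict[str, str] = field(default_factory=dict)


class InMemoryMetadataStore:
    """Hypothetical stand-in for a metadata backend (not the real SDK)."""

    def __init__(self) -> None:
        self._records: List[MetricRecord] = []

    def log_metric(self, name: str, value: float, labels: Dict[str, str]) -> None:
        # Step 1: user code logs a metric together with its labels.
        self._records.append(MetricRecord(name, value, dict(labels)))

    def select(self, selector: Dict[str, str]) -> List[MetricRecord]:
        # Step 2: return records whose labels contain every selector pair.
        return [
            r for r in self._records
            if all(r.labels.get(k) == v for k, v in selector.items())
        ]


store = InMemoryMetadataStore()

# Training code for two trials of the same experiment logs its metrics.
store.log_metric("accuracy", 0.91, {"experiment": "exp-a", "trial": "trial-1"})
store.log_metric("accuracy", 0.87, {"experiment": "exp-a", "trial": "trial-2"})

# The tuner side matches a trial to its metrics with a label selector.
trial_1_metrics = store.select({"experiment": "exp-a", "trial": "trial-1"})
```

The selector here is a plain equality match on every key, similar in spirit to Kubernetes label selectors; a real integration would depend on what querying the metadata service actually supports.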

It seems natural for folks to instrument their code to log metrics to metadata.

Furthermore, using the metadata SDK to log metrics should mean that logging metrics to metadata is no more difficult than logging to stdout.

A side benefit would be that this avoids some of the side effects of using sidecars to fetch logs from stdout (#685).

/cc @zhenghuiwang @johnugeorge @gaocegege

hougangliu commented 5 years ago

@jlewi @zhenghuiwang In fact, all metrics are already persisted in the Katib DB (for now we only implement a MySQL driver), and we can implement a new DB driver for Kubeflow metadata, just like its MySQL counterpart.
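A rough sketch of what a pluggable driver could look like, assuming the storage layer sits behind a small interface. All names here (`MetricsDriver`, `InMemoryMetadataDriver`, `report`, `get`, `best_value`) are hypothetical and do not reflect Katib's actual DB interface; the in-memory class merely stands in for a metadata-backed implementation.

```python
from abc import ABC, abstractmethod
from collections import defaultdict
from typing import Dict, List, Tuple


class MetricsDriver(ABC):
    """Hypothetical storage interface that each backend would implement."""

    @abstractmethod
    def report(self, trial: str, metric: str, value: float) -> None:
        """Persist one metric observation for a trial."""

    @abstractmethod
    def get(self, trial: str) -> List[Tuple[str, float]]:
        """Return all (metric, value) observations for a trial."""


class InMemoryMetadataDriver(MetricsDriver):
    """Stand-in for a metadata-backed driver; keeps observations in a dict."""

    def __init__(self) -> None:
        self._data: Dict[str, List[Tuple[str, float]]] = defaultdict(list)

    def report(self, trial: str, metric: str, value: float) -> None:
        self._data[trial].append((metric, value))

    def get(self, trial: str) -> List[Tuple[str, float]]:
        return list(self._data.get(trial, []))


def best_value(driver: MetricsDriver, trial: str, metric: str) -> float:
    # Callers depend only on the interface, so the backend can be swapped.
    return max(v for m, v in driver.get(trial) if m == metric)


driver = InMemoryMetadataDriver()
driver.report("trial-1", "accuracy", 0.80)
driver.report("trial-1", "accuracy", 0.91)
driver.report("trial-1", "loss", 0.35)
```

Because `best_value` only sees `MetricsDriver`, a MySQL-backed and a metadata-backed driver would be interchangeable from the caller's point of view.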

jlewi commented 5 years ago

Out-of-the-box integration with metadata would be awesome.

gaocegege commented 5 years ago

I'm not sure what metadata requires. Right now we only use katib-db to store metrics. If metadata does not require any other abstraction, I think it should be easy to support.

johnugeorge commented 5 years ago

Related: https://github.com/kubeflow/katib/issues/841#issuecomment-537413455

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

andreyvelich commented 3 years ago

/lifecycle frozen