kubeflow / katib

Automated Machine Learning on Kubernetes
https://www.kubeflow.org/docs/components/katib
Apache License 2.0
1.45k stars 428 forks

Whether the hyperparameter search algorithm will refer to the value of additionalMetricNames #2286

Open YKL2436542696 opened 3 months ago

YKL2436542696 commented 3 months ago

If not, is the purpose of the additionalMetricNames parameter just visualization, so that it does not affect the experiment results? Please help me answer this question. Thank you very much! 🙏

andreyvelich commented 3 months ago

That is correct @YKL2436542696

Currently, additionalMetricNames is only used for metrics tracking purposes, so you can get those metrics in the Katib UI or through the Katib SDK. We have an open issue to support multi-objective optimization, but it has not been implemented yet: https://github.com/kubeflow/katib/issues/1549
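For readers following along, here is a minimal sketch of the fields under discussion, written as a Python dict that mirrors an Experiment's `spec.objective` block. The metric names (`accuracy`, `loss`), the goal value, and the two helper functions are hypothetical; they only illustrate the statement above that additional metrics are tracked while the objective metric alone drives the search. This is not Katib code.

```python
# Hypothetical objective block mirroring an Experiment's spec.objective.
objective = {
    "type": "maximize",
    "goal": 0.99,
    "objectiveMetricName": "accuracy",
    "additionalMetricNames": ["loss"],
    "metricStrategies": [
        {"name": "accuracy", "value": "max"},
        {"name": "loss", "value": "min"},
    ],
}


def tracked_metrics(objective):
    """All metric names collected for each Trial (objective plus additional)."""
    return [objective["objectiveMetricName"],
            *objective.get("additionalMetricNames", [])]


def optimized_metric(objective):
    """The single metric the search optimizes, per the comment above."""
    return objective["objectiveMetricName"]


print(tracked_metrics(objective))   # ['accuracy', 'loss']
print(optimized_metric(objective))  # accuracy
```

In this sketch `loss` shows up in the tracked list, so it would be visible in the UI or SDK results, but it plays no part in deciding which Trial is best.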

YKL2436542696 commented 3 months ago

@andreyvelich Thank you. Can I understand it this way? The strategies for additionalMetricNames in Spec.Objective.MetricStrategies currently have no practical effect. (They may start to matter once multi-objective optimization is supported.)

YKL2436542696 commented 3 months ago

I also noticed that the metric values returned by the "/katib/fetch_hp_job_info/" interface in the katib-ui module are based on spec.Objective.Type and have nothing to do with Spec.Objective.MetricStrategies. I'm curious why the metric values are not output according to the strategy defined by Spec.Objective.MetricStrategies 🤯

andreyvelich commented 3 months ago

> The strategies for additionalMetricNames in Spec.Objective.MetricStrategies currently have no practical effect.

That's right. We check whether the Experiment goal has been reached based on the objective metric name and its metric strategy: https://github.com/kubeflow/katib/blob/master/pkg/controller.v1beta1/experiment/util/status_util.go#L166

However, we send those metrics to the suggestion service based on the strategies: https://github.com/kubeflow/katib/blob/master/pkg/controller.v1beta1/suggestion/suggestionclient/suggestionclient.go#L419. So if a user implements a custom algorithm service that analyzes all Trial metrics, the strategies for the additional metrics might provide some value.
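A rough sketch of that selection step, assuming each entry in a Trial's `.status.observation.metrics` carries `name`, `min`, `max`, and `latest` values as strings. The function name `values_for_suggestion` and the sample numbers are made up for illustration; the actual logic lives in the Go file linked above.

```python
def values_for_suggestion(observation_metrics, metric_strategies):
    """Pick one value per metric according to its strategy (min, max, or latest)."""
    strategy_by_name = {s["name"]: s["value"] for s in metric_strategies}
    selected = {}
    for metric in observation_metrics:
        strategy = strategy_by_name.get(metric["name"])
        if strategy not in ("min", "max", "latest"):
            continue  # no usable strategy for this metric; skipped in this sketch
        selected[metric["name"]] = float(metric[strategy])
    return selected


# Hypothetical observation for one Trial.
observation = [
    {"name": "accuracy", "min": "0.71", "max": "0.93", "latest": "0.90"},
    {"name": "loss", "min": "0.21", "max": "1.40", "latest": "0.25"},
]
strategies = [
    {"name": "accuracy", "value": "max"},
    {"name": "loss", "value": "min"},
]

print(values_for_suggestion(observation, strategies))
# {'accuracy': 0.93, 'loss': 0.21}
```

A custom algorithm service that receives both values could, for example, weigh `loss` alongside `accuracy`, which is where the strategies for additional metrics would start to matter.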

andreyvelich commented 3 months ago

> I'm curious why the metric values are not output according to the strategy defined by Spec.Objective.MetricStrategies

That's a good point. The UI should take the Trial metric results from .status.observation and pick the appropriate value based on the strategies, as we do for the Suggestion Service, rather than reading from the metrics logs as we do right now: https://github.com/kubeflow/katib/blob/master/pkg/ui/v1beta1/hp.go#L156-L166. Thanks for reporting!
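To make the mismatch concrete, here is a small sketch comparing the two approaches. It assumes, based only on the description in this thread, that the UI currently reduces raw metric logs using the objective type alone. `from_logs_by_objective_type` and `from_observation_by_strategy` are hypothetical helpers with invented sample data, not functions from `hp.go`.

```python
def from_logs_by_objective_type(metric_logs, objective_type):
    """Behavior as described in this thread: every metric is reduced the same way."""
    reduce = max if objective_type == "maximize" else min
    return {name: reduce(values) for name, values in metric_logs.items()}


def from_observation_by_strategy(observation_metrics, metric_strategies):
    """Proposed behavior: read the value named by each metric's own strategy."""
    strategy_by_name = {s["name"]: s["value"] for s in metric_strategies}
    return {
        m["name"]: float(m[strategy_by_name[m["name"]]])
        for m in observation_metrics
        if m["name"] in strategy_by_name
    }


# Hypothetical data for a single Trial in a "maximize" Experiment.
trial_logs = {"accuracy": [0.71, 0.93, 0.90], "loss": [1.40, 0.21, 0.25]}
trial_observation = [
    {"name": "accuracy", "min": "0.71", "max": "0.93", "latest": "0.90"},
    {"name": "loss", "min": "0.21", "max": "1.40", "latest": "0.25"},
]
trial_strategies = [
    {"name": "accuracy", "value": "max"},
    {"name": "loss", "value": "min"},
]

print(from_logs_by_objective_type(trial_logs, "maximize"))
# {'accuracy': 0.93, 'loss': 1.4}
print(from_observation_by_strategy(trial_observation, trial_strategies))
# {'accuracy': 0.93, 'loss': 0.21}
```

With these numbers the log-based reduction reports the worst `loss` (1.4) because the Experiment maximizes, while the strategy-based lookup reports the intended minimum (0.21).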

/bug /area frontend /good-first-issue

google-oss-prow[bot] commented 3 months ago

@andreyvelich: This request has been marked as suitable for new contributors.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed by commenting with the /remove-good-first-issue command.

andreyvelich commented 3 months ago

/kind bug

live2awesome commented 3 months ago

I am interested in solving this. /assign

xr-dev-saurabh commented 3 months ago

/assign

andreyvelich commented 3 months ago

Thank you for your interest, @xr-dev-saurabh. As far as I can see, @live2awesome is already working on this issue. Please feel free to work on other issues that don't have an assignee. /assign @live2awesome

xr-dev-saurabh commented 3 months ago

/unassign

github-actions[bot] commented 1 week ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

andreyvelich commented 1 week ago

@xr-dev-saurabh @live2awesome Do you still want to work on this issue?