kubeflow / katib

Automated Machine Learning on Kubernetes
https://www.kubeflow.org/docs/components/katib
Apache License 2.0
1.5k stars 441 forks source link

GetSuggestion API change #463

Closed johnugeorge closed 5 years ago

johnugeorge commented 5 years ago

Currently GetSuggestionsRequest and GetSuggestionsReply structure are as follows

message GetSuggestionsRequest {
    string experiment_name = 1;
    string algorithm_name = 2;
    int32 request_number = 3; ///The number of Suggestion you request at one time. When you set 3 to request_number, you can get three Suggestions at one time.
}

message GetSuggestionsReply {
    repeated Trial trials = 1;
}
  1. Currently, GetSuggestions and ValidateAlgorithmSettings API calls are called to corresponding suggestion service via Manager Service. (https://github.com/kubeflow/katib/blob/master/cmd/manager/v1alpha2/main.go#L138) Is it actually needed? Why can't the controller directly connect to the suggestion service?

2.Can GetSuggestionsReply have just ParameterAssignments(parameter name-value pair) instead of entire Trial? Suggestion algorithm need not construct entire Trial object which is a overhead. eg: Trial has TrialSpec and TrialStatus. SuggestionAlgorithm should not call CreateTrial. Instead Creating Trial object from ParameterAssignments should be left to the controller.

/cc @richardsliu /cc @YujiOshima /cc @jdplatt /cc @andreyvelich

WDYT?

richardsliu commented 5 years ago

I agree, returning ParameterAssignments is more consist with the semantics of a Get API. Also since we are defining Trial as its own resource type, its lifecycle should be handled by a controller.

andreyvelich commented 5 years ago
  1. Is it possible to call this functions from controller directly?
  2. As well, we send StudyID in Trial structure. You can see it here: https://github.com/kubeflow/katib/blob/master/pkg/suggestion/v1alpha1/nasrl_service.py#L356. We can define this structure like that:
message GetSuggestionsReply {
   string study_id = 1;
   message ParameterAssignments{
        repeated ParameterAssignment assignments = 1;
    }
    ParameterAssignments parameter_assignments = 2;
}

What do you think @johnugeorge ?

richardsliu commented 5 years ago

@andreyvelich For NAS wouldn't the GetSuggestionsReply need to contain the architecture and operations?

andreyvelich commented 5 years ago

@richardsliu Yes, but we store it in two different categorical parameters, architecture and nn_config: https://github.com/kubeflow/katib/blob/master/pkg/suggestion/v1alpha1/nasrl_service.py#L357

gaocegege commented 5 years ago

Currently, GetSuggestions and ValidateAlgorithmSettings API calls are called to corresponding suggestion service via Manager Service. (https://github.com/kubeflow/katib/blob/master/cmd/manager/v1alpha2/main.go#L138) Is it actually needed? Why can't the controller directly connect to the suggestion service?

I think we could eliminate manager as a GRPC server. I am working on a proposal for it.

gaocegege commented 5 years ago

@johnugeorge I have a proposal to eliminate katib-manager as a proxy between suggestion and experiment, PTAL #464

johnugeorge commented 5 years ago

/assign

YujiOshima commented 5 years ago

@johnugeorge I agree. You can call GetSuggestion from controller same as the manager do now. In GetSuggestion process, the manager is just a proxy. And only contain Assignments in GetSuggestionsReply will be more efficient.

gaocegege commented 5 years ago

/close by #743

gaocegege commented 5 years ago

/close

k8s-ci-robot commented 5 years ago

@gaocegege: Closing this issue.

In response to [this](https://github.com/kubeflow/katib/issues/463#issuecomment-532111064): >/close Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.