jina-ai / serve

☁️ Build multimodal AI applications with cloud-native stack
https://jina.ai/serve
Apache License 2.0

optimizer: GridSampler should automatically construct search space from parameters.yml #2906

Closed cristianmtr closed 3 years ago

cristianmtr commented 3 years ago

Currently we have to manually construct the search space and pass it as a kwarg.
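For reference, this is roughly what the manual step looks like today. optuna's GridSampler needs the full discrete search space up front, so it has to be written out by hand and handed to the sampler. The parameter names below are made up for illustration; only the optuna.samplers.GridSampler(search_space) call itself is the real API.

import optuna

# every value the grid should visit has to be spelled out manually
search_space = {
    'JINA_MYEXEC_PARAM1': [0, 1, 2],         # hypothetical integer parameter
    'JINA_MYEXEC_MODEL': ['bert', 'other'],  # hypothetical categorical parameter
}
sampler = optuna.samplers.GridSampler(search_space)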

RaghavPrabhakar66 commented 3 years ago

@cristianmtr I would like to take this issue.

maximilianwerk commented 3 years ago

Hey @RaghavPrabhakar66, thanks a lot for the initiative. Let me give you some guidance on what needs to be done:

After finishing this ticket, the following special handling for the GridSampler should not be necessary anymore: https://github.com/jina-ai/jina/blob/c0b04ffae9f2db818e56e170c5ce4efdb64497bb/tests/integration/optimizers/test_optimizer.py#L71

So the grid_sampler_search_space should be built in the FlowOptimizer.optimize_flow method. Most probably the best way is to add a get_values_list method to all classes inheriting from OptimizationParameter. For the following classes, computing an exhaustive list is straightforward: IntegerParameter, CategoricalParameter, ExecutorAlternativeParameter. For the other classes, we need to sample some values in a reliable way so that we end up with a discrete list.
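A minimal sketch of what such a get_values_list could look like for the two easiest cases, using simplified stand-in classes rather than the real ones from jina/optimizers/parameters.py. The attribute names low, high, step_size and choices are taken from this thread; everything else is illustrative.

from dataclasses import dataclass
from typing import Any, List


@dataclass
class IntegerParameter:  # simplified stand-in for the real class
    low: int
    high: int
    step_size: int = 1

    def get_values_list(self) -> List[int]:
        # exhaustive grid: every integer from low to high (inclusive) in steps of step_size
        return list(range(self.low, self.high + 1, self.step_size))


@dataclass
class CategoricalParameter:  # simplified stand-in for the real class
    choices: List[Any]

    def get_values_list(self) -> List[Any]:
        # the grid is simply the finite set of choices
        return list(self.choices)


# IntegerParameter(low=1, high=5, step_size=2).get_values_list() -> [1, 3, 5]
# CategoricalParameter(choices=['Bert1', 'Bert2']).get_values_list() -> ['Bert1', 'Bert2']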

I hope this helps. If you have any further questions, please raise them as they appear.

Cheers!

RaghavPrabhakar66 commented 3 years ago

Hi, thanks for the guidance. I have some doubts regarding the implementation. I came across the load_optimization_parameters() function, and it returns a list of objects for all parameters. I was wondering if we could use that rather than creating a method in every class. This is based on the assumption that the objects returned by load_optimization_parameters have low, high and step_size attributes. It could work like this:

self.parameters = load_optimization_parameters(self._parameter_yaml)

# build the search-space dict keyed by each parameter's JAML variable name
self.search_space = {}
for parameter in self.parameters:
    self.search_space[parameter.jaml_variable] = [parameter.low, parameter.high, parameter.step_size]

What are your thoughts? @maximilianwerk

maximilianwerk commented 3 years ago

It is based on the assumption that objects returned by load_optimization_parameters have low, high, step_size attributes.

This assumption does not hold. Take a CategoricalParameter: it only has choices as an attribute. It could be something like model_name with the values Bert1, Bert2 and MyOtherFancyModel.
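In other words, the loop sketched above would need to branch on the parameter type rather than assume low/high/step_size on everything. A hedged sketch, reusing the jaml_variable attribute from the snippet above; the helper name and the hasattr-based dispatch are only illustrative.

def build_grid_search_space(parameters):
    # parameters: the objects returned by load_optimization_parameters()
    search_space = {}
    for parameter in parameters:
        if hasattr(parameter, 'choices'):
            # CategoricalParameter: the grid is the finite list of choices
            values = list(parameter.choices)
        else:
            # integer-style parameters: enumerate low..high with step_size
            values = list(range(parameter.low, parameter.high + 1, parameter.step_size))
        search_space[parameter.jaml_variable] = values
    return search_space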

RaghavPrabhakar66 commented 3 years ago

Hi @maximilianwerk, I have some doubts regarding the implementation. Can you share some examples of ExecutorAlternativeParameter? Parameters such as UniformParameter, LogUniformParameter and DiscreteUniformParameter have attributes such as low and high, so their values can easily be extracted by an add_to_list() method. Am I right?

maximilianwerk commented 3 years ago

Hi @maximilianwerk, I have some doubts regarding the implementation. Can you share some examples of ExecutorAlternativeParameter? Parameters such as UniformParameter, LogUniformParameter and DiscreteUniformParameter have attributes such as low and high, so their values can easily be extracted by an add_to_list() method. Am I right?

I am not sure if I get your question. The goal of this issue is to build the grid_sampler_search_space automatically, instead of by hand as in the test I mentioned above. For UniformParameter, LogUniformParameter and DiscreteUniformParameter you need to sample some values (ideally in a reliable way) such that you get a discrete list of values. The ExecutorAlternativeParameter is not easily combinable with the GridSampler, since the parameters are separate for each alternative Executor.
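A hedged sketch of one way to get such a discrete list for the continuous parameter types; numpy and the number of samples are choices made here for illustration, nothing in this thread prescribes them.

import numpy as np


def discretize(low: float, high: float, n: int = 5, log: bool = False) -> list:
    # deterministic "sampling": n evenly spaced values between low and high,
    # spaced evenly in log space for a LogUniformParameter-style range
    if log:
        return list(np.geomspace(low, high, num=n))
    return list(np.linspace(low, high, num=n))


# discretize(0.0, 1.0, n=5)             -> [0.0, 0.25, 0.5, 0.75, 1.0]
# discretize(1e-5, 1e-1, n=5, log=True) -> [1e-05, 0.0001, 0.001, 0.01, 0.1]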