Closed. jlewi closed this issue 6 years ago.
Opening this issue to see if there is any interest in adding capabilities to manage hyperparameter tuning. Support grid search to begin with?
The more I think about it, the more I think this should not be handled by TfJob but by a higher abstraction, such as the dashboard and other tools that will ultimately interact with TfJob.
Users might want to use different strategies that may be difficult to express in YAML but easier to support in a frontend.
Would any optimizations be possible if we handled HP tuning directly at the TfJob level versus higher up?
I don't think HP tuning should be handled by TfJob. I think a hyperparameter tuning system should be implemented as a set of loosely coupled components. Here are some of the components I see:
I think we can define appropriate APIs and interfaces for each component so that particular components aren't tied to the implementation details of other components. For example, I don't think there's any reason why doing a grid search should care whether we are training a deep net using TF or a decision tree model using XGBoost.
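To illustrate the kind of decoupling I mean, here is a minimal, hypothetical sketch. The names `SuggestionAlgorithm`, `Trial`, and `GridSearch` are made up for this example and are not an existing Kubeflow or TfJob API; the point is that the suggestion component only sees parameter assignments and objective values, never the training framework.

```python
# Hypothetical sketch only; these names are illustrative, not an existing API.
from abc import ABC, abstractmethod
from dataclasses import dataclass
from itertools import product
from typing import Dict, List, Optional


@dataclass
class Trial:
    """One hyperparameter assignment plus the objective reported by the trainer."""
    params: Dict[str, float]
    objective: Optional[float] = None  # filled in after the training job finishes


class SuggestionAlgorithm(ABC):
    """Proposes trials; knows nothing about TF, XGBoost, or how jobs are run."""

    @abstractmethod
    def suggest(self, completed: List[Trial], count: int) -> List[Trial]:
        ...


class GridSearch(SuggestionAlgorithm):
    """Enumerates the cartesian product of the search space, one trial per point."""

    def __init__(self, search_space: Dict[str, List[float]]):
        keys = sorted(search_space)
        self._grid = [dict(zip(keys, values))
                      for values in product(*(search_space[k] for k in keys))]

    def suggest(self, completed: List[Trial], count: int) -> List[Trial]:
        tried = [t.params for t in completed]
        remaining = [p for p in self._grid if p not in tried]
        return [Trial(params=p) for p in remaining[:count]]
```

With an interface like this, only the component that actually launches training runs needs to know how a trial's parameters are turned into a TfJob (or an XGBoost job); the search strategy stays framework-agnostic.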
I don't necessarily want to focus on the HP tuning algorithms themselves; that's beyond my expertise. I'd rather focus on building the infrastructure that makes it easy to plug in new algorithms.
From the API's point of view, doing HP tuning is no different from firing up a bunch of parallelizable TfJobs (and, of course, the API won't care about their relationships, just that they are independent).
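As a rough sketch of that view, assuming the Kubernetes Python client and a TFJob CRD: HP tuning is just creating N independent TFJob objects. The CRD group/version and the spec layout below are assumptions and depend on the tf-operator release actually installed.

```python
# Rough sketch: one independent TFJob per hyperparameter setting, created
# through the generic Kubernetes custom-objects API. The CRD group/version
# and spec layout are assumptions; adjust them to the installed tf-operator.
from kubernetes import client, config

HYPERPARAMS = [{"learning_rate": lr} for lr in (0.1, 0.01, 0.001)]

config.load_kube_config()
api = client.CustomObjectsApi()

for i, hp in enumerate(HYPERPARAMS):
    body = {
        "apiVersion": "kubeflow.org/v1",  # assumed CRD version
        "kind": "TFJob",
        "metadata": {"name": f"hp-trial-{i}"},
        "spec": {
            "tfReplicaSpecs": {  # field name depends on the TFJob API version
                "Worker": {
                    "replicas": 1,
                    "template": {
                        "spec": {
                            "containers": [{
                                "name": "tensorflow",
                                "image": "my-trainer:latest",  # placeholder image
                                "args": [f"--learning-rate={hp['learning_rate']}"],
                            }],
                            "restartPolicy": "Never",
                        }
                    },
                }
            }
        },
    }
    api.create_namespaced_custom_object(
        group="kubeflow.org", version="v1",
        namespace="default", plural="tfjobs", body=body)
```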
Is it arriving in production: https://deepmind.com/blog/population-based-training-neural-networks/?
Would love to see hyper-param tuning added to Kubeflow ... maybe we don't need something as sophisticated as https://research.google.com/pubs/pub46180.html, but having something simple for starters might be good.
@ddutta
Google Vizier is a good reference for us since it applies at large scale. And there is an open source implementation that can be used from a Jupyter Notebook: https://github.com/tobegit3hub/advisor; we could try it out to see if it is what we want to implement.
@gaocegege Thx. We will try it out. We have a version which we could contribute/merge too.
Hi @gaocegege @ddutta @Jimexist @wbuchwalter @bhack @jlewi, I have also been interested in parameter tuning systems, and I'm developing a Vizier clone that integrates with Kubernetes. The tuning system itself also runs on Kubernetes. It's mostly functional, and I use it internally for our team. It supports the grid, random, and Hyperband search algorithms. I hope to collaborate with the Kubeflow community!
The code is here. I would be happy to get your comments and feedback.
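(For contrast with the grid example earlier in the thread, here is a tiny standalone sketch of the random-search strategy mentioned above; it is illustrative only and is not the API of the linked repository.)

```python
# Illustrative only; not the linked project's API. Random search samples
# each hyperparameter uniformly from its range and emits independent trials.
import random
from typing import Dict, List, Tuple


def random_search(bounds: Dict[str, Tuple[float, float]],
                  num_trials: int, seed: int = 0) -> List[Dict[str, float]]:
    rng = random.Random(seed)
    return [
        {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
        for _ in range(num_trials)
    ]


# Example: three independent trials over learning rate and dropout.
trials = random_search({"learning_rate": (1e-4, 1e-1), "dropout": (0.1, 0.5)}, 3)
```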
@YujiOshima
Thanks for the information! I will take a look.
And we created a hyperparameter-tuning channel in Slack: https://kubeflow.slack.com/messages/C9ZLKR73L/ Comments are welcome :tada:
/cc @DjangoPeng @ddysher
This is the client library (to show the APIs) of the tool we built internally --> https://github.com/CiscoAI/hyper-advisor-client.git
I am closing the issue since we have a new repo for hyperparameter tuning: https://github.com/kubeflow/hp-tuning
Thank you all :tada: