What's few-shot/zero-shot HPO
Few-shot HPO is an HPO solution that is both promising in performance and budget-friendly. It is divided into two processes: offline processing and online processing. In offline processing, it searches a given search space over a collection of datasets and selects a hyper-parameter set {c} based on certain criteria. The dimension of the search space and the number of datasets can be very large in order to cover as many dataset distributions as possible. In online processing, it picks N hyper-parameter configurations from {c} based on the user's training dataset. If N = 1, the HPO algorithm is called zero-shot, as it only makes one attempt on the training dataset. If N > 1, it is called few-shot.
The advantage of few-shot/zero-shot HPO is that it introduces prior knowledge learned from the offline process, which helps resolve the cold-start problem. With the help of that prior knowledge, it also avoids exploring dead-end configurations.
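As a concrete illustration of the online process, here is a minimal sketch that tries the first N configurations from a pre-computed portfolio {c} on the user's dataset and keeps the best one. All names here (`Config`, `FewShotSearch`, the `evaluate` delegate) are hypothetical stand-ins for whatever the offline process produces; this is not ML.Net code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical configuration record; a real portfolio would carry
// whatever hyper-parameters the offline search was run over.
public record Config(double LearningRate, int NumberOfTrees);

public static class FewShotSearch
{
    // Online process: try the first n entries of the offline portfolio {c}
    // on the user's training dataset and return the best-scoring one.
    // n == 1 is zero-shot (a single attempt); n > 1 is few-shot.
    public static (Config Config, double Score) Run(
        IReadOnlyList<Config> portfolio,
        Func<Config, double> evaluate, // trains/validates on user data, returns a metric
        int n)
    {
        return portfolio
            .Take(n)
            .Select(c => (Config: c, Score: evaluate(c)))
            .MaxBy(t => t.Score);
    }
}
```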
What will benefit most from few-shot/zero-shot HPO
Deep learning scenarios such as image classification and NLP tasks.
How will few-shot/zero-shot HPO be leveraged in ML.Net
Few-shot/zero-shot HPO will be leveraged as a tuning algorithm in AutoML.Net, just like other tuning algorithms.
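As a rough sketch of how such a tuner could slot in: the portfolio learned offline becomes the proposal source, and the trial loop supplies the budget N. The `ITuner` shape below is an assumption made for illustration, not the actual AutoML.Net interface; only the `AutoZeroTuner` name is taken from the work items below.

```csharp
using System.Collections.Generic;

// Assumed shape of a tuner contract; the real AutoML.Net interface may differ.
public interface ITuner
{
    Dictionary<string, object> Propose(); // next hyper-parameter configuration
    void Update(double metric);           // feedback from the finished trial
}

// Portfolio-backed tuner: proposals come from the offline-learned set {c}.
public class AutoZeroTuner : ITuner
{
    private readonly Queue<Dictionary<string, object>> _portfolio;

    public AutoZeroTuner(IEnumerable<Dictionary<string, object>> portfolio)
        => _portfolio = new Queue<Dictionary<string, object>>(portfolio);

    // Zero-shot: the trial budget stops after one Propose call;
    // few-shot: the loop keeps dequeuing until N trials are spent.
    public Dictionary<string, object> Propose() => _portfolio.Dequeue();

    // The portfolio is fixed offline, so trial feedback is not used here.
    public void Update(double metric) { }
}
```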
Work items
[x] add autozero tuner
[x] integrate with binary classification experiment
[ ] integrate with multi-classification experiment
@LittleLittleCloud I'll add this to the "Future" milestone, but if you are planning on finishing it before the next major release please change it to the 4.0 milestone.