chengsoonong opened 7 years ago
I would like to implement this for active learning, but I am having trouble seeing how Bayesian optimisation is equivalent to active learning. I found this paper (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.724.7020) and have started making my way through it.
Perhaps this could be achieved by subclassing Recommender
and calling one step of hyperopt.
I still can't think of how to phrase active learning as a Bayesian Optimisation problem… Have you thought about this, @MatthewJA?
I think you would optimise over the feature space, and then do a nearest-neighbour lookup to the optimum point.
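A minimal sketch of that idea, with a made-up acquisition function standing in for a real one (everything here is illustrative, not an actual implementation): optimise over the continuous feature space, then snap the optimum to the nearest real unlabelled example.

```python
import numpy as np

rng = np.random.default_rng(0)

X_labelled = rng.normal(size=(5, 2))   # points we already have labels for
X_pool = rng.normal(size=(100, 2))     # unlabelled examples we can query

def acquisition(x):
    # Toy stand-in for a real acquisition function: treat distance from
    # all labelled points as a proxy for model uncertainty.
    return np.min(np.linalg.norm(X_labelled - x, axis=1))

# Optimise over the (continuous) feature space; random search here,
# where real Bayesian optimisation would use a surrogate model.
candidates = rng.normal(size=(1000, 2))
x_star = candidates[np.argmax([acquisition(c) for c in candidates])]

# Nearest-neighbour lookup: map the optimum back to an actual pool example.
query_idx = int(np.argmin(np.linalg.norm(X_pool - x_star, axis=1)))
```

`query_idx` is then the example you would send to the oracle for labelling.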
That makes sense. What's your objective function though?
(This is very similar to earlier active learning strategies that just looked to argmax over the feature space instead of over the input data points.)
Good question... maybe expected model change?
In the end you want your model to be as good as possible. Maybe directly optimising a loss like that against a validation set is a good idea.
Can you optimise the loss without knowing the label?
Also, optimising the expected model change sounds reasonable. If you have a finite set of unlabelled examples, then Bayesian optimisation becomes just a heuristic to avoid having to evaluate the function for every unlabelled example. It makes a heap of sense if you can make up any example and give it to an oracle, since it's hard to optimise these objective functions without using Bayesian optimisation.
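To make that trade-off concrete, here is a toy sketch (the acquisition and the least-squares surrogate are made up for illustration; real Bayesian optimisation would use something like a Gaussian process). With a finite pool you *could* score every example exhaustively; the surrogate lets you pay for far fewer expensive evaluations.

```python
import numpy as np

rng = np.random.default_rng(2)

# Finite unlabelled pool. true_scores stands in for an expensive
# acquisition such as expected model change (hypothetical values).
X_pool = rng.normal(size=(200, 4))
true_scores = np.sum(X_pool ** 2, axis=1)   # pretend these are costly

evaluated = [int(i) for i in rng.choice(200, size=10, replace=False)]
for _ in range(10):
    # Cheap surrogate: least-squares fit from features to observed scores.
    A = X_pool[evaluated] ** 2
    coef, *_ = np.linalg.lstsq(A, true_scores[evaluated], rcond=None)
    pred = (X_pool ** 2) @ coef
    pred[evaluated] = -np.inf               # don't re-evaluate a point
    nxt = int(np.argmax(pred))              # most promising unevaluated point
    evaluated.append(nxt)                   # pay for one real evaluation

best = max(evaluated, key=lambda i: true_scores[i])
# 20 expensive evaluations instead of 200, yet usually near the pool optimum.
```

The exhaustive scan is exact but costs one expensive evaluation per pool point; the surrogate loop is the heuristic being described.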
You could optimise a proxy to the loss (e.g. loss on labelled set). It was not uncommon a while back to consider scenarios where you generated queries rather than sampled them, but I'm having trouble finding a good reference for that.
My question was more: you're trying to pick an example to label. How do you know how labelling it will affect the loss without actually knowing its label?
You have a probability from your predictor! Weight your results by that.
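As a sketch of that weighting, here is expected model change for a binary logistic-regression model (the weights and pool are made up; this is the expected-gradient-length flavour of the idea, not a definitive implementation): for each candidate, average the size of the parameter update over both possible labels, weighted by the predictor's own probability for each label.

```python
import numpy as np

rng = np.random.default_rng(1)

w = rng.normal(size=3)               # hypothetical current model weights
X_pool = rng.normal(size=(50, 3))    # unlabelled pool

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def expected_gradient_length(x, w):
    """Expected model change for candidate x, weighting each possible
    label y by the model's predicted probability of that label."""
    p = sigmoid(x @ w)                   # P(y=1 | x) from the predictor
    # Gradient of the log-loss w.r.t. w if the label were y: (p - y) * x.
    g1 = np.linalg.norm((p - 1.0) * x)   # if the label turns out to be 1
    g0 = np.linalg.norm((p - 0.0) * x)   # if the label turns out to be 0
    return p * g1 + (1.0 - p) * g0

scores = np.array([expected_gradient_length(x, w) for x in X_pool])
query_idx = int(np.argmax(scores))       # example whose label to request
```

So the unknown label is marginalised out using the model's current beliefs, which is exactly the weighting suggested here.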
http://hyperopt.github.io/hyperopt/