Open ianlini opened 8 years ago
How do we define T?
The problem here is that we do not know the exact values of N and T at initialization. T is the total number of recommendation rounds, while N is the number of experts.
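For context, here is a minimal sketch of where N and T enter the initialization. It assumes the usual textbook setting for Exp4.P, p_min = sqrt(ln N / (K T)), with K the number of actions. The function name and the clipping to 1/K are illustrative assumptions, not striatum's actual code.

```python
import math


def exp4p_p_min(n_experts, n_actions, max_rounds):
    """Minimum exploration probability for Exp4.P (hypothetical helper).

    Assumes p_min = sqrt(ln N / (K T)), where N = n_experts,
    K = n_actions and T = max_rounds are all fixed up front.
    """
    if n_experts < 1 or n_actions < 1 or max_rounds < 1:
        raise ValueError("N, K and T must all be positive")
    p_min = math.sqrt(math.log(n_experts) / (n_actions * max_rounds))
    # Each of the K actions gets at least p_min, so p_min cannot exceed 1/K.
    return min(p_min, 1.0 / n_actions)
```

Under this assumption, underestimating T makes p_min larger than intended (more forced exploration), and running past T rounds just keeps using a value tuned for a shorter horizon.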
I think we should fix N. What happens if we have more than T rounds?
I think the actions and experts should both be fixed... I don't think Exp4.P can handle changes of actions and experts reasonably... This is a big change, any idea? @yangarbiter @stegben @SoluMilken
Yeah, the original Exp4.P cannot handle new actions or new experts. For new actions, I think it's still okay if we retrain our experts. But I don't think the original algorithm can handle new experts.
After retraining the experts, I don't think the existing weights would still be valid, and choosing the initial weight for a new action is also a problem.
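To make the shape problem concrete, here is a rough sketch of the mixing step, assuming the standard Exp4.P form p_j = (1 - K p_min) * sum_i(w_i * advice[i][j]) / sum_i(w_i) + p_min. The function name and argument layout are hypothetical and not taken from striatum.

```python
def exp4p_action_probs(weights, advice, p_min):
    """Mix expert advice into action probabilities (illustrative sketch).

    weights: list of N positive floats, one per expert.
    advice:  list of N rows, each a probability distribution over K actions.
    """
    n_experts = len(advice)
    if len(weights) != n_experts:
        # A new expert has no learned weight; the algorithm does not say
        # how to initialize one mid-run.
        raise ValueError("need exactly one weight per expert")
    n_actions = len(advice[0])
    if any(len(row) != n_actions for row in advice):
        raise ValueError("every expert must score the same set of actions")
    total = sum(weights)
    probs = []
    for j in range(n_actions):
        mixed = sum(w * row[j] for w, row in zip(weights, advice)) / total
        probs.append((1.0 - n_actions * p_min) * mixed + p_min)
    return probs
```

The weights are indexed by expert, so adding an expert breaks the one-to-one mapping. Adding an action changes K, which also changes p_min and the meaning of every expert's advice row.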
https://github.com/ntucllab/striatum/blob/master/striatum/bandit/exp4p.py#L64 What's this? Any reference?