Alanthink / banditpylib

A lightweight Python library for bandit algorithms
https://alanthink.github.io/banditpylib-doc/
MIT License

time-varying signals #13

Open htcml opened 4 years ago

htcml commented 4 years ago

Can you add MAB algorithms that can handle time-varying signals? Perhaps non-stationary MAB algorithms are meant for this purpose?

choltz95 commented 4 years ago

I can implement one of the exponentially weighted methods, such as EXP3 or a variant. Simple modifications of existing algorithms, such as sliding-window UCB, or any of the adversarial strategies would also work. I think the main part of the implementation will be the environment.
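To make the sliding-window idea concrete, here is a minimal standalone sketch, not banditpylib code. It assumes Bernoulli rewards and a single abrupt change in the arm means. The names `SlidingWindowUCB`, `SwitchingBernoulliEnv`, and `run` are hypothetical and exist only for illustration. The learner is plain UCB1 whose counts and sums cover only the most recent `window` pulls, so old observations are forgotten.

```python
import math
import random
from collections import deque


class SwitchingBernoulliEnv:
    """Hypothetical environment: arm means change once, at `switch_time`."""

    def __init__(self, means_before, means_after, switch_time, seed=None):
        self.means_before = means_before
        self.means_after = means_after
        self.switch_time = switch_time
        self.t = 0
        self.rng = random.Random(seed)

    def pull(self, arm):
        before = self.t < self.switch_time
        means = self.means_before if before else self.means_after
        self.t += 1
        return 1.0 if self.rng.random() < means[arm] else 0.0


class SlidingWindowUCB:
    """UCB1 index computed only over the last `window` pulls."""

    def __init__(self, num_arms, window, seed=None):
        self.num_arms = num_arms
        self.window = window
        self.history = deque()  # (arm, reward) pairs, oldest first
        self.counts = [0] * num_arms
        self.sums = [0.0] * num_arms
        self.rng = random.Random(seed)

    def select_arm(self):
        # Any arm with no observation inside the window is played first.
        unseen = [a for a in range(self.num_arms) if self.counts[a] == 0]
        if unseen:
            return self.rng.choice(unseen)
        t = len(self.history)  # pulls currently inside the window

        def index(a):
            mean = self.sums[a] / self.counts[a]
            return mean + math.sqrt(2.0 * math.log(t) / self.counts[a])

        return max(range(self.num_arms), key=index)

    def update(self, arm, reward):
        self.history.append((arm, reward))
        self.counts[arm] += 1
        self.sums[arm] += reward
        # Forget the oldest observation once the window is full.
        if len(self.history) > self.window:
            old_arm, old_reward = self.history.popleft()
            self.counts[old_arm] -= 1
            self.sums[old_arm] -= old_reward


def run(horizon=4000, window=200, seed=0):
    """Return how often each arm was pulled during the last 500 rounds."""
    env = SwitchingBernoulliEnv([0.9, 0.1], [0.1, 0.9], horizon // 2, seed)
    learner = SlidingWindowUCB(num_arms=2, window=window, seed=seed + 1)
    late_pulls = [0, 0]
    for t in range(horizon):
        arm = learner.select_arm()
        learner.update(arm, env.pull(arm))
        if t >= horizon - 500:
            late_pulls[arm] += 1
    return late_pulls
```

Calling `run()` returns the late-round pull counts per arm. The window length trades off adaptation speed against estimation noise: a short window reacts quickly to a change but keeps exploring more.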

htcml commented 4 years ago

Just pointing this out for your reference: Daniel Russo outlines his approach on p. 43, Section 6.3 (non-stationary systems), of this tutorial: https://web.stanford.edu/~bvr/pubs/TS_Tutorial.pdf
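One way to let Thompson sampling track drifting rewards, in the spirit of the non-stationary discussion in that tutorial, is to decay every Beta posterior toward its prior on each round before adding the new observation. The sketch below assumes Bernoulli rewards and a Beta prior. The class name `DiscountedBetaTS` and the parameter `gamma` are hypothetical, and the exact update should be checked against the tutorial rather than taken from here.

```python
import random


class DiscountedBetaTS:
    """Beta-Bernoulli Thompson sampling with forgetting (illustrative sketch).

    `gamma` in (0, 1) controls how fast old evidence decays: larger values
    forget faster. With gamma close to 0 this approaches ordinary
    Thompson sampling.
    """

    def __init__(self, num_arms, gamma, prior_alpha=1.0, prior_beta=1.0,
                 seed=None):
        self.num_arms = num_arms
        self.gamma = gamma
        self.prior_alpha = prior_alpha
        self.prior_beta = prior_beta
        self.alpha = [prior_alpha] * num_arms
        self.beta = [prior_beta] * num_arms
        self.rng = random.Random(seed)

    def select_arm(self):
        # Sample a mean for each arm from its posterior, play the largest.
        samples = [self.rng.betavariate(a, b)
                   for a, b in zip(self.alpha, self.beta)]
        return max(range(self.num_arms), key=samples.__getitem__)

    def update(self, arm, reward):
        g = self.gamma
        # Every arm drifts back toward the prior, played or not.
        for k in range(self.num_arms):
            self.alpha[k] = (1 - g) * self.alpha[k] + g * self.prior_alpha
            self.beta[k] = (1 - g) * self.beta[k] + g * self.prior_beta
        # Only the played arm receives the new observation (reward in {0, 1}).
        self.alpha[arm] += reward
        self.beta[arm] += 1.0 - reward
```

Because of the decay, the posterior parameters stay bounded, so the learner never becomes fully certain about any arm and keeps exploring at a rate set by `gamma`.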

Alanthink commented 3 years ago

EXP3 is currently implemented for the ordinary multi-armed bandit setting.
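For readers unfamiliar with EXP3, a minimal standalone sketch of the textbook algorithm follows. It is not banditpylib's implementation and does not use its API. The class name `Exp3` and its methods are hypothetical, and rewards are assumed to lie in [0, 1]. Weights are kept in log space to avoid overflow on long runs.

```python
import math
import random


class Exp3:
    """Textbook EXP3 with uniform exploration rate `gamma` in (0, 1]."""

    def __init__(self, num_arms, gamma, seed=None):
        self.num_arms = num_arms
        self.gamma = gamma
        self.log_weights = [0.0] * num_arms
        self.rng = random.Random(seed)

    def probabilities(self):
        # Subtract the max log-weight before exponentiating for stability.
        m = max(self.log_weights)
        weights = [math.exp(lw - m) for lw in self.log_weights]
        total = sum(weights)
        k = self.num_arms
        return [(1 - self.gamma) * w / total + self.gamma / k
                for w in weights]

    def select_arm(self):
        probs = self.probabilities()
        return self.rng.choices(range(self.num_arms), weights=probs)[0]

    def update(self, arm, reward):
        # Importance-weighted reward estimate for the arm that was played.
        probs = self.probabilities()
        estimate = reward / probs[arm]
        self.log_weights[arm] += self.gamma * estimate / self.num_arms
```

A typical loop calls `select_arm()`, observes a reward from the environment, and passes both to `update(arm, reward)`. EXP3 makes no stationarity assumption on rewards, but its guarantee is regret against the best fixed arm, so tracking a changing best arm usually calls for a variant or a restart scheme.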