Upper Confidence Bound Learning.The Given Code is used to find which out of the 10 ads to be displayed on website for maximum Click Through Response by the user.
The dataset is virtual showing what the ith user would have done if one of the 10 ads was shown to him i.e. 1 specifying he would have clicked it and 0 means he would have ignored the add.
The 3 steps used for the mathematics given behind the algorithm is given for a better understanding.
Upper Confidence Bound Learning.The Given Code is used to find which out of the 10 ads to be displayed on website for maximum Click Through Response by the user. The dataset is virtual showing what the ith user would have done if one of the 10 ads was shown to him i.e. 1 specifying he would have clicked it and 0 means he would have ignored the add. The 3 steps used for the mathematics given behind the algorithm is given for a better understanding.