Summary:
I'm updating the training logic of LinUCB to track the average values of A and b instead of their cumulative sums. This should improve the numerical stability of training by preventing overflow as the accumulated values grow.
When computing the coefficients, the average values are aggregated across trainers and across epochs.
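
To illustrate the idea, here is a minimal sketch and not the code in this diff. It assumes two features, no regularization term, and count-weighted aggregation; the names `AvgStats`, `merge`, and `coefficients` are hypothetical. Because scaling A and b by the same factor cancels in the solve, the coefficients from averages match the ones from cumulative sums.

```python
from dataclasses import dataclass, field


@dataclass
class AvgStats:
    """Running averages of A = sum(x x^T) and b = sum(r x) for 2 features (hypothetical helper)."""
    avg_A: list = field(default_factory=lambda: [[0.0, 0.0], [0.0, 0.0]])
    avg_b: list = field(default_factory=lambda: [0.0, 0.0])
    count: float = 0.0

    def update(self, x, reward):
        # Incremental mean: avg += (new - avg) / n, so stored values stay bounded.
        self.count += 1.0
        for i in range(2):
            for j in range(2):
                self.avg_A[i][j] += (x[i] * x[j] - self.avg_A[i][j]) / self.count
            self.avg_b[i] += (reward * x[i] - self.avg_b[i]) / self.count


def merge(stats_list):
    """Count-weighted mean of per-trainer (or per-epoch) averages (assumed scheme)."""
    total = sum(s.count for s in stats_list)
    merged = AvgStats(count=total)
    for s in stats_list:
        w = s.count / total
        for i in range(2):
            for j in range(2):
                merged.avg_A[i][j] += w * s.avg_A[i][j]
            merged.avg_b[i] += w * s.avg_b[i]
    return merged


def coefficients(stats):
    """Solve avg_A @ theta = avg_b with Cramer's rule (2x2 only)."""
    (a, b), (c, d) = stats.avg_A
    e, f = stats.avg_b
    det = a * d - b * c
    return [(e * d - b * f) / det, (a * f - e * c) / det]
```

A real implementation would typically add a regularization term to A and use a general linear solver; both are left out here to keep the sketch short.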
Differential Revision: D42334470