Create UCB1-tuned bandit endpoint

Expanding on #354

REST endpoint url/bandit/ucb

* Added variance to SampleArm; the default is None. If variance is not given, we compute variance = total * p * (1 - p), where p = win / total (the variance of Bernoulli trials).
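A minimal sketch of that fallback, reading the formula as total * p * (1 - p). The function name `effective_variance` and the zero-sample branch are assumptions made for illustration, not the code in this PR:

```python
from typing import Optional


def effective_variance(win: float, total: int, variance: Optional[float] = None) -> float:
    """Return the supplied variance, or fall back to total * p * (1 - p)."""
    if variance is not None:
        # A variance sent with the arm always takes precedence.
        return variance
    if total == 0:
        # Assumption: with no samples there is nothing to estimate, so use 0.
        return 0.0
    p = win / total
    return total * p * (1 - p)
```

For example, `effective_variance(20, 25, 0.1)` returns the supplied 0.1, while `effective_variance(20, 25)` falls back to 25 * 0.8 * 0.2.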

We throw an error if loss is not 0, because this endpoint handles only the basic case where a failure results in zero loss.
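The check could look roughly like the sketch below. The helper name `check_zero_loss`, the use of plain dicts for arms, and the choice of `ValueError` are assumptions; the PR may validate this elsewhere and raise a different exception type.

```python
def check_zero_loss(arms_sampled: dict) -> None:
    """Raise ValueError if any arm reports a non-zero loss."""
    for name, arm in arms_sampled.items():
        # A missing "loss" key is treated as 0 (assumption).
        if arm.get("loss", 0) != 0:
            raise ValueError(f"arm {name!r} has loss={arm['loss']}; expected 0")
```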

Request:

    {
        "subtype": "UCB1-tuned",
        "historical_info": {
            "arms_sampled": {
                "arm1": {"win": 20, "loss": 0, "total": 25, "variance": 0.1},
                "arm2": {"win": 20, "loss": 0, "total": 30, "variance": 0.2},
                "arm3": {"win": 0, "loss": 0, "total": 0}
            }
        }
    }

Response:

    {
        "arm_allocations": {
            "arm1": 0.0,
            "arm2": 0.0,
            "arm3": 1.0
        },
        "winner": "arm3"
    }
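To illustrate why arm3 receives the full allocation, here is a rough sketch of the standard UCB1-tuned rule: unsampled arms are tried first, otherwise each arm is scored as mean + sqrt((ln n / n_j) * min(1/4, V_j)) with V_j = variance + sqrt(2 ln n / n_j). The function name, the even split among tied arms, the alphabetical tie-break for the winner, and the variance fallback are all assumptions for illustration; the endpoint's actual implementation may differ.

```python
import math


def ucb1_tuned_allocations(arms_sampled: dict) -> dict:
    """Sketch of UCB1-tuned allocation over {"name": {"win", "total", "variance"}}."""
    if not arms_sampled:
        raise ValueError("arms_sampled must contain at least one arm")

    # Arms that were never sampled are always tried first.
    unsampled = sorted(n for n, arm in arms_sampled.items() if arm["total"] == 0)
    if unsampled:
        winners = unsampled
    else:
        n = sum(arm["total"] for arm in arms_sampled.values())
        payoffs = {}
        for name, arm in arms_sampled.items():
            n_j = arm["total"]
            mean = arm["win"] / n_j
            variance = arm.get("variance")
            if variance is None:
                # Assumed fallback, following the PR description.
                variance = n_j * mean * (1 - mean)
            v_j = variance + math.sqrt(2 * math.log(n) / n_j)
            payoffs[name] = mean + math.sqrt(math.log(n) / n_j * min(0.25, v_j))
        best = max(payoffs.values())
        winners = sorted(name for name, p in payoffs.items() if p == best)

    # Split the allocation evenly among the winning arms (assumption).
    allocations = {name: 0.0 for name in arms_sampled}
    for name in winners:
        allocations[name] = 1.0 / len(winners)
    return {"arm_allocations": allocations, "winner": winners[0]}
```

Called with the `arms_sampled` from the request above, this sketch gives arm3 an allocation of 1.0 because it is the only arm with total 0. Once every arm has been sampled, the confidence-bound scores decide the winner instead.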

Yelp / MOE

Create UCB1-tuned bandit endpoint #366