-
First of all, thank you for fixing my issue #62.
I want to set up my CB model with custom `choice_names` (integers) so that the serial numbers of the choices in my example data are used. I got a TypeError when …
-
- I am trying to create a solution on AWS Personalize using a custom hyperparameter config.
- This is the error I am facing.
```
InvalidInputException Traceback (most recent call …
```
-
### Describe the bug
Hello,
I am currently working on the conditional contextual combinatorial bandit, i.e. CCB. I have 4 slots and a total of 12 actions. I am training my algorithm with the fo…
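For context, a sketch of what a CCB training example looks like in VW's text format (the slot label syntax `chosen_action:cost:probability` follows the VW wiki; all feature names here are made up, and the example is trimmed to 3 of the 12 actions and 2 of the 4 slots):

```text
ccb shared |User user_id=42
ccb action |Action id=a1
ccb action |Action id=a2
ccb action |Action id=a3
ccb slot 0:-1.0:0.5 |Slot position=1
ccb slot |Slot position=2
```

Note that VW minimizes cost, so a positive reward is logged as a negative cost (here `-1.0` for chosen action 0 at probability 0.5); slots without feedback carry no label.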
-
Hi,
I noticed that in the `fit` function of `_ThompsonSampling`, `contexts` is never passed to `self._parallel_fit(decisions, rewards)`. https://github.com/fidelity/mabwiser/blob/master/mabwiser/th…
-
Hello all!
I have been playing with LinUCB in an attempt to set up a recipe recommendation system using historical data. I have read through the original LinUCB paper as well as http://www.gatsby.u…
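For readers in a similar position, a minimal numpy sketch of disjoint LinUCB (following Li et al., 2010) may help clarify the per-arm state being discussed; the class and function names here are hypothetical, not from any particular library:

```python
import numpy as np

class LinUCBArm:
    """Per-arm state for disjoint LinUCB: A = I + sum(x x^T), b = sum(r x)."""
    def __init__(self, d, alpha=1.0):
        self.alpha = alpha
        self.A = np.eye(d)       # d x d regularized design matrix
        self.b = np.zeros(d)     # reward-weighted feature sum

    def ucb(self, x):
        # theta_hat = A^-1 b; score = theta_hat . x + alpha * sqrt(x^T A^-1 x)
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b
        return float(theta @ x + self.alpha * np.sqrt(x @ A_inv @ x))

    def update(self, x, reward):
        self.A += np.outer(x, x)
        self.b += reward * x

def choose(arms, x):
    """Pick the arm index with the highest upper confidence bound for context x."""
    return max(range(len(arms)), key=lambda i: arms[i].ucb(x))
```

With historical data, one common approach is to replay each logged (context, action, reward) triple and update only the arm that was actually played.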
-
Hi Mark,
We currently use the `LoggedInteraction`'s IPS estimator to compare the accumulated reward of VW models with non-VW baselines, such as the random policy, to analyze if there's something fo…
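For readers unfamiliar with the estimator being compared here, a standalone numpy sketch of plain inverse propensity scoring (this is not the VW `LoggedInteraction` API itself, just the formula it is based on; the function name is made up):

```python
import numpy as np

def ips_value(rewards, logged_probs, target_probs):
    """IPS estimate of a target policy's value from logged bandit data:
    V_hat = mean( r_i * pi(a_i | x_i) / p_i ), where p_i is the logging
    policy's probability of the logged action and target_probs[i] is the
    probability the evaluated policy assigns to that same logged action.
    """
    rewards = np.asarray(rewards, dtype=float)
    logged_probs = np.asarray(logged_probs, dtype=float)
    target_probs = np.asarray(target_probs, dtype=float)
    return float(np.mean(rewards * target_probs / logged_probs))
```

For a uniform-random baseline over K actions, the target probability of every logged action is simply 1/K, which makes such non-VW baselines easy to score on the same logs.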
-
- Arthur Juliani. [Learning Policies For Learning Policies — Meta Reinforcement Learning (RL²) in Tensorflow](https://medium.com/hackernoon/learning-policies-for-learning-policies-meta-reinforcement-…
-
The example in https://github.com/VowpalWabbit/vowpal_wabbit/wiki/Logged-Contextual-Bandit-Example assumes one knows the action probabilities. However, in many cases, these probabilities are unknown a…
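One common workaround when logging probabilities are missing is to estimate propensities from the logs themselves, e.g. by the empirical action frequency within each discrete context. A minimal sketch (the function name and data layout are assumptions, not part of the wiki example):

```python
from collections import Counter, defaultdict

def estimate_propensities(logs):
    """Estimate p(a | x) as the empirical frequency of action a within
    each (hashable, discrete) context x, as a stand-in for unknown
    logging probabilities.

    logs: iterable of (context, action) pairs.
    Returns {context: {action: probability}}.
    """
    counts = defaultdict(Counter)
    for context, action in logs:
        counts[context][action] += 1
    return {
        ctx: {a: n / sum(c.values()) for a, n in c.items()}
        for ctx, c in counts.items()
    }
```

The resulting estimates can then be plugged in as the logged probabilities, though estimated propensities add their own bias and work poorly for rare context/action pairs.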
-
@41ow1ives @ryubright
-
First of all, thank you very much for your contributions. They are very valuable in improving my understanding of the original paper.
I have a fundamental question regarding the implementation of the [Neural…