-
First of all, thank you for fixing my issue #62.
I want to set up my CB model with custom `choice_names` (integers) so that the serial numbers of the choices in my example data are used. I got a TypeError when …
-
- I am trying to create a solution on AWS Personalize using a custom hyperparameter config.
- This is the error I am facing.
```
InvalidInputException Traceback (most recent call …
```
-
### Describe the bug
Hello,
I am currently working on the conditional contextual combinatorial bandit, i.e. CCB. I have 4 slots and a total of 12 actions. I am training my algorithm with the fo…
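For context, a sketch of what a CCB training example looks like in VW's text format (the slot label syntax `chosen_action:cost:probability` follows the VW wiki; all feature names here are made up, and the example is trimmed to 3 of the 12 actions and 2 of the 4 slots):

```text
ccb shared |User user_id=42
ccb action |Action id=a1
ccb action |Action id=a2
ccb action |Action id=a3
ccb slot 0:-1.0:0.5 |Slot position=1
ccb slot |Slot position=2
```

Note that VW minimizes cost, so a positive reward is logged as a negative cost (here `-1.0` for chosen action 0 at probability 0.5); slots without feedback carry no label.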
-
Hi,
I noticed that in the `fit` function of `_ThompsonSampling`, `contexts` is never passed to `self._parallel_fit(decisions, rewards)`. https://github.com/fidelity/mabwiser/blob/master/mabwiser/th…
-
Hello all!
I have been playing with LinUCB in an attempt to set up a recipe recommendation system using historical data. I have read through the original LinUCB paper as well as http://www.gatsby.u…
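For readers in a similar position, a minimal numpy sketch of disjoint LinUCB (following Li et al., 2010) may help clarify the per-arm state being discussed; the class and function names here are hypothetical, not from any particular library:

```python
import numpy as np

class LinUCBArm:
    """Per-arm state for disjoint LinUCB: A = I + sum(x x^T), b = sum(r x)."""
    def __init__(self, d, alpha=1.0):
        self.alpha = alpha
        self.A = np.eye(d)       # d x d regularized design matrix
        self.b = np.zeros(d)     # reward-weighted feature sum

    def ucb(self, x):
        # theta_hat = A^-1 b; score = theta_hat . x + alpha * sqrt(x^T A^-1 x)
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b
        return float(theta @ x + self.alpha * np.sqrt(x @ A_inv @ x))

    def update(self, x, reward):
        self.A += np.outer(x, x)
        self.b += reward * x

def choose(arms, x):
    """Pick the arm index with the highest upper confidence bound for context x."""
    return max(range(len(arms)), key=lambda i: arms[i].ucb(x))
```

With historical data, one common approach is to replay each logged (context, action, reward) triple and update only the arm that was actually played.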
-
Hi Mark,
We currently use the `LoggedInteraction`'s IPS estimator to compare the accumulated reward of VW models with non-VW baselines, such as the random policy, to analyze if there's something fo…
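For readers unfamiliar with the estimator being compared here, a standalone numpy sketch of plain inverse propensity scoring (this is not the VW `LoggedInteraction` API itself, just the formula it is based on; the function name is made up):

```python
import numpy as np

def ips_value(rewards, logged_probs, target_probs):
    """IPS estimate of a target policy's value from logged bandit data:
    V_hat = mean( r_i * pi(a_i | x_i) / p_i ), where p_i is the logging
    policy's probability of the logged action and target_probs[i] is the
    probability the evaluated policy assigns to that same logged action.
    """
    rewards = np.asarray(rewards, dtype=float)
    logged_probs = np.asarray(logged_probs, dtype=float)
    target_probs = np.asarray(target_probs, dtype=float)
    return float(np.mean(rewards * target_probs / logged_probs))
```

For a uniform-random baseline over K actions, the target probability of every logged action is simply 1/K, which makes such non-VW baselines easy to score on the same logs.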
-
- Arthur Juliani. [Learning Policies For Learning Policies — Meta Reinforcement Learning (RL²) in Tensorflow](https://medium.com/hackernoon/learning-policies-for-learning-policies-meta-reinforcement-…
-
The example in https://github.com/VowpalWabbit/vowpal_wabbit/wiki/Logged-Contextual-Bandit-Example assumes one knows the action probabilities. However, in many cases, these probabilities are unknown a…
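One common workaround when logging probabilities are missing is to estimate propensities from the logs themselves, e.g. by the empirical action frequency within each discrete context. A minimal sketch (the function name and data layout are assumptions, not part of the wiki example):

```python
from collections import Counter, defaultdict

def estimate_propensities(logs):
    """Estimate p(a | x) as the empirical frequency of action a within
    each (hashable, discrete) context x, as a stand-in for unknown
    logging probabilities.

    logs: iterable of (context, action) pairs.
    Returns {context: {action: probability}}.
    """
    counts = defaultdict(Counter)
    for context, action in logs:
        counts[context][action] += 1
    return {
        ctx: {a: n / sum(c.values()) for a, n in c.items()}
        for ctx, c in counts.items()
    }
```

The resulting estimates can then be plugged in as the logged probabilities, though estimated propensities add their own bias and work poorly for rare context/action pairs.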
-
@41ow1ives @ryubright
-
First of all, thank you very much for your contributions. They are very valuable in improving my understanding of the original paper.
I have a fundamental question regarding the implementation of the [Neural…