contextual-bandit Search Results

Alanthink/banditpylib #15

Contextual Bandits?

Are contextual bandits within the scope of this library? Here is a paper for reference: https://arxiv.org/pdf/1810.09558.pdf

htcml updated 3 years ago

microsoft/learning-loop #13

Basic documentation

Curious how this serves RL and how it relates to known RL frameworks, e.g. SB3.

GeorgeS2019 updated 2 weeks ago

tensorflow/agents #791

Contextual Bandit Off-Policy Evaluation

Hi, I am currently working with "agents/tf_agents/bandits/". I am wondering where, or if, the classic Contextual Bandit off-policy evaluation procedures are present in TensorFlow. I mean exactly the…

vitorkrasniqi updated 1 year ago

VowpalWabbit/vowpal_wabbit #1542

progressive validation with Contextual Bandits

Hi, there are several usage questions about Contextual Bandits that I'd be happy to incorporate into the Wiki and Stack Overflow. 2. Is progressive validation applied when training with IPS? I'm no…

matanox updated 4 years ago

py-why/EconML #266

Recast contextual bandit problem as causal inference

I have a contextual bandit problem with (S - state, A - action, R - reward), where S is a high-dimensional vector, A is a continuous value, and R is a continuous value. How do I learn the optimal mapping function fr…

JunhaoWang updated 4 years ago

VowpalWabbit/vowpal_wabbit #4634

Contextual Bandit vowpal_wabbit training dataset validation

I am currently using the Vowpal Wabbit package to implement a Contextual Bandit use case. My use case is to provide categories (L1/L2/L3/L4/L5), considered the actions here, with personalized rankin…

pallavi080596 updated 1 year ago

VowpalWabbit/vowpal_wabbit #2023

Modularity problems in the contextual bandit stack

Breaking out the remaining work from https://github.com/VowpalWabbit/vowpal_wabbit/issues/1782: --cb_explore_adf should not need to know about the cb_type since that isn't used except in the cb_adf …

peterychang updated 3 years ago

SforAiDl/genrl #314

Evaluating performance of contextual bandit agents in exampl…

I have been playing around with the DCBTrainer and found some potential inconsistencies. 1) **StatlogData** example found [here](https://genrl.readthedocs.io/en/latest/usage/tutorials/bandit/contex…

TMorville updated 4 years ago

eHarmony/aloha #105

Add support for action dependent contextual bandit models

Currently, Aloha supports contextual bandit models only when the set of actions is constant. This ticket is to add support for contextual bandit models with action-dependent…

sahil-goyal updated 8 years ago

VowpalWabbit/vowpal_wabbit #4495

Inconsistent and unclear loss calculation for contextual ban…

The loss calculation for CB reductions is not consistent and not well documented. The current situation is: - `cb_adf` records loss as calculated by an IPS estimator, except for if CB type DR or DM…

jackgerrits updated 1 year ago

254 results for contextual-bandit
