contextual-bandit Search Results

254 results
for contextual-bandit

Best match


fairlearn/fairlearn #756

MetricFrame should support metrics that don't require y_true…

Right now `MetricFrame` only works with metrics with the signature `metric(y_true, y_pred)`, but its disaggregation functionality should be much more broadly applicable. Two use cases of interest: …

MiroDudik updated 3 years ago
74
Stable-Baselines-Team/stable-baselines3-contrib #211

Recurrent PPO Not Training Well on a Very Simple Environment…

### 🐛 Bug I've adapted the environment from this [blog post](https://medium.com/hackernoon/learning-policies-for-learning-policies-meta-reinforcement-learning-rl%C2%B2-in-tensorflow-b15b592a2ddf), …

sreejank updated 1 month ago
1
fairlearn/fairlearn #772

Counterfactual Analysis Documentation with VW Support

**Introduction:** Hey all, I am a participant of Microsoft's RL Open Source Fest this summer and I am very excited to begin contributing to FairLearn by writing out a documentation containing counterf…

wcheung-code updated 2 years ago
5
facebookresearch/ReAgent #509

Tutorial not work

follow https://reagent.ai/rasp_tutorial.html#installing-reagent , ./reagent/workflow/cli.py run reagent.workflow.training.identify_and_train_network "$CONFIG" /home/circleci/project/ReAgent/rea…

galoisking updated 3 years ago
1
VowpalWabbit/vowpal_wabbit #2790

CB ADF (usually?) requires quadratics to work properly

Per Paul, if you use `|shared` and `|action` you need -q sa, in general you might need `-q ::` ... we could warn if there are no quadratics specified, that's probably a bug If this is the case, …

peterychang updated 3 years ago
4
wantedly/machine-learning-round-table #150

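[2022/07/07] Machine Learning Reading Group

## Why The Machine Learning reading group is a meeting that aims to raise the level of "what engineers can solve with technology" by keeping up with the latest techniques and papers. prev. https://github.com/wantedly/machine-learning-round-table/issues/148 ## What If you have something you would like to talk about, comment here…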

hakubishin3 updated 2 years ago
4
VowpalWabbit/vowpal_wabbit #2635

Acceptable feature values types not well documented

### Description Looking through all of the available vw documentation, there doesn't seem to be any clear documentation on all of the valid input value types for features (at least that I can find)…

wmelton updated 4 years ago
3
le-liang/ResourceAllocationDelayedCSI #1

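A question

Hello, I have a question I would like to ask. In reinforcement learning, the next state is generally determined by the current state and action. If the next state is independent of the current state and action, is reinforcement learning still applicable? For example, in wireless communications the channel varies only with time: the channel state differs at every time step and does not depend on the current channel state or on the action (such as the allocated power). Is reinforcement learning still applicable in this case?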

Scorpio-y updated 5 years ago
4
VowpalWabbit/vowpal_wabbit #3919

Specific Continuous Range of Actions based on context

## Short description In one of the wiki pages, it is given that "However sometimes the actions that can be taken might be dependent on the context. In this case, one can specify examples by listing t…

kkchaitu27 updated 2 years ago
9
VowpalWabbit/vowpal_wabbit #4488

Proposal: Remove the concept of "reduction_features"

Reduction features was originally added (https://github.com/VowpalWabbit/vowpal_wabbit/pull/2282) as a means to split out the contents of a label that are also required for prediction. I think this…

jackgerrits updated 1 year ago
4
