policy-learning Search Results

1000+ results
for policy-learning


xin-pu/DeepSharp #8

Reinforcement Learning (强化学习)

# WIP: English version using Mermaid ## policy - [ ] policy-based learning (methods that learn a policy function) - [ ] value-based learning (methods that learn a value function) - [x] Dynamic programming learning -…

GeorgeS2019 updated 1 year ago
7
janeljs/janeljs.github.io #208

algorithms/programmers-%EB%B0%A9%EB%AC%B8-%EA%B8%B8%EC%9D%B4…

# Home | Jane's PS Blog [https://janeljs.github.io/algorithms/programmers-%EB%B0%A9%EB%AC%B8-%EA%B8%B8%EC%9D%B4/](https://janeljs.github.io/algorithms/programmers-%EB%B0%A9%EB%AC%B8-%EA…

utterances-bot updated 4 days ago
2
catboost/catboost #2747

Process fails in Kaggle (P100/T4x2)

Problem: Process fails catboost version: 1.2.7 Operating System: ??? (Kaggle) CPU: ??? (Kaggle) GPU: P100 / T4x2 Params: {'learning_rate': 0.270640171567353, 'iterations': 1100, 'depth': 8, 'l…

EpicUsaMan updated 2 months ago
2
facebookresearch/BenchMARL #143

Parallel collection and evaluation

Can _evaluation_loop use SyncDataCollector for non-vectorized envs so that the evaluation is also parallel? While running on Melting Pot envs, increasing n_envs_per_worker definitely improves execu…

gliese876b updated 2 days ago
9
Azure/AKS #2125

Vertical Scaling for AKS Managed Addons (CoreDns & Retina)

**What happened**: Many AKS maintained Pods are running with memory overcommitment, eg: - omsagent (Daemonset; up to 375 MB) - coredns (Deployment, up to 100MB) - kube-proxy (Daemonset; unlimite…

mblaschke updated 1 week ago
74
geopm/geopm #747

Accept region frequencies in EE agent policies

In order to synchronize learning between EE agents, there needs to be a way to tell an agent about something that has already been learned by one of its parents. Consider using the following approach:…

dannosliwcd updated 3 months ago
1
rasbt/machine-learning-book #189

Chapter 19, Reinforcement Learning, p. 691

I would like to get a slightly better understanding regarding the difference between the on-policy and off-policy as well as some clarifications regarding the formulas used to apply them. Namely, what…

Maryisme updated 3 months ago
2
accessframework/RAGent #1

About the dataset used in training policy generator

`@click.command() @click.option('--train_path', help='Huggingface dataset name', required=True ) @click.option('--out_dir', default='../checkpoints/', help='Output directory'…

eruka-w updated 1 month ago
2
hajisho/world_model2022_group22 #28

Flood-Fill Q-Learning Updates for Learning Redundant Policie…

bishopfunc updated 1 year ago
1
UniversalDependencies/docs #864

Script for reported speech re-annotation

@nschneid wrote the other day regarding the analysis of reported speech in response to: @MagaliDuran, that "the policy was recently changed but not fully updated in the guidelines". This took me b…

LarsAhrenberg updated 2 weeks ago
9
