-
Hello, I'm trying to train a policy for the VMAS simple spread environment using MAPPO and IPPO in Benchmarl.
However, I'm running into some issues while training, and it would be great if I could get an…
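In case it helps to rule out setup problems, here is a minimal sketch following BenchMARL's documented `Experiment` API (the `VmasTask.SIMPLE_SPREAD` task name is my assumption; check the task enums shipped with your installed version):
```py
from benchmarl.algorithms import MappoConfig  # IppoConfig for the IPPO run
from benchmarl.environments import VmasTask
from benchmarl.experiment import Experiment, ExperimentConfig
from benchmarl.models.mlp import MlpConfig

# Load BenchMARL's bundled YAML defaults and run MAPPO on the VMAS task.
experiment = Experiment(
    task=VmasTask.SIMPLE_SPREAD.get_from_yaml(),  # assumed task name
    algorithm_config=MappoConfig.get_from_yaml(),
    model_config=MlpConfig.get_from_yaml(),
    critic_model_config=MlpConfig.get_from_yaml(),
    seed=0,
    config=ExperimentConfig.get_from_yaml(),
)
experiment.run()
```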
-
# Pierre-Luc Bacon
The project description suggests that RLPy is mainly about value-function-based algorithms. However, I think it'd be nice to add Will Dabney's implementation of some of the popular…
-
```
WARNING: Logging before InitGoogleLogging() is written to STDERR
I1022 16:02:46.179663 104725 init.cc:47] Init commandline: dummy run.py --tryfromenv=use_pinned_memory,check_nan_inf,benchmark,warpctc…
```
-
https://scholar.google.com/scholar?hl=ja&as_sdt=0%2C5&q=Deterministic+Policy+Gradient+Algorithms&btnG=
-
http://proceedings.mlr.press/v32/silver14.pdf
-
Thank you for your great work!
I refactored the code [repo is here](https://github.com/baichen99/Finite-expression-method/blob/main/train_fex_possion.py), but it seems that the use of policy gradie…
-
- referring to [this part](https://github.com/openai/spinningup/blob/master/spinup/algos/pytorch/vpg/vpg.py#L240) from VPG
```py
# Get loss and info values before update
pi_l_old…
```
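For context, the referenced lines snapshot the policy loss before the update and then take one gradient step. A self-contained sketch of the same pattern (my toy `compute_loss_pi` and `data` stand in for spinningup's buffer and actor-critic; see the linked file for the real versions):
```py
import torch

pi = torch.nn.Linear(4, 2)  # toy policy network standing in for ac.pi
pi_optimizer = torch.optim.Adam(pi.parameters(), lr=3e-4)

def compute_loss_pi(data):
    # Vanilla policy gradient loss: -E[log pi(a|s) * advantage]
    logp = torch.log_softmax(pi(data["obs"]), dim=-1)
    logp_a = logp.gather(1, data["act"]).squeeze(1)
    return -(logp_a * data["adv"]).mean()

data = {"obs": torch.randn(8, 4),
        "act": torch.randint(0, 2, (8, 1)),
        "adv": torch.randn(8)}

# Get loss and info values before update (mirrors the linked lines)
pi_l_old = compute_loss_pi(data).item()

# Train policy with a single step of gradient descent
pi_optimizer.zero_grad()
loss_pi = compute_loss_pi(data)
loss_pi.backward()
pi_optimizer.step()
```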
-
When I try to use a 4-GPU machine to run the analytic policy gradients (APG) training in parallel, it reports an AssertionError at `brax/training/agents/apg/train.py` line 255. It seems that this is because `t…
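Not a fix, but a quick way to see what the training loop is sharding over: many brax training loops `pmap` across all visible accelerators, so batch-like sizes such as the number of environments typically have to divide evenly by the device count (an assumption about what that assert at line 255 checks; verify against the file):
```py
import jax

# On a 4-GPU machine this prints 4; brax shards batches across these
# devices, so sizes that get split per-device must be divisible by it
# (assumption about the failing assertion).
print(jax.local_device_count())
print(jax.device_count())
```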
-
Hi @yanji84,
first of all, compliments on your code; the clear structure makes it easy to understand. However, I think there are two issues with how you compute the policy gradients in the `backward…
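Without the full method visible here it's hard to say what the two issues are, but for reference, here is a minimal NumPy sketch of the textbook REINFORCE gradient for a linear softmax policy (the names `W`, `states`, `actions`, `returns` are mine, not from the repo under discussion):
```py
import numpy as np

def softmax(z):
    z = z - z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def reinforce_grad(W, states, actions, returns):
    """Monte-Carlo policy gradient for a linear softmax policy.

    W       : (num_actions, obs_dim) weight matrix
    states  : iterable of obs_dim observation vectors
    actions : iterable of sampled action indices
    returns : iterable of returns-to-go G_t
    """
    grad = np.zeros_like(W)
    for s, a, G in zip(states, actions, returns):
        probs = softmax(W @ s)
        dlogp = -probs           # d log pi / d logits = onehot(a) - probs
        dlogp[a] += 1.0
        grad += G * np.outer(dlogp, s)  # chain rule through logits = W @ s
    return grad                  # ascend this to increase expected return
```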
-
The pricing policy has parameters $\theta$, and our goal is to optimize the simulation so that it produces maximum profit.
To do so, we need to calculate the gradient of the objective function (profit) w.r…
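If the simulator itself is not differentiable, the standard workaround is the likelihood-ratio (score-function) identity. A sketch, writing $J(\theta)$ for expected profit over trajectories $\tau$ sampled from the pricing policy (my notation, not the original post's):

$$\nabla_\theta J(\theta) = \nabla_\theta\, \mathbb{E}_{\tau \sim \pi_\theta}\!\big[\mathrm{profit}(\tau)\big] = \mathbb{E}_{\tau \sim \pi_\theta}\!\big[\mathrm{profit}(\tau)\, \nabla_\theta \log p_\theta(\tau)\big]$$

This can be estimated by averaging $\mathrm{profit}(\tau)\,\nabla_\theta \log p_\theta(\tau)$ over simulated rollouts, so no gradient ever needs to flow through the simulator itself.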