-
Schulman, John, et al. "Proximal Policy Optimization Algorithms." 2017.
https://arxiv.org/abs/1707.06347
-
Hello,
I have a subclassed blender agent:
```python
class Blender(TransformerGeneratorAgent):
    pass  # subclass body omitted here
```
I first generated a bunch of sentences: `["hey, how are you?", "how's it going..?"]`
**H…
-
### What is the problem?
I have used Tune to optimize and train PPO with a parametric head, similar to the example seen [here](https://github.com/ray-project/ray/blob/master/rllib/examples/parame…
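For context, the kind of Tune-driven PPO run I mean looks roughly like the sketch below. It is illustrative only: in the real parametric-action setup you would first register a custom model (e.g. via `ModelCatalog.register_custom_model`) and point `"custom_model"` at it; here plain CartPole is used so the sketch runs as-is.
```python
# Rough sketch: driving RLlib's PPO trainer through Tune (not my actual config).
import ray
from ray import tune

ray.init()
tune.run(
    "PPO",
    stop={"training_iteration": 10},
    config={
        "env": "CartPole-v1",   # placeholder env; the real run uses a parametric-action env
        "num_workers": 1,
    },
)
```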
-
```python
import gym
import numpy as np
import tensorflow as tf

class Memory(object):
    def __init__(self):
        self.ep_obs, self.ep_act, self.ep_rwd, self.ep_neglogp = [], [], [], []…
```
-
A gentle request for a TF-Agents implementation of a modified PPO with an exploration bonus - for testing on Montezuma's Revenge.
Paper: [Exploration by Random Network Distillation](https://arxiv.o…
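For reference, the core of the RND bonus is small: a fixed, randomly initialised target network plus a trained predictor network, with the predictor's error on an observation used as the intrinsic reward added to the PPO return. A minimal TensorFlow sketch (illustrative only, not a TF-Agents API):
```python
import tensorflow as tf

def make_net(out_dim=64):
    return tf.keras.Sequential([
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dense(out_dim),
    ])

target_net = make_net()      # frozen random target, never trained
predictor_net = make_net()   # trained to match the target's features
optimizer = tf.keras.optimizers.Adam(1e-4)

def intrinsic_reward(obs):
    # per-observation prediction error = exploration bonus
    return tf.reduce_mean(
        tf.square(predictor_net(obs) - tf.stop_gradient(target_net(obs))), axis=-1
    )

def train_predictor(obs):
    with tf.GradientTape() as tape:
        loss = tf.reduce_mean(intrinsic_reward(obs))
    grads = tape.gradient(loss, predictor_net.trainable_variables)
    optimizer.apply_gradients(zip(grads, predictor_net.trainable_variables))
    return loss
```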
-
In the reinforcement learning module, we already have a value-based implementation, which involves methods like q_learning and the greedy policy. We could now move on to policy optimization. Below are a …
-
**Describe the bug**
When I entered the command `cellfinder_train -y D:\AM_tiff\output\points\training.yml -o D:\AM_tiff\output\points`,
I got the error `'cellfinder_train' is not recognized as an …
-
In your implementation of the PPO loss, do you not need to collapse both `prob` and `old_prob` down to a single scalar per row, instead of a vector with a single non-zero entry? Otherwise, it seems th…
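The collapse I have in mind would look something like this (an illustrative helper, assuming `y_true` is a one-hot action mask and `old_prediction` is the old policy's output; not code from the repo):
```python
from keras import backend as K

def prob_ratio(y_true, y_pred, old_prediction):
    prob = K.sum(y_true * y_pred, axis=-1)              # scalar pi(a|s) per row
    old_prob = K.sum(y_true * old_prediction, axis=-1)  # scalar pi_old(a|s) per row
    return prob / (old_prob + 1e-10)
```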
-
```python
def proximal_policy_optimization_loss_continuous(advantage, old_prediction):
    def loss(y_true, y_pred):
        var = K.square(NOISE)
        pi = 3.1415926
        denom = K.sqrt(2 * pi *…
```
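The snippet is cut off above; for orientation, a generic Gaussian-policy clipped PPO loss in this Keras style typically looks like the following (an illustrative sketch under assumed hyperparameters, not the missing code):
```python
import numpy as np
from keras import backend as K

NOISE = 1.0          # assumed fixed action std-dev
CLIP_EPSILON = 0.2   # assumed PPO clipping range

def ppo_loss_continuous(advantage, old_prediction):
    def loss(y_true, y_pred):
        var = K.square(NOISE)
        denom = K.sqrt(2 * np.pi * var)
        # Gaussian likelihood of the taken action under the new and old policy means
        prob = K.exp(-K.square(y_true - y_pred) / (2 * var)) / denom
        old_prob = K.exp(-K.square(y_true - old_prediction) / (2 * var)) / denom
        ratio = prob / (old_prob + 1e-10)
        clipped = K.clip(ratio, 1 - CLIP_EPSILON, 1 + CLIP_EPSILON)
        return -K.mean(K.minimum(ratio * advantage, clipped * advantage))
    return loss
```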
-
Using %%excerpt%% as the default description on a custom post type doesn't return the generated value. In the editor & on the WP frontend it works.
![image](https://user-images.githubusercontent.com/2171273/806518…