-
- [ ] I have marked all applicable categories:
+ [ ] exception-raising bug
+ [ ] RL algorithm bug
+ [ ] documentation request (i.e. "X is missing from the documentation.")
+ [ ] ne…
-
Hi,
I am having trouble finding examples that use a learning rate schedule with the PPO2 algorithm, although it seems to be possible:
https://github.com/hill-a/stable-baselines/blob/4a5f8d886953a94e7b0a…
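A minimal sketch of what such a schedule could look like, assuming (as the linked code suggests) that PPO2 accepts a callable for `learning_rate` and calls it with the remaining training progress, a fraction that goes from 1 at the start to 0 at the end. The helper name `linear_schedule` is hypothetical and not part of stable-baselines:

```python
def linear_schedule(initial_value):
    """Return a callable mapping remaining progress (1 -> 0) to a learning rate.

    Hypothetical helper: the name and the linear shape are illustrative choices.
    """
    initial_value = float(initial_value)

    def schedule(progress):
        # progress is assumed to be 1.0 at the start of training and 0.0 at the end
        return progress * initial_value

    return schedule


lr_fn = linear_schedule(2.5e-4)
print(lr_fn(1.0), lr_fn(0.5), lr_fn(0.0))

# Assumed usage, not verified against a specific stable-baselines version:
# model = PPO2("MlpPolicy", env, learning_rate=linear_schedule(2.5e-4))
```

If the installed version only accepts a float, the same callable can still be used to compute the value passed in at each training stage.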
-
import gym
import numpy as np
import tensorflow as tf
class Memory(object):
    def __init__(self):
        self.ep_obs, self.ep_act, self.ep_rwd, self.ep_neglogp = [], [], [], []…
-
@jachiam Hi! It's me again! Two days ago I posted an issue about using multiple CPUs with ExperimentGrid, which seems to produce the wrong log only when run in PyCharm but works fine in a terminal. I did some more experim…
-
# Problem description
The translated code does not work when eager execution (the default in TF2) is enabled. It behaves similarly to the PyTorch code. I will therefore need to compare the tw…
-
[07-08 00:22:31 MainThread @logger.py:224] Argv: D:/Envs/SmartCar/DDPG/train.py
C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\importlib\_bootstrap.py:219: RuntimeWarning: numpy.uf…
-
Hi, I am participating in [this challenge](https://www.aicrowd.com/challenges/neurips-2020-procgen-competition), which requires using `ray[rllib]==0.8.6`.
I've implemented an algorithm and it works wit…
-
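Hello, I would like to design other methods on top of PARL. The model already contains an action network and a critic network.
I would also like to add a predict network, but after adding it, an error is raised when the function that copies parameters between the target network and the current network is executed.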
![image](https://user-images.githubusercontent.com/46389180/88387895-09def400-cde6-11ea-8bbe-43303f…
-
In the original IMPALA paper, the state-value estimate and the action are outputs of the same network, and that network is updated with the sum of three losses, which is unusual in the actor-criti…
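
To make the "sum of three losses" concrete, here is a rough single-sample sketch, assuming the three terms are a policy-gradient loss, a baseline (value) loss, and an entropy term. The function names and the coefficients `baseline_cost` and `entropy_cost` are illustrative assumptions, and the V-trace correction is omitted:

```python
import math


def softmax(logits):
    # Numerically stable softmax over a list of floats
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combined_loss(logits, action, advantage, value, value_target,
                  baseline_cost=0.5, entropy_cost=0.01):
    """Sum of three losses for a network with shared policy and value outputs.

    Coefficients are illustrative assumptions, not values from the paper.
    """
    probs = softmax(logits)
    # Policy-gradient term: -log pi(a|s) * advantage (advantage treated as a constant)
    pg_loss = -math.log(probs[action]) * advantage
    # Baseline term: squared error between the value estimate and its target
    value_loss = 0.5 * (value_target - value) ** 2
    # Entropy term: negative entropy, so minimizing it encourages exploration
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    entropy_loss = -entropy
    return pg_loss + baseline_cost * value_loss + entropy_cost * entropy_loss


print(combined_loss([0.0, 0.0], action=0, advantage=1.0, value=0.0, value_target=1.0))
```

Because both heads share parameters, a single backward pass through this sum updates the whole network, whereas a setup with separate actor and critic networks would optimize the policy and value losses independently.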