-
Hello!
There is a limit on the problem size above which a crash (segmentation fault) occurs while it is trying to solve. I created a (not so) minimal example, but it is not the only one that cau…
-
hi
My reward stays at 300 after 1M (100w) steps whereas yours increases almost linearly. My actor loss is between -4.5 and -0.5 while my critic loss is between 0 and 0.06, which is much smaller …
-
**Describe the bug**
Loading a DDPG agent that was trained with normalized observations or normalized returns does not work. The loaded agent does not have the correct critic or the correct policy. Thi…
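For context, the usual workaround is to persist the normalization statistics alongside the model and restore both at load time; below is a minimal sketch, assuming stable-baselines' VecNormalize with the `save`/`load` helpers available in recent 2.x releases (method names differ in older versions, and the file names are placeholders):
```python
# Sketch: save/restore VecNormalize statistics together with the DDPG model.
import gym
from stable_baselines import DDPG
from stable_baselines.common.vec_env import DummyVecEnv, VecNormalize

env = VecNormalize(DummyVecEnv([lambda: gym.make("Pendulum-v0")]))
model = DDPG("MlpPolicy", env, verbose=0)
model.learn(total_timesteps=1000)

model.save("ddpg_pendulum")
env.save("vec_normalize.pkl")  # running obs/return statistics

# Later, in a fresh process:
eval_env = DummyVecEnv([lambda: gym.make("Pendulum-v0")])
eval_env = VecNormalize.load("vec_normalize.pkl", eval_env)
eval_env.training = False      # freeze statistics at evaluation time
eval_env.norm_reward = False
model = DDPG.load("ddpg_pendulum", env=eval_env)
```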
-
**Describe the bug**
ModuleNotFoundError: No module named 'stable_baselines.ddpg.memory', when loading the DDPG Pendulum-v0 agent
**Code example**
```python
from utils import ALGOS
folder = "trained_agen…
-
When I try to run the example DDPG test and eval scripts, I get the following issue:
`module 'tensorflow' has no attribute 'enable_resource_variables'`
If I change this line to:
`tf.compat.v1.enable…`
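For what it's worth, a small sketch of a version-agnostic guard for this call (the exact call site inside the example script is an assumption on my part):
```python
# Sketch of a TF1/TF2-compatible guard around enable_resource_variables.
import tensorflow as tf

if hasattr(tf, "enable_resource_variables"):
    tf.enable_resource_variables()            # TF 1.x exposes it at top level
else:
    tf.compat.v1.enable_resource_variables()  # TF 2.x only has it under compat.v1
```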
-
Hi, I've been working with SAC up to now and it works fine with my custom environment. However, when I wanted to test some other algorithms, I ran into the following error, which I don't really have any ide…
-
I'd like to implement Hindsight Experience Replay (HER). This can be built on top of any goal-parameterized off-policy RL algorithm.
**Goal-parameterized architectures**: this requires a variable for…
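To illustrate the goal-parameterized idea (the dict-observation layout and key names below are assumptions, not the proposal's actual interface): the policy and value function take the goal as an extra input, usually by concatenating it to the observation, and HER then relabels stored transitions with goals achieved later in the episode.
```python
# Minimal sketch of a goal-conditioned input, assuming gym-style dict observations.
import numpy as np

def goal_conditioned_input(obs_dict):
    """Concatenate the raw observation with the desired goal so any off-policy
    algorithm (DDPG, SAC, TD3, DQN, ...) can condition on it."""
    return np.concatenate(
        [obs_dict["observation"], obs_dict["desired_goal"]], axis=-1
    )
```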
-
I'm training a PGTrainer with PyTorch and tune.run, using the following command:
```python
tune.run(pg.PGTrainer,
         local_dir=".", stop={"episode_reward_mean": 0.5},
         resourc…
```
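For comparison, here is a runnable sketch of a similar call; the environment, config values, and the `ray.rllib.agents` import path are assumptions (and note that RLlib trainers usually take their resources from the config, e.g. `num_workers`/`num_gpus`, rather than from `resources_per_trial`):
```python
# Sketch of a torch-backed PG run via tune.run; env and settings are assumptions.
import ray
from ray import tune
from ray.rllib.agents import pg

ray.init()
tune.run(
    pg.PGTrainer,
    local_dir=".",
    stop={"episode_reward_mean": 0.5},
    config={
        "env": "CartPole-v0",
        "framework": "torch",  # older RLlib versions use "use_pytorch": True instead
        "num_workers": 1,
    },
)
```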
-
I wanted to propose adding a pendulum experiment to bsuite. I think it fits the targeted, simple, challenging, scalable, fast criteria outlined in the bsuite paper. Also, now that https://github.com…
-
Hi,
I find it difficult to figure out working parameters for SAC. Are there some standard examples, e.g. as in the original softlearning environments?
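For what it's worth (not an authoritative answer), the values below are in the range commonly used for SAC on continuous-control tasks; the stable-baselines API shown and the exact numbers are assumptions to be tuned per environment:
```python
# Sketch of a SAC setup with commonly used hyperparameter ranges; adjust per task.
import gym
from stable_baselines import SAC

env = gym.make("Pendulum-v0")
model = SAC(
    "MlpPolicy", env,
    learning_rate=3e-4,     # typical Adam step size for continuous control
    buffer_size=1_000_000,  # large replay buffer
    batch_size=256,
    tau=0.005,              # soft target-update rate
    gamma=0.99,
    ent_coef="auto",        # automatic entropy temperature tuning
    train_freq=1,
    gradient_steps=1,
    learning_starts=10_000,
    verbose=1,
)
model.learn(total_timesteps=100_000)
```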