-
Dear,
The bug is hard to reproduce because it is caused by numerical issues that occur only when the underlying neural network learns parameters that are too large for a DiagGaussian action distribu…
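One common way such numerical issues are avoided (a hypothetical sketch, not this project's actual code) is to clamp the learned `log_std` into a safe range before exponentiating it into a standard deviation, so the DiagGaussian density can neither overflow nor collapse:

```python
import math

# Common clamp range, e.g. used in several SAC implementations; the exact
# bounds here are illustrative assumptions, not values from this repo.
LOG_STD_MIN, LOG_STD_MAX = -20.0, 2.0

def clamp_log_std(log_std: float) -> float:
    """Clamp log_std into a numerically safe range before exp()."""
    return max(LOG_STD_MIN, min(LOG_STD_MAX, log_std))

# A huge learned log_std would make exp() overflow; clamped it stays finite.
print(clamp_log_std(50.0))            # 2.0
print(math.exp(clamp_log_std(50.0)))  # std stays finite
```

With the clamp in place, even a network that drifts toward extreme parameter values keeps the action distribution well-defined.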
-
This is definitely a nice-to-have -- it'd be useful if, as part of snapshots, we could have the option to record the policy being rolled out deterministically for a few rollouts, so that we can watch i…
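The requested feature could be sketched roughly as follows (all names here are hypothetical, and a minimal stand-in env/policy is used so the sketch runs on its own): at snapshot time, roll the policy out with its deterministic (mean) action and keep the trajectory for later viewing.

```python
def record_deterministic_rollout(env, policy, max_steps=200):
    """Run one rollout taking the policy's deterministic action each step."""
    frames = []
    obs = env.reset()
    for _ in range(max_steps):
        action = policy(obs, deterministic=True)  # mean action, no sampling
        obs, reward, done = env.step(action)
        frames.append((obs, action, reward))
        if done:
            break
    return frames

# Minimal stand-in env/policy so the sketch is self-contained:
class CountEnv:
    def reset(self):
        self.t = 0
        return 0
    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= 5

policy = lambda obs, deterministic=True: 0
frames = record_deterministic_rollout(CountEnv(), policy)
print(len(frames))  # 5
```

In a real snapshot hook, `frames` would instead hold rendered images written out as a video next to the checkpoint.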
-
Out of curiosity, today I looked into hyper-parameter tuning. For this, the Optuna package seems like the way to go. In my first pass, I adapted some of the code that stable-baselines used for tuning i…
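The core loop Optuna automates looks roughly like this; the sketch below is a stdlib stand-in (the function names and search ranges are illustrative assumptions, not Optuna's API): each trial suggests hyperparameters, an objective scores them, and the best trial is kept.

```python
import math
import random

def suggest_params(rng):
    """Suggest one hyperparameter set (log-uniform lr, uniform gamma)."""
    return {
        "learning_rate": 10 ** rng.uniform(-5, -2),
        "gamma": rng.uniform(0.9, 0.9999),
    }

def objective(params):
    # Placeholder for "train the agent, return mean episode reward".
    # Here: a toy score that peaks near lr=1e-3, gamma=0.99.
    return (-((math.log10(params["learning_rate"]) + 3) ** 2)
            - (params["gamma"] - 0.99) ** 2)

rng = random.Random(0)
best = max((suggest_params(rng) for _ in range(50)), key=objective)
print(best)
```

Optuna replaces the random search above with smarter samplers (TPE by default) and adds pruning of unpromising trials, which matters when each trial is a full training run.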
-
Hello,
I've tried in vain to find suitable hyperparameters for SAC in order to solve MountainCarContinuous-v0.
Even with hyperparameter tuning (see "add-trpo" branch of [rl baselines zoo](https:…
-
I am curious where the `calculate_gaussian_log_prob(log_std, noise)` function in your utils.py came from. It doesn't look like the stable-baselines or PyTorch log PDF of the normal distribution. So what…
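One plausible reading (a hypothetical reconstruction, not the repo's actual code): if the sample was produced via the reparameterization trick, `x = mean + exp(log_std) * noise`, then substituting `x` into the normal log-pdf makes the mean cancel, leaving a formula in `log_std` and `noise` alone, per dimension: `log p(x) = -0.5 * noise**2 - log_std - 0.5 * log(2*pi)`. A sketch with a cross-check against the direct log-pdf:

```python
import math

def calculate_gaussian_log_prob(log_std, noise):
    """Gaussian log-prob of x = mean + exp(log_std)*noise, summed over dims."""
    return sum(
        -0.5 * n * n - ls - 0.5 * math.log(2.0 * math.pi)
        for ls, n in zip(log_std, noise)
    )

def normal_log_pdf(x, mean, std):
    """Direct log-pdf of N(mean, std**2) for comparison."""
    return (-0.5 * ((x - mean) / std) ** 2
            - math.log(std) - 0.5 * math.log(2.0 * math.pi))

# Cross-check for one dimension with mean = 1.0:
log_std, noise = [0.3], [0.7]
x = 1.0 + math.exp(0.3) * 0.7
print(calculate_gaussian_log_prob(log_std, noise))
print(normal_log_pdf(x, 1.0, math.exp(0.3)))  # same value
```

So the function may simply be the standard normal log-pdf rewritten in terms of the sampled noise, which avoids recomputing `(x - mean) / std`.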
-
Implement the main agent with Trust Region Policy Optimization (TRPO, see [Link](https://arxiv.org/abs/1502.05477))
- [x] Set up InvertedPendulum environment in OpenAI Gym
- [x] Set up neural net an…
-
In a first test, I created a new venv, then ran `pip install -r requirements.txt`,
then `python ./train_dqn_agent.py`.
I get:
`ModuleNotFoundError: No module named 'tensorflow.contrib'`
Does this mean only…
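For context: `tensorflow.contrib` was removed in TensorFlow 2.0, so this error typically means the installed TensorFlow is 2.x while the code targets the 1.x API. Assuming the project indeed requires TF 1.x, a requirements pin like the following would avoid it (the exact lower bound is an assumption):

```
# requirements.txt pin for code that imports tensorflow.contrib (TF 1.x only)
tensorflow>=1.14,<2.0
```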
-
Hi, I installed the plugin, including the code environment, successfully. However, when I run it I get the following error:
[14:39:43] [INFO] [dku.utils] - *************** Recipe code failed **************
[14…
-
Take rllib / stable baselines / another library with algorithms and add code to the repo that runs some PPO or A2C on the CartPole environment from gym. The next step is to check how, in this librar…
-
**Describe the question**
As far as I understand, when using a GPU, `SubprocVecEnv` runs multiple workers, each running its own environment on a GPU, and then updates the model when it has gathered a…