-
**Describe the bug**
In the CustomCallback, computing the mean reward raises a NumPy error:
> TypeError: unsupported operand type(s) for /: 'str' and 'int'
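
A minimal sketch of one way this message can arise, assuming the rewards were read back as text rather than numbers. The `rewards_as_text` values are hypothetical and not taken from this report. Adding strings concatenates them, and dividing the result by a count then fails:

```python
# Hypothetical reward values that were loaded as strings (an assumption).
rewards_as_text = ["1.5", "2.5"]

# Summing strings concatenates them instead of adding numbers.
total = rewards_as_text[0] + rewards_as_text[1]  # "1.52.5"

try:
    total / len(rewards_as_text)
    error_message = None
except TypeError as exc:
    # "unsupported operand type(s) for /: 'str' and 'int'"
    error_message = str(exc)

# Converting to float before averaging gives a well-defined mean.
rewards = [float(r) for r in rewards_as_text]
mean_reward = sum(rewards) / len(rewards)

print(error_message)
print(mean_reward)
```

If this is the cause, checking the dtype of the reward values before averaging would confirm it.
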
The values are:
```python
x, y = t…
-
Hi, thank you for your patience.
How do I generate this file?
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/tensorflow1/lib/python3.7/runpy.py", line 193, in _run_module_as_mai…
-
Increase the training iterations: Train the PPO model for more iterations, as the model might not have converged yet.
Adjust the PPO hyperparameters: Experiment with different hyperparameters such …
-
[Self Imitation Learning](https://arxiv.org/abs/1806.05635)
@emrul has implemented SAIL, see https://github.com/Stable-Baselines-Team/stable-baselines3-contrib/pull/139#issuecomment-1445114579
@em…
-
It seems to me that when HER samples an achieved goal from the replay buffer it never samples the very last state of the episode. Is this intended?
As a consequence, the sampling strategy "final" …
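
The concern can be illustrated with a toy episode. This is only a sketch of the indexing question, not the library's sampling code; `final_goal` and the goal ids are hypothetical. An episode with `T` transitions visits `T + 1` states, so a "final" strategy that indexes the per-transition observations rather than the next observations stops one state short:

```python
import numpy as np

# Hypothetical toy episode: T transitions produce T + 1 achieved goals
# (one for the initial state, one after each step).
T = 4
achieved_goals = np.arange(T + 1)  # goal ids 0..T; id T is the true last state

# Transition t stores obs[t] and next_obs[t] = obs[t + 1].
obs_goals = achieved_goals[:-1]      # goals before each step: 0..T-1
next_obs_goals = achieved_goals[1:]  # goals after each step: 1..T


def final_goal(goals_per_transition: np.ndarray) -> int:
    """'final' strategy: take the goal attached to the last transition."""
    return int(goals_per_transition[-1])


goal_from_obs = final_goal(obs_goals)            # stops one state short
goal_from_next_obs = final_goal(next_obs_goals)  # reaches the last state

print(goal_from_obs, goal_from_next_obs)  # 3 4
```
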
-
While running TRPO training, it crashes after a random amount of time (anywhere from 15 s to 1 min) with the following:
`Traceback (most recent call last):
File "callback.py", line 196, in
model.lea…
-
Here's an example of the intermittent printout from DDPG:
```
--------------------------------------
| reference_Q_mean | 49.8 |
| reference_Q_std | 6.61 |
| reference_action_m…
-
**Describe the bug**
https://stable-baselines.readthedocs.io/en/master/modules/gail.html states that GAIL supports MultiDiscrete obs space, but https://github.com/hill-a/stable-baselines/blob/maste…
-
TensorFlow takes minutes to import on a Raspberry Pi Zero W and that's probably because of the huge .so file with native primitives it has to load, among other things. Given the nature of the project,…
-
I'm trying to resume model training and I'm getting some strange results. I'm using SubprocVecEnv and VecNormalize on a custom environment:
```
from stable_baselines.common.policies import MlpPoli…