-
The TD3 loss according to OpenAI Spinning Up is:
![L1](https://spinningup.openai.com/en/latest/_images/math/7d5c18f49a242cc3eec554f717fe4f3bfc119bab.svg)
![L2](https://spinningup.openai.com/en/lat…
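
For readers who cannot load the images, the following is a sketch of the TD3 critic loss in the notation Spinning Up normally uses. It is written from the standard TD3 formulation and is an assumption about what the linked images show, so the exact symbols there may differ.

$$
a'(s') = \operatorname{clip}\!\big(\mu_{\theta_{\text{targ}}}(s') + \operatorname{clip}(\epsilon, -c, c),\; a_{\text{Low}},\; a_{\text{High}}\big), \qquad \epsilon \sim \mathcal{N}(0, \sigma)
$$

$$
y(r, s', d) = r + \gamma\,(1 - d)\,\min_{i=1,2} Q_{\phi_{i,\text{targ}}}\big(s', a'(s')\big)
$$

$$
L(\phi_i, \mathcal{D}) = \mathop{\mathbb{E}}_{(s,a,r,s',d)\sim\mathcal{D}}\Big[\big(Q_{\phi_i}(s,a) - y(r,s',d)\big)^2\Big], \qquad i = 1, 2
$$

Both critics regress toward the same target $y$, which uses the smaller of the two target Q-values.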
-
- [X] I have marked all applicable categories:
+ [X] exception-raising bug
+ [X] RL algorithm bug
+ [ ] documentation request (i.e. "X is missing from the documentation.")
+ [ ] ne…
-
Good day. I'm now trying the zoo on a custom environment, and I have a couple of questions.
- There are many trials that finished with the exact same value, and there's more than 1 instance of tha…
-
This is rather minor, but polyak averaging in DQN/SAC/TD3 could be done faster with far fewer intermediate tensors using `torch.addcmul_` (https://pytorch.org/docs/stable/torch.html#torch.addcmul).
m-rph updated
4 years ago
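
To illustrate the idea in the issue above, here is a minimal sketch of a Polyak (soft) target update done with in-place tensor operations. It is not the library's implementation: the function names `polyak_update_naive` and `polyak_update_inplace` are hypothetical, and the sketch swaps in `Tensor.mul_` followed by `Tensor.add_(..., alpha=...)` instead of the `torch.addcmul_` mentioned in the issue, because `addcmul` multiplies two tensors element-wise while the update here only needs scalar weights.

```python
import torch
from torch import nn


def polyak_update_naive(params, target_params, tau):
    """Reference version: builds temporary tensors for every parameter."""
    with torch.no_grad():
        for p, tp in zip(params, target_params):
            # tau * p and (1 - tau) * tp each allocate a new tensor.
            tp.copy_(tau * p + (1.0 - tau) * tp)


def polyak_update_inplace(params, target_params, tau):
    """In-place version: target <- (1 - tau) * target + tau * param."""
    with torch.no_grad():
        for p, tp in zip(params, target_params):
            tp.mul_(1.0 - tau)     # scale the target in place
            tp.add_(p, alpha=tau)  # add tau * p in place, no temporary


if __name__ == "__main__":
    torch.manual_seed(0)
    online = nn.Linear(4, 2)
    target = nn.Linear(4, 2)
    polyak_update_inplace(online.parameters(), target.parameters(), tau=0.005)
```

Both functions compute the same update; the in-place variant only avoids the intermediate allocations. `Tensor.lerp_(p, tau)` would be another way to express the same interpolation in a single call.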
-
Hello,
I'm using TD3 for training a MLP policy in a custom environment and I would like to know what is the appropriate way to remove the bias from the neural network model, since I would like to h…
-
After cloning rl-baselines3-zoo, I tried to train my own agent with:
**python train.py --algo algo_name --env env_id**
After that, I used
**python enjoy.py --algo td3 --env AntBulletEnv-v…
-
**[OpenAI Baselines](https://github.com/openai/baselines)** is a set of high-quality implementations of reinforcement learning algorithms. These algorithms make it easier for the research community to…
-
After cloning rl-baselines3-zoo, I tried to train my own agent with:
**python train.py --algo algo_name --env env_id**
After that, I used
**python enjoy.py --algo algo_name --env env_id…
-
Hi, I am using the following code to resume an interrupted training run using TD3.load(). However, training is much slower than before, even though the environment is the same. Each episode (with th…
-
First: I'm very happy to see the new PyTorch SB3 version! Great job!
My question is whether pretraining support is planned for SB3 (like for SB: https://stable-baselines.readthedocs.io/en/master/g…