-
Hello,
I'm trying to create a strong Mancala bot. I chose Q-learning:
```python
# Let's do independent Q-learning in Mancala, and play it against random.
# RL is based on python/examples/independent_tabular…
```
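For readers with the same question, here is a minimal sketch of what such a setup typically looks like with OpenSpiel's tabular Q-learner playing against a random opponent; it assumes the "mancala" game is registered in your OpenSpiel build and is not the poster's full script:

```python
# Minimal sketch: independent tabular Q-learning vs. a random opponent in
# OpenSpiel, following the pattern of the independent tabular Q-learning example.
from open_spiel.python import rl_environment
from open_spiel.python.algorithms import random_agent, tabular_qlearner

env = rl_environment.Environment("mancala")  # assumes "mancala" is available
num_actions = env.action_spec()["num_actions"]

# Q-learning agent as player 0, random opponent as player 1.
agents = [
    tabular_qlearner.QLearner(player_id=0, num_actions=num_actions),
    random_agent.RandomAgent(player_id=1, num_actions=num_actions),
]

for episode in range(10_000):
    time_step = env.reset()
    while not time_step.last():
        player = time_step.observations["current_player"]
        agent_output = agents[player].step(time_step)
        time_step = env.step([agent_output.action])
    # Let both agents observe the terminal step so the learner can update.
    for agent in agents:
        agent.step(time_step)
```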
-
First of all: Sorry if this doesn't belong here. I'll post this on the stable-baselines3 github if so.
Hello, I'm a beginner and I'm facing a problem where I can't load the saved DQN model. I trai…
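For reference, the usual stable-baselines3 save/load round trip looks roughly like the sketch below; the environment and file name are placeholders, not taken from the post:

```python
# Minimal save/load sketch with stable-baselines3 DQN; environment and file
# name are illustrative placeholders.
import gymnasium as gym
from stable_baselines3 import DQN

env = gym.make("CartPole-v1")
model = DQN("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=1_000)
model.save("dqn_cartpole")                  # writes dqn_cartpole.zip

del model                                   # drop the trained instance
model = DQN.load("dqn_cartpole", env=env)   # reload for further use
```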
-
## Motivation
It is hard to follow and understand the examples and tutorials. As an example, if I compare the two flavors of CartPole PyTorch code, the one from PyTorch's pytorch/tutorial is far eas…
-
Hello. I'm running this code and I have some questions.
################################################################################
AutoPentest-DRL: Automated Penetration Testing Using Deep Re…
-
The standard protocol in the DQN Atari Nature paper is to bin the rewards rather than actually clip them, although it is typically called "reward clipping".
See https://github.com/openai/baselines/blob/ea25b9e…
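A short sketch of the distinction follows; the wrapper name here is illustrative, while openai/baselines implements the same idea in its `ClipRewardEnv` by taking the sign of the reward:

```python
# Illustrative only: bin rewards to {-1, 0, +1} by sign, as in the Nature DQN
# protocol, rather than clipping them to the interval [-1, 1].
import numpy as np
import gymnasium as gym

class SignRewardWrapper(gym.RewardWrapper):
    """Hypothetical wrapper: bins each reward to {-1, 0, +1} by its sign."""
    def reward(self, reward):
        return float(np.sign(reward))

# A true clip would instead be np.clip(reward, -1.0, 1.0): a reward of 0.5
# stays 0.5 when clipped, but becomes 1.0 when binned by sign.
env = SignRewardWrapper(gym.make("CartPole-v1"))
```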
-
### The Error Message:
```
logger.warn(f"{pre} is not within the observation space.")
Traceback (most recent call last):
  File "E:\PongDQN_RL\02_dqn_pong.py", line 141, in
    agent = Agent(env, b…
```
-
### 🐛 Bug
Whenever a custom environment with a discrete action space that does not start at 0 is used, DQN crashes due to an index error.
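For context, gymnasium lets a `Discrete` space start at a non-zero value via its `start` argument; a hypothetical minimal illustration of the offending setup (the reporter's own, truncated example follows below) would be:

```python
# Hypothetical illustration of the condition described above: a Discrete
# action space starting at 1 instead of 0, which is what reportedly trips
# up DQN's action indexing.
from gymnasium import spaces

action_space = spaces.Discrete(3, start=1)  # valid actions are {1, 2, 3}
print(action_space.sample())                # never returns 0
```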
### Code example
```python
import gymnasium as gym
import nu…
-
Hello,
I'm working on a game implementation and want to train an agent for the game with the AlphaZero approach.
I managed to compile and run tests with the workaround described at https://github.com/…
-
### 🐛 Bug
### To Reproduce
```python
import gymnasium as gym
from sbx import DDPG, DQN, PPO, SAC, TD3, TQC, DroQ
env = gym.make("Pendulum-v1")
model = TQC("MlpPolicy", env, verbose…
-
### 🐛 Bug
The model training never uses GPU VRAM and instead relies only on the CPU, which slows down training. I did set the device to `cuda` in the DQN model training.
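A minimal sketch of how the device is usually passed and verified in stable-baselines3 (the environment name is a placeholder; the reporter's own, truncated example follows below):

```python
# Illustrative only: set the device explicitly and check where the policy
# actually lives before training.
import gymnasium as gym
import torch
from stable_baselines3 import DQN

env = gym.make("CartPole-v1")
model = DQN("MlpPolicy", env, device="cuda", verbose=1)

print(torch.cuda.is_available())  # False here would explain CPU-only training
print(model.policy.device)        # expected to report a CUDA device when the GPU is used
model.learn(total_timesteps=1_000)
```

Note that with a small `MlpPolicy` and non-image observations, GPU utilization can stay low even when the policy is on CUDA, since much of the time is spent stepping the environment on the CPU.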
### Code example
```p…