Stable-Baselines-Team/stable-baselines3-contrib
Contrib package for Stable-Baselines3 - Experimental reinforcement learning (RL) code
https://sb3-contrib.readthedocs.io
MIT License
465 stars
173 forks
issues
Fix QRDQN loading `target_update_interval`
#259
jak3122
closed
15 hours ago
0
[Bug]: loading QRDQN changes target_update_interval
#258
jak3122
closed
15 hours ago
0
[Question] Why can't directly use the PPO (RecurrentActorCriticPolicy, "CartPole - v1", verbose = 1)
#257
dajianer
opened
1 month ago
1
[Bug]: Is sb3_contrib/common/maskable/utils.py the cause of "WARN: env.action_masks to get variables from other wrappers is deprecated and will be removed in v1.0"?
#256
mkbg8
opened
1 month ago
1
Fix warning when loading a `RecurrentPPO` model
#255
araffin
closed
1 month ago
0
[Bug]: FutureWarning: You are using `torch.load` with `weights_only=False`
#254
drulye
closed
1 month ago
3
[Feature Request] same random seed for every env in AsyncEval
#253
1-Bart-1
opened
2 months ago
1
Update QR-DQN optimizer to only use q_net parameters
#252
corentinlger
closed
2 months ago
1
Update SB3 and remove gSDE resampling
#251
araffin
closed
3 months ago
0
[Question] Masked actions PPO in multiagent setting using PettingZoo
#250
MarcoPicione
opened
3 months ago
0
[Question] Apply Masking using ActionMasker on composite actions
#249
mwalidcharrwi
closed
3 months ago
4
[Question] How to do pre-training on the RecurrentPPO MlpLstmPolicy
#248
iwishiwasaneagle
opened
4 months ago
0
MaskablePPO Masking Doesn't Work with Big Action Space
#247
orkunkn
closed
4 months ago
1
RecurrentActorCriticPolicy Behaviour Not Clear
#246
pasinit
opened
4 months ago
1
TQC: ep_len_mean and ep_rew_mean do not match real values
#245
btabia
opened
4 months ago
0
ep_len_mean discrepancy
#244
btabia
closed
4 months ago
0
Implemented CrossQ
#243
danielpalen
opened
5 months ago
10
Dependent Actions in MultiDiscrete Action Space
#242
bbarisbaturay
opened
5 months ago
5
[Question] Recurrent Maskable PPO ?!? Rudder ?!?
#241
tty666
closed
5 months ago
1
[Question] What is the difference between old_distribution and distribution in train function of TRPO
#240
0Addicted0
closed
5 months ago
2
[Question] RecurrentPPO: Reset LSTM states early?
#239
phisad
opened
6 months ago
3
[Feature Request] Implement CrossQ
#238
danielpalen
opened
6 months ago
0
Fix typo in changelog
#237
araffin
closed
6 months ago
0
Release v2.3.0
#236
araffin
closed
6 months ago
0
Log success rate for PPO variants
#235
araffin
closed
6 months ago
0
[Question] Why does MaskablePPO not mask with some logic with last observation?
#234
EloyAnguiano
opened
6 months ago
4
Fix PPO maskable type annotations
#233
araffin
closed
6 months ago
0
Update ruff and SB3 dependencies
#232
araffin
closed
6 months ago
0
[Question] Simple way to implement data augmentation when training agent
#231
thomashirtz
closed
7 months ago
2
[Question] LSTM observations
#230
suargi
closed
8 months ago
3
Fix `train_freq` type annotation for TQC and QR-DQN
#229
Armandpl
closed
8 months ago
0
Episodic training with TQC?
#228
Armandpl
closed
8 months ago
2
Add note about MaskableEvalCallback
#227
icheered
closed
8 months ago
0
EvalCallback crashes Maskable PPO without error
#226
icheered
closed
8 months ago
3
Update QRDQN defaults
#225
araffin
closed
8 months ago
0
Implementing "Sibling Rivalry" Method from "Keeping Your Distance: Solving Sparse Reward Tasks Using Self-Balancing Shaped Rewards" Paper
#224
vladyskai
opened
8 months ago
1
[Feature Request] STAC algorithm
#223
EloyAnguiano
opened
9 months ago
4
[Question] how to use "lstm_states" from rollout_buffer to reconstruct LSTM states during training
#222
DeepRowLie
closed
8 months ago
2
[Bug]: producing NAN values during training in MaskablePPO
#221
vahidqo
opened
9 months ago
5
[Feature Request] Expand RNN Options and Algorithm Flexibility
#220
mtnusf97
opened
9 months ago
2
Update `_process_sequence()` docstring
#219
rogerioagjr
closed
10 months ago
0
[Question] Recurrent PPO evaluation
#218
CAI23sbP
closed
10 months ago
2
Release v2.2.1: hotfix file closing
#217
araffin
closed
10 months ago
0
Release v2.2.0
#216
araffin
closed
10 months ago
0
Remove PyType and upgrade to latest SB3 version
#215
araffin
closed
10 months ago
0
Add rollout_buffer_class to TRPO
#214
ernestum
closed
11 months ago
2
Sync SB3 Contrib with SB3
#213
araffin
closed
11 months ago
0
Predicting actions after using MaskablePPO model outputs invalid action
#212
vivek-kumar9696
closed
11 months ago
2
Recurrent PPO Not Training Well on a Very Simple Environment
#211
sreejank
opened
11 months ago
0
Worse training with Vectorized Environment
#210
pklochowicz
closed
8 months ago
0