Stable-Baselines-Team/stable-baselines3-contrib
Contrib package for Stable-Baselines3 - Experimental reinforcement learning (RL) code
https://sb3-contrib.readthedocs.io
MIT License
504 stars
175 forks
issues
[Question] Not updating lstm states during training
#265
abhinavj98
opened
5 days ago
0
Add missing condition in CI
#264
araffin
closed
1 week ago
0
Drop python 3.8, add python 3.12 support
#263
araffin
closed
1 week ago
0
Release v2.4.0
#262
araffin
closed
1 week ago
0
Add support for gymnasium v1.0
#261
araffin
closed
3 weeks ago
0
Update deps for read the doc
#260
araffin
closed
3 weeks ago
0
Fix QRDQN loading `target_update_interval`
#259
jak3122
closed
1 month ago
0
[Bug]: loading QRDQN changes target_update_interval
#258
jak3122
closed
1 month ago
0
[Question] Why can't directly use the PPO (RecurrentActorCriticPolicy, "CartPole - v1", verbose = 1)
#257
dajianer
opened
2 months ago
1
[Bug]: Is sb3_contrib/common/maskable/utils.py the cause of "WARN: env.action_masks to get variables from other wrappers is deprecated and will be removed in v1.0"?
#256
mkbg8
opened
3 months ago
1
Fix warning when loading a `RecurrentPPO` model
#255
araffin
closed
3 months ago
0
[Bug]: FutureWarning: You are using `torch.load` with `weights_only=False`
#254
drulye
closed
3 months ago
3
[Feature Request] same random seed for every env in AsyncEval
#253
1-Bart-1
opened
4 months ago
1
Update QR-DQN optimizer to only use q_net parameters
#252
corentinlger
closed
4 months ago
1
Update SB3 and remove gSDE resampling
#251
araffin
closed
5 months ago
0
[Question] Masked actions PPO in multiagent setting using PettingZoo
#250
MarcoPicione
opened
5 months ago
0
[Question] Apply Masking using ActionMasker on composite actions
#249
mwalidcharrwi
closed
5 months ago
4
[Question] How to do pre-training on the RecurrentPPO MlpLstmPolicy
#248
iwishiwasaneagle
opened
6 months ago
1
MaskablePPO Masking Doesn't Work with Big Action Space
#247
orkunkn
closed
6 months ago
4
RecurrentActorCriticPolicy Behaviour Not Clear
#246
pasinit
opened
6 months ago
1
TQC: ep_len_mean and ep_rew_mean does not match real values
#245
btabia
opened
6 months ago
0
ep_len_mean discrepancy
#244
btabia
closed
6 months ago
0
Implemented CrossQ
#243
danielpalen
closed
1 month ago
11
Dependent Actions in MultiDiscrete Action Space
#242
bbarisbaturay
opened
6 months ago
5
[Question] Recurrent Maskable PPO ?!? Rudder ?!?
#241
tty666
closed
7 months ago
1
[Question] What is the difference between old_distribution and distribution in train function of TRPO
#240
0Addicted0
closed
6 months ago
2
[Question] RecurrentPPO: Reset LSTM states early?
#239
phisad
opened
7 months ago
3
[Feature Request] Implement CrossQ
#238
danielpalen
closed
1 month ago
0
Fix typo in changelog
#237
araffin
closed
7 months ago
0
Release v2.3.0
#236
araffin
closed
8 months ago
0
Log success rate for PPO variants
#235
araffin
closed
8 months ago
0
[Question] Why does MaskablePPO does not mask with some logic with last observation?
#234
EloyAnguiano
opened
8 months ago
4
Fix PPO maskable type annotations
#233
araffin
closed
8 months ago
0
Update ruff and SB3 dependencies
#232
araffin
closed
8 months ago
0
[Question] Simple way to implement data augmentation when training agent
#231
thomashirtz
closed
9 months ago
2
[Question] LSTM observations
#230
suargi
closed
10 months ago
3
Fix `train_freq` type annotation for TQC and QR-DQN
#229
Armandpl
closed
10 months ago
0
Episodic training with TQC?
#228
Armandpl
closed
10 months ago
2
Add note about MaskableEvalCallback
#227
icheered
closed
10 months ago
0
EvalCallback crashes Maskable PPO without error
#226
icheered
closed
10 months ago
3
Update QRDQN defaults
#225
araffin
closed
10 months ago
0
Implementing "Sibling Rivalry" Method from "Keeping Your Distance: Solving Sparse Reward Tasks Using Self-Balancing Shaped Rewards" Paper
#224
vladyskai
opened
10 months ago
1
[Feature Request] STAC algorithm
#223
EloyAnguiano
opened
10 months ago
4
[Question] how to use "lstm_states" from rollout_buffer to reconstruct LSTM states during training
#222
DeepRowLie
closed
10 months ago
2
[Bug]: producing NAN values during training in MaskablePPO
#221
vahidqo
opened
11 months ago
5
[Feature Request] Expand RNN Options and Algorithm Flexibility
#220
mtnusf97
opened
11 months ago
2
Update `_process_sequence()` docstring
#219
rogerioagjr
closed
11 months ago
0
[Question] Recurrent PPO evaluation
#218
CAI23sbP
closed
1 year ago
2
Release v2.2.1: hotfix file closing
#217
araffin
closed
1 year ago
0
Release v2.2.0
#216
araffin
closed
1 year ago
0