Stable-Baselines-Team
/
stable-baselines3-contrib
Contrib package for Stable-Baselines3 - Experimental reinforcement learning (RL) code
https://sb3-contrib.readthedocs.io
MIT License
442
stars
166
forks
issues
Update SB3 and remove gSDE resampling
#251
araffin
closed
3 days ago
0
[Question] Masked actions PPO in multiagent setting using PettingZoo
#250
MarcoPicione
opened
1 week ago
0
[Question] Apply Masking using ActionMasker on composite actions
#249
mwalidcharrwi
closed
1 week ago
4
[Question] How to do pre-training on the RecurrentPPO MlpLstmPolicy
#248
iwishiwasaneagle
opened
1 month ago
0
MaskablePPO Masking Doesn't Work with Big Action Space
#247
orkunkn
closed
1 month ago
1
RecurrentActorCriticPolicy Behaviour Not Clear
#246
pasinit
opened
1 month ago
1
TQC: ep_len_mean and ep_rew_mean do not match real values
#245
btabia
opened
1 month ago
0
ep_len_mean discrepancy
#244
btabia
closed
1 month ago
0
Implemented CrossQ
#243
danielpalen
opened
1 month ago
8
Dependent Actions in MultiDiscrete Action Space
#242
bbarisbaturay
opened
2 months ago
1
[Question] Recurrent Maskable PPO ?!? Rudder ?!?
#241
tty666
closed
2 months ago
1
[Question] What is the difference between old_distribution and distribution in train function of TRPO
#240
0Addicted0
closed
2 months ago
2
[Question] RecurrentPPO: Reset LSTM states early?
#239
phisad
opened
2 months ago
3
[Feature Request] Implement CrossQ
#238
danielpalen
opened
3 months ago
0
Fix typo in changelog
#237
araffin
closed
3 months ago
0
Release v2.3.0
#236
araffin
closed
3 months ago
0
Log success rate for PPO variants
#235
araffin
closed
3 months ago
0
[Question] Why does MaskablePPO not mask with some logic with last observation?
#234
EloyAnguiano
opened
3 months ago
4
Fix PPO maskable type annotations
#233
araffin
closed
3 months ago
0
Update ruff and SB3 dependencies
#232
araffin
closed
3 months ago
0
[Question] Simple way to implement data augmentation when training agent
#231
thomashirtz
closed
4 months ago
2
[Question] LSTM observations
#230
suargi
closed
5 months ago
3
Fix `train_freq` type annotation for TQC and QR-DQN
#229
Armandpl
closed
5 months ago
0
Episodic training with TQC?
#228
Armandpl
closed
5 months ago
2
Add note about MaskableEvalCallback
#227
icheered
closed
5 months ago
0
EvalCallback crashes Maskable PPO without error
#226
icheered
closed
5 months ago
3
Update QRDQN defaults
#225
araffin
closed
5 months ago
0
Implementing "Sibling Rivalry" Method from "Keeping Your Distance: Solving Sparse Reward Tasks Using Self-Balancing Shaped Rewards" Paper
#224
vladyskai
opened
5 months ago
1
[Feature Request] STAC algorithm
#223
EloyAnguiano
opened
6 months ago
4
[Question] how to use "lstm_states" from rollout_buffer to reconstruct LSTM states during training
#222
DeepRowLie
closed
5 months ago
2
[Bug]: producing NAN values during training in MaskablePPO
#221
vahidqo
opened
6 months ago
5
[Feature Request] Expand RNN Options and Algorithm Flexibility
#220
mtnusf97
opened
6 months ago
2
Update `_process_sequence()` docstring
#219
rogerioagjr
closed
7 months ago
0
[Question] Recurrent PPO evaluation
#218
CAI23sbP
closed
7 months ago
2
Release v2.2.1: hotfix file closing
#217
araffin
closed
7 months ago
0
Release v2.2.0
#216
araffin
closed
7 months ago
0
Remove PyType and upgrade to latest SB3 version
#215
araffin
closed
7 months ago
0
Add rollout_buffer_class to TRPO
#214
ernestum
closed
8 months ago
2
Sync SB3 Contrib with SB3
#213
araffin
closed
8 months ago
0
Predicting actions after using MaskablePPO model outputs invalid action
#212
vivek-kumar9696
closed
8 months ago
2
Recurrent PPO Not Training Well on a Very Simple Environment
#211
sreejank
opened
8 months ago
0
Worse training with Vectorized Environment
#210
pklochowicz
closed
5 months ago
0
How to use LSTM ? RecurrentPPO from sb3-contrib
#209
PedroIAgithub
closed
9 months ago
6
Maskable PPO selects illegal actions, although everything looks correct
#208
DominikRoB
closed
9 months ago
2
Decrease in reward during training with MaskablePPO
#207
vahidqo
opened
10 months ago
0
[Feature Request] BBF algorithm implementation
#206
Alian3785
opened
10 months ago
2
Speed up when using MaskablePPO
#205
vahidqo
opened
10 months ago
2
Release v2.1.0
#204
araffin
closed
10 months ago
0
SACD Discrete Soft Actor Critic
#203
splatter96
opened
11 months ago
3
[Feature Request] Hybrid PPO
#202
AlexPasqua
opened
11 months ago
0