Stable-Baselines-Team / stable-baselines3-contrib

Contrib package for Stable-Baselines3 - Experimental reinforcement learning (RL) code

https://sb3-contrib.readthedocs.io

MIT License

504 stars, 175 forks

issues

[Question] Not updating lstm states during training

#265 abhinavj98 opened 5 days ago
0
Add missing condition in CI

#264 araffin closed 1 week ago
0
Drop python 3.8, add python 3.12 support

#263 araffin closed 1 week ago
0
Release v2.4.0

#262 araffin closed 1 week ago
0
Add support for gymnasium v1.0

#261 araffin closed 3 weeks ago
0
Update deps for read the doc

#260 araffin closed 3 weeks ago
0
Fix QRDQN loading `target_update_interval`

#259 jak3122 closed 1 month ago
0
[Bug]: loading QRDQN changes target_update_interval

#258 jak3122 closed 1 month ago
0
[Question] Why can't I directly use PPO(RecurrentActorCriticPolicy, "CartPole-v1", verbose=1)

#257 dajianer opened 2 months ago
1
[Bug]: Is sb3_contrib/common/maskable/utils.py the cause of "WARN: env.action_masks to get variables from other wrappers is deprecated and will be removed in v1.0"?

#256 mkbg8 opened 3 months ago
1
Fix warning when loading a `RecurrentPPO` model

#255 araffin closed 3 months ago
0
[Bug]: FutureWarning: You are using `torch.load` with `weights_only=False`

#254 drulye closed 3 months ago
3
[Feature Request] same random seed for every env in AsyncEval

#253 1-Bart-1 opened 4 months ago
1
Update QR-DQN optimizer to only use q_net parameters

#252 corentinlger closed 4 months ago
1
Update SB3 and remove gSDE resampling

#251 araffin closed 5 months ago
0
[Question] Masked actions PPO in multiagent setting using PettingZoo

#250 MarcoPicione opened 5 months ago
0
[Question] Apply Masking using ActionMasker on composite actions

#249 mwalidcharrwi closed 5 months ago
4
[Question] How to do pre-training on the RecurrentPPO MlpLstmPolicy

#248 iwishiwasaneagle opened 6 months ago
1
MaskablePPO Masking Doesn't Work with Big Action Space

#247 orkunkn closed 6 months ago
4
RecurrentActorCriticPolicy Behaviour Not Clear

#246 pasinit opened 6 months ago
1
TQC: ep_len_mean and ep_rew_mean does not match real values

#245 btabia opened 6 months ago
0
ep_len_mean discrepancy

#244 btabia closed 6 months ago
0
Implemented CrossQ

#243 danielpalen closed 1 month ago
11
Dependent Actions in MultiDiscrete Action Space

#242 bbarisbaturay opened 6 months ago
5
[Question] Recurrent Maskable PPO ?!? Rudder ?!?

#241 tty666 closed 7 months ago
1
[Question] What is the difference between old_distribution and distribution in train function of TRPO

#240 0Addicted0 closed 6 months ago
2
[Question] RecurrentPPO: Reset LSTM states early?

#239 phisad opened 7 months ago
3
[Feature Request] Implement CrossQ

#238 danielpalen closed 1 month ago
0
Fix typo in changelog

#237 araffin closed 7 months ago
0
Release v2.3.0

#236 araffin closed 8 months ago
0
Log success rate for PPO variants

#235 araffin closed 8 months ago
0
[Question] Why does MaskablePPO not mask with some logic with last observation?

#234 EloyAnguiano opened 8 months ago
4
Fix PPO maskable type annotations

#233 araffin closed 8 months ago
0
Update ruff and SB3 dependencies

#232 araffin closed 8 months ago
0
[Question] Simple way to implement data augmentation when training agent

#231 thomashirtz closed 9 months ago
2
[Question] LSTM observations

#230 suargi closed 10 months ago
3
Fix `train_freq` type annotation for TQC and QR-DQN

#229 Armandpl closed 10 months ago
0
Episodic training with TQC?

#228 Armandpl closed 10 months ago
2
Add note about MaskableEvalCallback

#227 icheered closed 10 months ago
0
EvalCallback crashes Maskable PPO without error

#226 icheered closed 10 months ago
3
Update QRDQN defaults

#225 araffin closed 10 months ago
0
Implementing "Sibling Rivalry" Method from "Keeping Your Distance: Solving Sparse Reward Tasks Using Self-Balancing Shaped Rewards" Paper

#224 vladyskai opened 10 months ago
1
[Feature Request] STAC algorithm

#223 EloyAnguiano opened 10 months ago
4
[Question] how to use "lstm_states" from rollout_buffer to reconstruct LSTM states during training

#222 DeepRowLie closed 10 months ago
2
[Bug]: producing NAN values during training in MaskablePPO

#221 vahidqo opened 11 months ago
5
[Feature Request] Expand RNN Options and Algorithm Flexibility

#220 mtnusf97 opened 11 months ago
2
Update `_process_sequence()` docstring

#219 rogerioagjr closed 11 months ago
0
[Question] Recurrent PPO evaluation

#218 CAI23sbP closed 1 year ago
2
Release v2.2.1: hotfix file closing

#217 araffin closed 1 year ago
0
Release v2.2.0

#216 araffin closed 1 year ago
0