DLR-RM/stable-baselines3
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
https://stable-baselines3.readthedocs.io
MIT License
8.85k stars
1.68k forks
issues
Update documentation
#2017
Dev1nW
opened
3 days ago
0
[Bug]: obs_as_tensor RuntimeError with numpy==2.1.1 on MacOS13
#2016
nquetschlich
closed
2 days ago
2
Cannot load PPO model when using a custom net_arch
#2015
krishdotn1
opened
4 days ago
0
[Question] Manually Controlling Actions During PPO Training
#2014
wayne-weiwei
opened
1 week ago
1
[bug] Adaptive SAC: using logarithm of entropy coefficient to compute temperature objective instead of entropy coefficient
#2013
Mattia-sony
opened
1 week ago
1
[Feature Request] Warn users when using GPU with `A2C`/`PPO` + update documentation
#2012
jws-1
opened
1 week ago
4
[Bug]: Episode start flag is never set for off policy algorithms
#2011
josndan
opened
2 weeks ago
1
[Question] Auto-regressive manner policy network?
#2010
wadmes
opened
2 weeks ago
1
A successfully trained PPO agent requires some re-training steps to make good predictions
#2009
tanielsfranklin
closed
2 weeks ago
1
[Feature Request] Safe Reinforcement Learning & Multi-Objective Reinforcement Learning
#2008
cherrywoods
closed
2 weeks ago
2
[Question] MARL using Stable Baselines3
#2007
Hamza-101
closed
2 weeks ago
1
[Question] Shared feature extractor and gradient
#2006
brn-dev
closed
2 weeks ago
2
Fix tests for mps support
#2005
deathcoder
opened
2 weeks ago
2
[Question] The entropy value is a negative number, and the entropy loss is a positive number.
#2004
YSAA1
closed
1 week ago
3
Warn users when using multi-dim MultiDiscrete obs space
#2003
araffin
closed
3 weeks ago
0
[Question] Issues with the monitor not having seed parameter
#2002
AbhayGoyal
closed
2 weeks ago
2
[Bug]: SubprocVecEnv TypeError: reset() got an unexpected argument 'seed'
#2001
ccleavinger
closed
2 weeks ago
1
Recalculate Returns and Advantages After Callback to Ensure Reward Consistency (common/on_policy_algorithm.py)
#2000
mhyrzt
opened
4 weeks ago
2
Is it possible to filter experience from episodes whose length is longer than a specified value to add to the replay_buffer?
#1999
CornfileChase
opened
1 month ago
1
Logger information
#1998
XiaobenLi00
opened
1 month ago
1
[Question] About the logger
#1997
XiaobenLi00
opened
1 month ago
1
[Feature Request] Add support for optional environment wrapping in base_class.py
#1996
hasan-yaman
closed
1 month ago
2
[Bug]: No metrics logged when using wandb integrations
#1995
XiaobenLi00
closed
2 weeks ago
3
[Feature Request] add_scalars to write func in TensorBoardOutputFormat in logger
#1994
shimonShouei
opened
1 month ago
2
Fix test device for buffers
#1993
araffin
closed
1 month ago
0
observation_space does not match reset() observation and The environment is being initialised with render_mode='human' that is not in the possible render_modes ([])
#1992
XiaobenLi00
closed
1 month ago
5
That WORKED !!
#1991
XiaobenLi00
closed
1 month ago
0
[Question] Passing arguments to an environment that can't be pickled.
#1990
abhineet-gupta
closed
1 month ago
3
Clarification on Dependency Between Elements in Action Generation
#1989
fardinabbasi
closed
1 month ago
4
[Question] About the output layer of algorithms
#1988
abdulkadrtr
closed
1 month ago
1
[Question] New to this and can't get it installed plz help
#1987
Misticfury
closed
1 month ago
1
[Bug]: Possible inconsistencies with the PPO implementation
#1986
hexonfox
opened
2 months ago
2
Custom actor and critic network
#1985
krishdotn1
closed
1 month ago
4
[Feature Request] Temporal Convolutional network
#1984
tty666
opened
2 months ago
1
[Bug]: 'CarRacing' object has no attribute 'num_envs'
#1983
kuds
closed
2 months ago
1
[Question] How to train 2 models in parallel?
#1982
kozlolet
closed
1 month ago
1
Fix various typos
#1981
cschindlbeck
closed
2 months ago
0
[Feature Request] Additional Callback before stepping the optimizer
#1980
kcorder
closed
2 months ago
2
Add precommit config yaml and fix typos automatically
#1979
cschindlbeck
opened
2 months ago
3
Fix loading of optimizer with older DQN models
#1978
araffin
closed
2 months ago
2
[Question] Questions about CNN policy input channel
#1977
DavidLudl
closed
2 months ago
2
I would like to know what is the network structure of the SAC algorithm
#1976
hjg857
closed
2 months ago
1
Add support for pre and post linear modules in `create_mlp`
#1975
araffin
closed
2 months ago
0
[Bug]: Load a PPO model and re-start learning
#1974
nrigol
closed
2 months ago
1
[Question] Using SubprocVecEnv with a custom environment does not lead to continuous output on the terminal
#1973
wilhem
closed
2 months ago
3
Actions are generated out of the range
#1972
fardinabbasi
closed
2 months ago
2
Pre-trained ResNet module outputs NaN
#1971
LeZheng-x
closed
2 months ago
1
[Bug]: Crash when importing PPO module
#1970
izycheva
closed
2 months ago
4
Update examples.rst
#1969
qgallouedec
closed
2 months ago
0
[Question] Question about VecEnv using a custom environment in SB3
#1968
wilhem
closed
1 month ago
33