DLR-RM stable-baselines3 issues

DLR-RM / stable-baselines3

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

https://stable-baselines3.readthedocs.io

MIT License

8.85k stars 1.68k forks

issues

Update documentation

#2017 Dev1nW opened 3 days ago
0
[Bug]: obs_as_tensor RuntimeError with numpy==2.1.1 on MacOS13

#2016 nquetschlich closed 2 days ago
2
Cannot load PPO model when I use custom net_arch

#2015 krishdotn1 opened 4 days ago
0
[Question] Manually Controlling Actions During PPO Training

#2014 wayne-weiwei opened 1 week ago
1
[bug] Adaptive SAC: using logarithm of entropy coefficient to compute temperature objective instead of entropy coefficient

#2013 Mattia-sony opened 1 week ago
1
[Feature Request] Warn users when using GPU with `A2C`/`PPO` + update documentation

#2012 jws-1 opened 1 week ago
4
[Bug]: Episode start flag is never set for off policy algorithms

#2011 josndan opened 2 weeks ago
1
[Question] Auto-regressive manner policy network?

#2010 wadmes opened 2 weeks ago
1
A successfully PPO-trained agent demands some steps of re-training to make good predictions

#2009 tanielsfranklin closed 2 weeks ago
1
[Feature Request] Safe Reinforcement Learning & Multi-Objective Reinforcement Learning

#2008 cherrywoods closed 2 weeks ago
2
[Question] MARL using Stable Baselines3

#2007 Hamza-101 closed 2 weeks ago
1
[Question] Shared feature extractor and gradient

#2006 brn-dev closed 2 weeks ago
2
Fix tests for mps support

#2005 deathcoder opened 2 weeks ago
2
[Question] The entropy value is a negative number, and the entropy loss is a positive number.

#2004 YSAA1 closed 1 week ago
3
Warn users when using multi-dim MultiDiscrete obs space

#2003 araffin closed 3 weeks ago
0
[Question] Issues with the monitor not having seed parameter

#2002 AbhayGoyal closed 2 weeks ago
2
[Bug]: bug title SubprocVecEnv TypeError: reset() got an unexpected argument 'seed'

#2001 ccleavinger closed 2 weeks ago
1
Recalculate Returns and Advantages After Callback to Ensure Reward Consistency (common/on_policy_algorithm.py)

#2000 mhyrzt opened 4 weeks ago
2
Is it possible to filter experience from the episode whose length is longer than a specified value to add into replay_buffer?

#1999 CornfileChase opened 1 month ago
1
Logger information

#1998 XiaobenLi00 opened 1 month ago
1
[Question] About the logger

#1997 XiaobenLi00 opened 1 month ago
1
[Feature Request] Add support for optional environment wrapping in base_class.py

#1996 hasan-yaman closed 1 month ago
2
[Bug]: No metrics logged when using wandb integrations

#1995 XiaobenLi00 closed 2 weeks ago
3
[Feature Request] add_scalars to write func in TensorBoardOutputFormat in logger

#1994 shimonShouei opened 1 month ago
2
Fix test device for buffers

#1993 araffin closed 1 month ago
0
observation_space does not match reset() observation and The environment is being initialised with render_mode='human' that is not in the possible render_modes ([])

#1992 XiaobenLi00 closed 1 month ago
5
That WORKED !!

#1991 XiaobenLi00 closed 1 month ago
0
[Question] Passing arguments to an environment that can't be pickled.

#1990 abhineet-gupta closed 1 month ago
3
Clarification on Dependency Between Elements in Action Generation

#1989 fardinabbasi closed 1 month ago
4
[Question] About the output layer of algorithms

#1988 abdulkadrtr closed 1 month ago
1
[Question] New to this and can't get it installed plz help

#1987 Misticfury closed 1 month ago
1
[Bug]: Possible inconsistencies with the PPO implementation

#1986 hexonfox opened 2 months ago
2
Custom actor and critic network

#1985 krishdotn1 closed 1 month ago
4
[Feature Request] Temporal Convolutional network

#1984 tty666 opened 2 months ago
1
[Bug]: 'CarRacing' object has no attribute 'num_envs'

#1983 kuds closed 2 months ago
1
[Question] how to train 2 models in parallel?

#1982 kozlolet closed 1 month ago
1
Fix various typos

#1981 cschindlbeck closed 2 months ago
0
[Feature Request] Additional Callback before stepping the optimizer

#1980 kcorder closed 2 months ago
2
Add precommit config yaml and fix typos automatically

#1979 cschindlbeck opened 2 months ago
3
Fix loading of optimizer with older DQN models

#1978 araffin closed 2 months ago
2
[Question] Questions about CNN policy input channel

#1977 DavidLudl closed 2 months ago
2
I would like to know what is the network structure of the SAC algorithm

#1976 hjg857 closed 2 months ago
1
Add support for pre and post linear modules in `create_mlp`

#1975 araffin closed 2 months ago
0
[Bug]: Load a PPO model and re-start learning

#1974 nrigol closed 2 months ago
1
[Question] Using SubprocVecEnv with a custom environment does not lead to continuous output on the terminal

#1973 wilhem closed 2 months ago
3
Actions are generated out of the range

#1972 fardinabbasi closed 2 months ago
2
Pre-trained ResNet module outputs NaN

#1971 LeZheng-x closed 2 months ago
1
[Bug]: Crash when importing PPO module

#1970 izycheva closed 2 months ago
4
Update examples.rst

#1969 qgallouedec closed 2 months ago
0
[Question] Question about VecEnv using a custom environment in SB3

#1968 wilhem closed 1 month ago
33