hill-a / stable-baselines
A fork of OpenAI Baselines, implementations of reinforcement learning algorithms
http://stable-baselines.readthedocs.io/
MIT License · 4.16k stars · 725 forks
Issues (sorted newest first)
#1197 [feature request] Remove erroneous episode from replay buffer (WreckItTim, opened 1 month ago, 0 comments)
#1196 unflatten always gives `0` class (wadmes, opened 2 months ago, 0 comments)
#1195 Questions about CNN policy input channel [question] (DavidLudl, opened 4 months ago, 0 comments)
#1194 AssertionError: The observation returned by the `step()` method does not match the given observation space Discrete(2) (lpj20, opened 5 months ago, 0 comments)
#1193 Using multiple environments with Unity ML-Agents (871234342, opened 8 months ago, 0 comments)
#1192 Resume the training (rambo1111, opened 9 months ago, 0 comments)
#1191 Two similar custom environments, PPO learns on both but SAC only on one (tfederico, closed 9 months ago, 1 comment)
#1190 MlpPolicy network output layer softmax activation for continuous action space problem? (wbzhang233, opened 10 months ago, 2 comments)
#1189 Model generated using VecNormalize, model predict use case (madhekar, closed 1 year ago, 1 comment)
#1188 Resume training on checkpoint (mingjohnson, opened 1 year ago, 1 comment)
#1187 VecNormalize 'training' attribute (madhekar, opened 1 year ago, 0 comments)
#1186 Customize training process (dunhuiliu, opened 1 year ago, 0 comments)
#1185 AssertionError, observation space, VecTransposeImage (skgr07, opened 1 year ago, 0 comments)
#1184 Dimension mismatch when using custom feature extractor (yassinetb, closed 1 year ago, 0 comments)
#1183 [Question] Can I train agents in a nested loop in SB3? (j-thib, closed 1 year ago, 0 comments)
#1182 python setup.py egg_info did not run successfully (perp1exed, opened 1 year ago, 0 comments)
#1181 [question] TypeError: 'NoneType' object is not callable with user-defined env (Charles-Lim93, opened 1 year ago, 0 comments)
#1180 DQN report [QUESTION] (smbrine, opened 1 year ago, 0 comments)
#1179 [question] _on_step method in custom callback (vrige, opened 1 year ago, 0 comments)
#1178 Can I use an agent with act and observe interactions with no/minimal use of an environment? (aheidariiiiii1993, opened 1 year ago, 0 comments)
#1177 How to create an actor-critic network with two separate LSTMs (ashleychung830, opened 2 years ago, 0 comments)
#1176 [question] PPO load using .pkl file (meric-sakarya, closed 2 years ago, 1 comment)
#1175 How to convert timestep-based learning to episodic learning (muk465, opened 2 years ago, 1 comment)
#1174 TypeError: can't pickle dolfin.cpp.geometry.Point objects (jiangzhangze, opened 2 years ago, 0 comments)
#1173 Custom gym env AssertionError regarding reset() method (sheila-janota, opened 2 years ago, 0 comments)
#1172 Adding the version warning banner on every documentation page (qgallouedec, closed 2 years ago, 2 comments)
#1171 [question] For an RL algorithm with a discrete action space, is it possible to get a probability of outcomes when feeding in data? (george-adams1, opened 2 years ago, 0 comments)
#1170 [Question] Callback-collected model does not have the same reward as training verbose [custom gym environment] (hotpotking-lol, opened 2 years ago, 1 comment)
#1169 Store the training result (hotpotking-lol, closed 2 years ago, 2 comments)
#1168 Environment checker returns assertion error contradicting debug statements (techboy-coder, closed 2 years ago, 2 comments)
#1167 True rewards remaining "zero" in the trajectories in Stable Baselines 2 for custom environments (moizuet, opened 2 years ago, 6 comments)
#1166 Deep Q-value network evaluation in SAC algorithm (moizuet, opened 2 years ago, 2 comments)
#1165 Link to gym docs on creating custom environment broken (arjun-krishna1, opened 2 years ago, 0 comments)
#1164 1D vector of floats as an observation space (WilliamFlinchbaugh, opened 2 years ago, 3 comments)
#1163 model.num_timesteps or a similar method inside a SubprocVecEnv (olyanos, closed 2 years ago, 2 comments)
#1162 Custom env return from reset does not match the observation space (RishiKasam, closed 2 years ago, 1 comment)
#1161 FPS varies enormously (leo2r, closed 2 years ago, 4 comments)
#1160 Cannot install Stable Baselines 3 (OishikGuha, closed 2 years ago, 4 comments)
#1159 Accessing observations during training aka .learn() (user-1701, closed 2 years ago, 2 comments)
#1158 Is it wrong to reward an action on the next step? (DaniilKardava, closed 2 years ago, 1 comment)
#1157 Data normalization for A2C inputs? (DaniilKardava, closed 2 years ago, 1 comment)
#1156 Prediction for same observation using same model (DaniilKardava, closed 2 years ago, 1 comment)
#1155 Invalid actions, mask and DQN (Cyazd, closed 2 years ago, 3 comments)
#1154 Problem retraining PPO1 model and using TensorFlow with Stable Baselines 2 (durantagre, opened 2 years ago, 1 comment)
#1153 Minigrid: "Kernel size can't be greater than actual input size" for DQN (raymond2338, closed 2 years ago, 1 comment)
#1152 Running Stable Baselines on M1 Macs? (adamnhaka, opened 2 years ago, 1 comment)
#1151 Unable to see stable-baselines output (Michael-HK, closed 2 years ago, 7 comments)
#1150 SAC results with large variance (dibbla, closed 2 years ago, 1 comment)
#1149 Load model to retrain: "'NoneType' object has no attribute 'reset'" (malik-ben, closed 2 years ago, 4 comments)
#1148 [question] A problem of how to use MlpLstmPolicy in GAIL training (LongchaoDa, closed 2 years ago, 3 comments)