hill-a / stable-baselines
A fork of OpenAI Baselines, implementations of reinforcement learning algorithms
http://stable-baselines.readthedocs.io/
MIT License · 4.16k stars · 725 forks
Issues (sorted newest first)
#1197 [feature request] Remove erroneous episode from replay buffer (WreckItTim, opened 1 month ago, 0 comments)
#1196 unflatten always gives `0` class (wadmes, opened 2 months ago, 0 comments)
#1195 Questions about CNN policy input channel [question] (DavidLudl, opened 4 months ago, 0 comments)
#1194 AssertionError: The observation returned by the `step()` method does not match the given observation space Discrete(2) (lpj20, opened 5 months ago, 0 comments)
#1193 Using multiple environments with Unity ML-Agents (871234342, opened 8 months ago, 0 comments)
#1192 Resume the training (rambo1111, opened 9 months ago, 0 comments)
#1191 Two similar custom environments, PPO learns on both but SAC only on one (tfederico, closed 9 months ago, 1 comment)
#1190 MlpPolicy network output layer softmax activation for continuous action space problem? (wbzhang233, opened 10 months ago, 2 comments)
#1189 Model generated using VecNormalize, model predict use case (madhekar, closed 1 year ago, 1 comment)
#1188 Resume training on checkpoint (mingjohnson, opened 1 year ago, 1 comment)
#1187 VecNormalize 'training' attribute (madhekar, opened 1 year ago, 0 comments)
#1186 Customize training process (dunhuiliu, opened 1 year ago, 0 comments)
#1185 AssertionError, observation space, VecTransposeImage (skgr07, opened 1 year ago, 0 comments)
#1184 Dimension mismatch when using custom feature extractor (yassinetb, closed 1 year ago, 0 comments)
#1183 [Question] Can I train agents in a nested loop in SB3? (j-thib, closed 1 year ago, 0 comments)
#1182 python setup.py egg_info did not run successfully (perp1exed, opened 1 year ago, 0 comments)
#1181 [question] TypeError: 'NoneType' object is not callable with user-defined env (Charles-Lim93, opened 1 year ago, 0 comments)
#1180 DQN report [QUESTION] (smbrine, opened 1 year ago, 0 comments)
#1179 [question] _on_step method in custom callback (vrige, opened 1 year ago, 0 comments)
#1178 Can I use an agent with act and observe interactions with no/minimal use of an environment? (aheidariiiiii1993, opened 1 year ago, 0 comments)
#1177 How to create an actor-critic network with two separate LSTMs (ashleychung830, opened 2 years ago, 0 comments)
#1176 [question] PPO load using .pkl file (meric-sakarya, closed 2 years ago, 1 comment)
#1175 How to convert timestep-based learning to episodic learning (muk465, opened 2 years ago, 1 comment)
#1174 TypeError: can't pickle dolfin.cpp.geometry.Point objects (jiangzhangze, opened 2 years ago, 0 comments)
#1173 Custom gym env AssertionError regarding reset() method (sheila-janota, opened 2 years ago, 0 comments)
#1172 Adding the version warning banner on every documentation page (qgallouedec, closed 2 years ago, 2 comments)
#1171 [question] For an RL algorithm with a discrete action space, is it possible to get a probability of outcomes when feeding in data? (george-adams1, opened 2 years ago, 0 comments)
#1170 [Question] Callback-collected model does not have the same reward as training verbose [custom gym environment] (hotpotking-lol, opened 2 years ago, 1 comment)
#1169 Store the training result (hotpotking-lol, closed 2 years ago, 2 comments)
#1168 Environment checker returns assertion error contradicting debug statements (techboy-coder, closed 2 years ago, 2 comments)
#1167 True rewards remaining "zero" in the trajectories in Stable Baselines 2 for custom environments (moizuet, opened 2 years ago, 6 comments)
#1166 Deep Q-value network evaluation in SAC algorithm (moizuet, opened 2 years ago, 2 comments)
#1165 Link to gym docs on creating custom environment broken (arjun-krishna1, opened 2 years ago, 0 comments)
#1164 1D vector of floats as an observation space (WilliamFlinchbaugh, opened 2 years ago, 3 comments)
#1163 model.num_timesteps or a similar method inside a SubprocVecEnv (olyanos, closed 2 years ago, 2 comments)
#1162 Custom env return from reset does not match the observation space (RishiKasam, closed 2 years ago, 1 comment)
#1161 FPS varies enormously (leo2r, closed 2 years ago, 4 comments)
#1160 Cannot install Stable Baselines 3 (OishikGuha, closed 2 years ago, 4 comments)
#1159 Accessing observations during training aka .learn() (user-1701, closed 2 years ago, 2 comments)
#1158 Is it wrong to reward an action on the next step? (DaniilKardava, closed 2 years ago, 1 comment)
#1157 Data normalization for A2C inputs? (DaniilKardava, closed 2 years ago, 1 comment)
#1156 Prediction for same observation using same model (DaniilKardava, closed 2 years ago, 1 comment)
#1155 Invalid actions, mask and DQN (Cyazd, closed 2 years ago, 3 comments)
#1154 Problem retraining PPO1 model and using TensorFlow with Stable Baselines 2 (durantagre, opened 2 years ago, 1 comment)
#1153 Minigrid: "Kernel size can't be greater than actual input size" for DQN (raymond2338, closed 2 years ago, 1 comment)
#1152 Running Stable Baselines on M1 Macs? (adamnhaka, opened 2 years ago, 1 comment)
#1151 Unable to see stable-baselines output (Michael-HK, closed 2 years ago, 7 comments)
#1150 SAC results with large variance (dibbla, closed 2 years ago, 1 comment)
#1149 Load model to retrain: "'NoneType' object has no attribute 'reset'" (malik-ben, closed 2 years ago, 4 comments)
#1148 [question] A problem of how to use MlpLstmPolicy in GAIL training (LongchaoDa, closed 2 years ago, 3 comments)