hill-a/stable-baselines
A fork of OpenAI Baselines, implementations of reinforcement learning algorithms
http://stable-baselines.readthedocs.io/
MIT License
4.14k stars · 723 forks
issues
[question] How to use previously obtained state-action-reward-next state information to save time on training? · #1146 · kwak9601 · closed · 2 years ago · 1 comment
[question] How to get the model architecture when using recurrent policy? · #1145 · borninfreedom · closed · 2 years ago · 8 comments
import stable-baselines [Question] · #1144 · MariaPiaGelos · closed · 2 years ago · 1 comment
PPO ValueError: The parameter loc has invalid values · #1143 · olyanos · closed · 2 years ago · 1 comment
No protocol specified [Bug] · #1142 · lafmdp · closed · 2 years ago · 4 comments
evaluate_policy() crashes with PPO2 policies trained on vectorized environments [bug] · #1141 · balisujohn · opened · 2 years ago · 5 comments
PPO2 implementation details? · #1140 · FabioPINO · opened · 3 years ago · 3 comments
[question] What is the proper way to log metrics at the end of each epoch when epochs are variable in length? · #1139 · DavidBellamy · opened · 3 years ago · 5 comments
[question] Loading the PPO model after training does seem to load the policy · #1138 · Milad-Rakhsha · closed · 3 years ago · 5 comments
TD3 & DDPG: RuntimeError: "normal_kernel_cuda" not implemented for 'Char' · #1137 · olyanos · closed · 3 years ago · 4 comments
Cannot load pre-trained model to evaluate properly · #1136 · wenjunli-0 · closed · 3 years ago · 4 comments
PPO - Meaning of update_fac and timestep variables · #1135 · huvar · opened · 3 years ago · 1 comment
Resume Training with Previous Experience (state-action-state')? · #1134 · wenjunli-0 · opened · 3 years ago · 6 comments
Fix re-training with different number of environments · #1133 · balisujohn · closed · 3 years ago · 3 comments
Fix for PPO2 When loading a model and then training with vectorized environment with a different vector length · #1132 · balisujohn · closed · 3 years ago · 0 comments
How to set constant learning rate in PPO1? · #1131 · nicehzj · closed · 3 years ago · 2 comments
What is the difference between 'train' and 'learn' in ppo??? · #1130 · wq13552463699 · closed · 3 years ago · 1 comment
evaluate_policy of MlpLstmPolicy with DummyVecEnv · #1129 · LeZhengThu · closed · 3 years ago · 4 comments
Tensorboard HPARAMS with DDQN #question · #1128 · Arione94 · opened · 3 years ago · 4 comments
hyperparameter tuning of PPO2 with MlpLstmPolicy using Optuna · #1127 · LeZhengThu · closed · 3 years ago · 11 comments
[Question] "Curriculum" learning-like training in stablebaselines3 · #1126 · PatrickSampaioUSP · closed · 3 years ago · 3 comments
LinearAnneal · #1125 · JulioEstebanAsiainNeno · closed · 3 years ago · 1 comment
Installation Error: Stable_baselines · #1124 · JingZhang918 · closed · 3 years ago · 7 comments
Breakout environment doesn't exist. · #1123 · Michi-123 · closed · 3 years ago · 3 comments
[question] VecNormalize with hyper parameter tuning · #1122 · aleksanderhan · closed · 3 years ago · 6 comments
[Question] How can I initialize weights of MLP policy by some customized values? · #1121 · zrz961203 · closed · 3 years ago · 1 comment
SubprocVecEnv produces identical outputs for all sub-processes · #1120 · acertainKnight · closed · 3 years ago · 10 comments
[question] How to implement custom policy for TRPO · #1119 · jeferal · closed · 3 years ago · 1 comment
Fix pretraining more than once Issue #538 · #1118 · imontesino · closed · 3 years ago · 4 comments
Implementing of CnnLstmPolicy with net_arch parameter · #1117 · HighExecutor · opened · 3 years ago · 0 comments
Revision of CnnLstmPolicy with not None net_arch · #1116 · HighExecutor · opened · 3 years ago · 5 comments
why doesn't env.env_method("reset") reset the environment? · #1115 · neonine2 · closed · 3 years ago · 2 comments
VecNormalize for multiple training environments? · #1114 · jdshaolinstar · opened · 3 years ago · 4 comments
What is the point of having DummyVecEnv if it is running sequentially? · #1113 · jingxixu · closed · 3 years ago · 7 comments
Update from mirror · #1112 · araffin · closed · 3 years ago · 0 comments
check_env warning - clarification for custom environment · #1111 · amjass12 · closed · 3 years ago · 4 comments
HER not sampling from replay buffer? · #1110 · OGordon100 · closed · 3 years ago · 1 comment
Custom Policy · #1109 · candygocandy · closed · 3 years ago · 3 comments
[question] Pretraining with custom GoalEnv environment · #1108 · OGordon100 · closed · 3 years ago · 2 comments
error · #1107 · elizmarg · closed · 3 years ago · 3 comments
[question] How to access "Callbacks - Accessible Variables" for DQN model? · #1106 · neonine2 · closed · 3 years ago · 4 comments
[question] Is GPU support for SAC or will it be supported? · #1105 · blurLake · closed · 3 years ago · 1 comment
What is the role of the lower and higher bound in the Box for the observation space? · #1104 · outdoteth · closed · 3 years ago · 2 comments
MlpPolicy with MultiDiscrete environment · #1103 · mg64ve · closed · 3 years ago · 7 comments
Fixed typo · #1102 · roccivic · closed · 3 years ago · 4 comments
Model training and testing the same dataset does not perform the same · #1101 · yuvaleck · closed · 3 years ago · 3 comments
[question] Suggested Hyperparams for A2C with highway-env · #1100 · pierrekhouryy · opened · 3 years ago · 4 comments
Steady Memory Increase When Running Example · #1099 · danieldugas · closed · 3 years ago · 2 comments
[question] How to rollout the learned policy? · #1098 · GlennCeusters · closed · 3 years ago · 1 comment
In gym env ,how to put prediction about future state in current state? · #1097 · MariamDundua · closed · 3 years ago · 3 comments