hill-a stable-baselines issues

hill-a / stable-baselines

A fork of OpenAI Baselines, implementations of reinforcement learning algorithms

http://stable-baselines.readthedocs.io/

MIT License

4.14k stars 723 forks

issues

[question] How to use previously obtained state-action-reward-next state information to save time on training?

#1146 kwak9601 closed 2 years ago
1
[question] How to get the model architecture when using recurrent policy?

#1145 borninfreedom closed 2 years ago
8
import stable-baselines [Question]

#1144 MariaPiaGelos closed 2 years ago
1
PPO ValueError: The parameter loc has invalid values

#1143 olyanos closed 2 years ago
1
No protocol specified [Bug]

#1142 lafmdp closed 2 years ago
4
evaluate_policy() crashes with PPO2 policies trained on vectorized environments [bug]

#1141 balisujohn opened 2 years ago
5
PPO2 implementation details?

#1140 FabioPINO opened 3 years ago
3
[question] What is the proper way to log metrics at the end of each epoch when epochs are variable in length?

#1139 DavidBellamy opened 3 years ago
5
[question] Loading the PPO model after training does seem to load the policy

#1138 Milad-Rakhsha closed 3 years ago
5
TD3 & DDPG: RuntimeError: "normal_kernel_cuda" not implemented for 'Char'

#1137 olyanos closed 3 years ago
4
Cannot load pre-trained model to evaluate properly

#1136 wenjunli-0 closed 3 years ago
4
PPO - Meaning of update_fac and timestep variables

#1135 huvar opened 3 years ago
1
Resume Training with Previous Experience (state-action-state')?

#1134 wenjunli-0 opened 3 years ago
6
Fix re-training with different number of environments

#1133 balisujohn closed 3 years ago
3
Fix for PPO2 When loading a model and then training with vectorized environment with a different vector length

#1132 balisujohn closed 3 years ago
0
How to set constant learning rate in PPO1?

#1131 nicehzj closed 3 years ago
2
What is the difference between 'train' and 'learn' in ppo???

#1130 wq13552463699 closed 3 years ago
1
evaluate_policy of MlpLstmPolicy with DummyVecEnv

#1129 LeZhengThu closed 3 years ago
4
Tensorboard HPARAMS with DDQN #question

#1128 Arione94 opened 3 years ago
4
hyperparameter tuning of PPO2 with MlpLstmPolicy using Optuna

#1127 LeZhengThu closed 3 years ago
11
[Question] "Curriculum" learning-like training in stablebaselines3

#1126 PatrickSampaioUSP closed 3 years ago
3
LinearAnneal

#1125 JulioEstebanAsiainNeno closed 3 years ago
1
Installation Error: Stable_baselines

#1124 JingZhang918 closed 3 years ago
7
Breakout environment doesn't exist.

#1123 Michi-123 closed 3 years ago
3
[question] VecNormalize with hyper parameter tuning

#1122 aleksanderhan closed 3 years ago
6
[Question] How can I initialize weights of MLP policy by some customized values?

#1121 zrz961203 closed 3 years ago
1
SubprocVecEnv produces identical outputs for all sub-processes

#1120 acertainKnight closed 3 years ago
10
[question] How to implement custom policy for TRPO

#1119 jeferal closed 3 years ago
1
Fix pretraining more than once Issue #538

#1118 imontesino closed 3 years ago
4
Implementing of CnnLstmPolicy with net_arch parameter

#1117 HighExecutor opened 3 years ago
0
Revision of CnnLstmPolicy with not None net_arch

#1116 HighExecutor opened 3 years ago
5
why doesn't env.env_method("reset") reset the environment?

#1115 neonine2 closed 3 years ago
2
VecNormalize for multiple training environments?

#1114 jdshaolinstar opened 3 years ago
4
What is the point of having DummyVecEnv if it is running sequentially?

#1113 jingxixu closed 3 years ago
7
Update from mirror

#1112 araffin closed 3 years ago
0
check_env warning - clarification for custom environment

#1111 amjass12 closed 3 years ago
4
HER not sampling from replay buffer?

#1110 OGordon100 closed 3 years ago
1
Custom Policy

#1109 candygocandy closed 3 years ago
3
[question] Pretraining with custom GoalEnv environment

#1108 OGordon100 closed 3 years ago
2
error

#1107 elizmarg closed 3 years ago
3
[question] How to access "Callbacks - Accessible Variables" for DQN model?

#1106 neonine2 closed 3 years ago
4
[question] Is GPU support for SAC or will it be supported?

#1105 blurLake closed 3 years ago
1
What is the role of the lower and higher bound in the Box for the observation space?

#1104 outdoteth closed 3 years ago
2
MlpPolicy with MultiDiscrete environment

#1103 mg64ve closed 3 years ago
7
Fixed typo

#1102 roccivic closed 3 years ago
4
Model training and testing the same dataset does not perform the same

#1101 yuvaleck closed 3 years ago
3
[question] Suggested Hyperparams for A2C with highway-env

#1100 pierrekhouryy opened 3 years ago
4
Steady Memory Increase When Running Example

#1099 danieldugas closed 3 years ago
2
[question] How to rollout the learned policy?

#1098 GlennCeusters closed 3 years ago
1
In gym env ,how to put prediction about future state in current state?

#1097 MariamDundua closed 3 years ago
3