hill-a / stable-baselines
A fork of OpenAI Baselines: implementations of reinforcement learning algorithms.
Documentation: http://stable-baselines.readthedocs.io/
MIT License · 4.16k stars · 725 forks
Issues
#1097 · In gym env ,how to put prediction about future state in current state? · MariamDundua · closed 3 years ago · 3 comments
#1096 · GAIL throws error when obs space is MultiDiscrete · SurferZergy · opened 3 years ago · 1 comment
#1095 · How does creation of multiple runners with A2C work? · Pit-Storm · closed 3 years ago · 6 comments
#1094 · PPO2 assertion that should probably be a warning · jkterry1 · closed 3 years ago · 2 comments
#1093 · Custom FeedForwardPolicy does not care for net_arch parameters · HadiSDev · closed 3 years ago · 5 comments
#1092 · Hill a merge · keshaviyengar · closed 3 years ago · 0 comments
#1091 · Hill a merge · keshaviyengar · closed 3 years ago · 0 comments
#1090 · [question] How to insert experiences into replay buffers? · Wesleyliao · closed 3 years ago · 1 comment
#1089 · PPO2 load model · lnix · closed 3 years ago · 1 comment
#1088 · I get different rewards on same env and same steps and same model. Is it normal? · mmterkc · closed 3 years ago · 2 comments
#1087 · How can i save best model ? · mmterkc · closed 3 years ago · 5 comments
#1086 · Input position and image into DQN policy · CoolCoder54323 · closed 3 years ago · 14 comments
#1085 · Installing on macOS Big Sur 11.1 with Python 3.9 · jemgodden · closed 3 years ago · 1 comment
#1084 · Question about how is stacking done in vecFrameStack · eliork · closed 3 years ago · 5 comments
#1083 · How do ppo2 reset environment · dongfangliu · closed 3 years ago · 1 comment
#1082 · mb_rewards is updated after the on_step callback · SolaWeng · closed 3 years ago · 2 comments
#1081 · Switch to GitHub workflow + faster tests · araffin · closed 3 years ago · 2 comments
#1080 · Question about train.py printouts. · blurLake · closed 3 years ago · 0 comments
#1079 · Cannot see what changes required to support SubprocVecEnv · davidwynter · closed 3 years ago · 5 comments
#1078 · Is it possible to add an auxiliary output to an algorithm? · mitchellostrow · closed 3 years ago · 1 comment
#1077 · [question] testing with normalization env wrapper · ggenesum · closed 3 years ago · 4 comments
#1076 · Reinforcement learning of tic-tac-toe is not possible. · loySoGxj · closed 3 years ago · 2 comments
#1075 · [Question] GAIL : ValueError: Shape must be rank 3 but is rank 2 for 'adversary/concat' ..... · romain-mondelice · closed 3 years ago · 3 comments
#1074 · [Question] Error because of dimension of observation in GAIL · SiweiJu · closed 3 years ago · 0 comments
#1073 · [question] Custom environment conflict with PyCuda · ivandrodri · closed 3 years ago · 5 comments
#1072 · Multiprocessing not working for PPO1 · eflopez1 · closed 3 years ago · 1 comment
#1071 · [feature request] LstmPolicy does not support using net_arch with feature_extraction="cnn" · GiliR4t1qbit · opened 3 years ago · 3 comments
#1070 · Pretrain dimension issue · jmm1-cmd · closed 3 years ago · 3 comments
#1069 · [question] EvalCallback using MPI · davidADSP · opened 3 years ago · 5 comments
#1068 · Is there a way to know the model architecture ? · xunpla123 · closed 3 years ago · 6 comments
#1067 · Copy version.txt to docker container · anj1 · closed 3 years ago · 4 comments
#1066 · docker build fails with FileNotFoundError: 'stable_baselines/version.txt' · anj1 · closed 3 years ago · 3 comments
#1065 · How to set 'episode_lenght' when using 'generate_expert_traj'? · luigicampanaro · closed 3 years ago · 4 comments
#1064 · How to blend the policy networks? · dongfangliu · closed 3 years ago · 1 comment
#1063 · Issue with MlpLstm policy · lorenzoschena · closed 3 years ago · 8 comments
#1062 · ValueError: all input arrays must have the same shape · aliamiri1380 · closed 3 years ago · 5 comments
#1061 · Value estimation update? · yiwc · closed 3 years ago · 2 comments
#1060 · Fixed bug in log probability calculation for Diagonal Gaussian distribution · SVJayanthi · closed 3 years ago · 2 comments
#1059 · Error in calculating log probability in base_class.py · sunshineclt · closed 3 years ago · 2 comments
#1058 · Duplicate exp in action probability calculation · sunshineclt · closed 3 years ago · 2 comments
#1057 · observation_space problem · asd3200asd · closed 3 years ago · 3 comments
#1056 · Resnet Observation Policy Cannot Reload?#question#bug · yiwc · closed 3 years ago · 1 comment
#1055 · ACKTR hangs on atari and works very slow on custom env · mily20001 · opened 3 years ago · 2 comments
#1054 · [question] Using PPO2 on multiple cluster nodes (MPI) · piotti · closed 3 years ago · 3 comments
#1053 · What exactly does the output printed to the standard output mean when verbose=1 in DQN? · nbro · closed 2 years ago · 2 comments
#1052 · Does stable baselines provide an automatic way of computing the sample efficiency of an RL algorithm? · nbro · opened 3 years ago · 1 comment
#1051 · (Question) Change the value of a default variable of the env using make_vec_env · Chaivara · closed 3 years ago · 1 comment
#1050 · [Question] where is self.episode_reward updated in PPO2 · zhaopansong · closed 3 years ago · 2 comments
#1049 · [Question] total_episode_reward_logger is wrongly handled due to the way of storing dones · zhaopansong · closed 3 years ago · 4 comments
#1048 · Sharing data between subprocesses [question] · rporotti · closed 3 years ago · 2 comments