hill-a / stable-baselines
A fork of OpenAI Baselines: implementations of reinforcement learning algorithms.
Documentation: http://stable-baselines.readthedocs.io/
MIT License · 4.16k stars · 725 forks
Issues
#1097 · In gym env ,how to put prediction about future state in current state? · MariamDundua · closed 3 years ago · 3 comments
#1096 · GAIL throws error when obs space is MultiDiscrete · SurferZergy · opened 3 years ago · 1 comment
#1095 · How does creation of multiple runners with A2C work? · Pit-Storm · closed 3 years ago · 6 comments
#1094 · PPO2 assertion that should probably be a warning · jkterry1 · closed 3 years ago · 2 comments
#1093 · Custom FeedForwardPolicy does not care for net_arch parameters · HadiSDev · closed 3 years ago · 5 comments
#1092 · Hill a merge · keshaviyengar · closed 3 years ago · 0 comments
#1091 · Hill a merge · keshaviyengar · closed 3 years ago · 0 comments
#1090 · [question] How to insert experiences into replay buffers? · Wesleyliao · closed 3 years ago · 1 comment
#1089 · PPO2 load model · lnix · closed 3 years ago · 1 comment
#1088 · I get different rewards on same env and same steps and same model. Is it normal? · mmterkc · closed 3 years ago · 2 comments
#1087 · How can i save best model ? · mmterkc · closed 3 years ago · 5 comments
#1086 · Input position and image into DQN policy · CoolCoder54323 · closed 3 years ago · 14 comments
#1085 · Installing on macOS Big Sur 11.1 with Python 3.9 · jemgodden · closed 3 years ago · 1 comment
#1084 · Question about how is stacking done in vecFrameStack · eliork · closed 3 years ago · 5 comments
#1083 · How do ppo2 reset environment · dongfangliu · closed 3 years ago · 1 comment
#1082 · mb_rewards is updated after the on_step callback · SolaWeng · closed 3 years ago · 2 comments
#1081 · Switch to GitHub workflow + faster tests · araffin · closed 3 years ago · 2 comments
#1080 · Question about train.py printouts. · blurLake · closed 3 years ago · 0 comments
#1079 · Cannot see what changes required to support SubprocVecEnv · davidwynter · closed 3 years ago · 5 comments
#1078 · Is it possible to add an auxiliary output to an algorithm? · mitchellostrow · closed 3 years ago · 1 comment
#1077 · [question] testing with normalization env wrapper · ggenesum · closed 3 years ago · 4 comments
#1076 · Reinforcement learning of tic-tac-toe is not possible. · loySoGxj · closed 3 years ago · 2 comments
#1075 · [Question] GAIL : ValueError: Shape must be rank 3 but is rank 2 for 'adversary/concat' ..... · romain-mondelice · closed 3 years ago · 3 comments
#1074 · [Question] Error because of dimension of observation in GAIL · SiweiJu · closed 3 years ago · 0 comments
#1073 · [question] Custom environment conflict with PyCuda · ivandrodri · closed 3 years ago · 5 comments
#1072 · Multiprocessing not working for PPO1 · eflopez1 · closed 3 years ago · 1 comment
#1071 · [feature request] LstmPolicy does not support using net_arch with feature_extraction="cnn" · GiliR4t1qbit · opened 3 years ago · 3 comments
#1070 · Pretrain dimension issue · jmm1-cmd · closed 3 years ago · 3 comments
#1069 · [question] EvalCallback using MPI · davidADSP · opened 3 years ago · 5 comments
#1068 · Is there a way to know the model architecture ? · xunpla123 · closed 3 years ago · 6 comments
#1067 · Copy version.txt to docker container · anj1 · closed 3 years ago · 4 comments
#1066 · docker build fails with FileNotFoundError: 'stable_baselines/version.txt' · anj1 · closed 3 years ago · 3 comments
#1065 · How to set 'episode_lenght' when using 'generate_expert_traj'? · luigicampanaro · closed 3 years ago · 4 comments
#1064 · How to blend the policy networks? · dongfangliu · closed 3 years ago · 1 comment
#1063 · Issue with MlpLstm policy · lorenzoschena · closed 3 years ago · 8 comments
#1062 · ValueError: all input arrays must have the same shape · aliamiri1380 · closed 3 years ago · 5 comments
#1061 · Value estimation update? · yiwc · closed 3 years ago · 2 comments
#1060 · Fixed bug in log probability calculation for Diagonal Gaussian distribution · SVJayanthi · closed 3 years ago · 2 comments
#1059 · Error in calculating log probability in base_class.py · sunshineclt · closed 3 years ago · 2 comments
#1058 · Duplicate exp in action probability calculation · sunshineclt · closed 3 years ago · 2 comments
#1057 · observation_space problem · asd3200asd · closed 3 years ago · 3 comments
#1056 · Resnet Observation Policy Cannot Reload?#question#bug · yiwc · closed 3 years ago · 1 comment
#1055 · ACKTR hangs on atari and works very slow on custom env · mily20001 · opened 3 years ago · 2 comments
#1054 · [question] Using PPO2 on multiple cluster nodes (MPI) · piotti · closed 3 years ago · 3 comments
#1053 · What exactly does the output printed to the standard output mean when verbose=1 in DQN? · nbro · closed 2 years ago · 2 comments
#1052 · Does stable baselines provide an automatic way of computing the sample efficiency of an RL algorithm? · nbro · opened 3 years ago · 1 comment
#1051 · (Question) Change the value of a default variable of the env using make_vec_env · Chaivara · closed 3 years ago · 1 comment
#1050 · [Question] where is self.episode_reward updated in PPO2 · zhaopansong · closed 3 years ago · 2 comments
#1049 · [Question] total_episode_reward_logger is wrongly handled due to the way of storing dones · zhaopansong · closed 3 years ago · 4 comments
#1048 · Sharing data between subprocesses [question] · rporotti · closed 3 years ago · 2 comments