uidilr / gail_ppo_tf
TensorFlow implementation of Generative Adversarial Imitation Learning (GAIL) with discrete action
MIT License · 112 stars · 29 forks
Issues (closed)
#22 · gaes = (gaes - gaes.mean()) / gaes.std() · Joll123 · closed 4 years ago · 1 comment
#21 · PPO implementation and KL divergence of the GAIL paper · mehdimashayekhi · closed 6 years ago · 1 comment
#20 · Roll out of the policy and collecting data, especially reward · mehdimashayekhi · closed 6 years ago · 1 comment
#19 · No render for CartPole-v0 · navuboy · closed 6 years ago · 1 comment
#18 · Is there any way to train the generator with batches from previous rollouts (something like a replay buffer)? · shamanez · closed 6 years ago · 1 comment
#17 · Missing (-1) in the loss function in BC · shamanez · closed 6 years ago · 1 comment
#16 · Is there any way we can encourage the exploration strategy of the agent? · shamanez · closed 6 years ago · 1 comment
#15 · Can you implement WGAN loss as the discriminator loss? · shamanez · closed 6 years ago · 1 comment
#14 · Where do you use the get_grad function? · shamanez · closed 6 years ago · 1 comment
#13 · Visualize the gradients in the discriminator on TensorBoard · shamanez · closed 6 years ago · 4 comments
#12 · Discriminator statistics are not stored on TensorBoard · shamanez · closed 6 years ago · 1 comment
#11 · Application of batch normalization and dropout · shamanez · closed 6 years ago · 1 comment
#10 · Discriminator training with the expert's trajectories? · shamanez · closed 6 years ago · 1 comment
#9 · GAIL training: train the policy net first with behavioural cloning and fine-tune with GAIL? · shamanez · closed 6 years ago · 4 comments
#8 · Your implementation is different from the GAIL paper · Guiliang · closed 6 years ago · 4 comments
#7 · What is the "temp" value used in getting the output of the policy network? · shamanez · closed 6 years ago · 1 comment
#6 · Discriminator optimisation with each [state, action] · shamanez · closed 6 years ago · 6 comments
#5 · Discriminator optimisation in GAIL · shamanez · closed 6 years ago · 1 comment
#4 · About the old and new policy? · shamanez · closed 6 years ago · 1 comment
#3 · About the critic network optimisation? · shamanez · closed 6 years ago · 2 comments
#2 · How did you implement Proximal Policy Gradients in run_ppo.py? · shamanez · closed 6 years ago · 2 comments
#1 · Some issue about the setting of 'stochastic' · LinBornRain · closed 6 years ago · 4 comments
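
Issue #22's title quotes the repo's advantage-normalization line. A common pitfall with that exact expression is division by zero when every advantage in a batch is identical (gaes.std() == 0). A minimal NumPy sketch of a guarded version — the normalize_gaes name and eps parameter are illustrative, not the repo's code:

```python
import numpy as np

def normalize_gaes(gaes, eps=1e-8):
    """Standardize GAE advantages to zero mean, unit std.

    eps guards the denominator against a zero standard deviation,
    e.g. when all advantages in the batch are equal.
    (Illustrative helper; not taken from gail_ppo_tf.)
    """
    gaes = np.asarray(gaes, dtype=np.float64)
    return (gaes - gaes.mean()) / (gaes.std() + eps)
```

With the epsilon in place, a constant batch normalizes to all zeros instead of raising a division-by-zero warning or producing NaNs.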
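
Issue #2 asks how PPO is implemented in run_ppo.py. Without restating the repo's code, the heart of PPO is the clipped surrogate objective: the probability ratio between the new and old policy is clipped to [1 - eps, 1 + eps] so a single update cannot move the policy too far. A generic NumPy sketch, assuming log-probabilities are available (function name and signature are illustrative):

```python
import numpy as np

def ppo_clipped_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Generic PPO clipped surrogate (to be maximized).

    ratio = pi_new(a|s) / pi_old(a|s), computed from log-probabilities
    for numerical stability. (Illustrative sketch, not the repo's code.)
    """
    ratio = np.exp(new_logp - old_logp)
    # Clip the ratio, then take the elementwise minimum of the
    # unclipped and clipped surrogates (pessimistic bound).
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
    return np.mean(np.minimum(ratio * advantages, clipped * advantages))
```

When new_logp == old_logp the ratio is 1 everywhere and the objective reduces to the mean advantage; a ratio far outside [0.8, 1.2] contributes only its clipped value, which is what removes the incentive for overly large policy steps.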