GAIL Training - Train the policy net first with behavioural cloning and fine-tune with GAIL?

uidilr / gail_ppo_tf

TensorFlow implementation of Generative Adversarial Imitation Learning (GAIL) with discrete actions

MIT License

112 stars, 29 forks

GAIL Training - Train the policy net first with behavioural cloning and fine-tune with GAIL? #9

Closed by shamanez 6 years ago

shamanez commented 6 years ago

Hi, can you merge this into the code in an efficient way? Or can you give me some advice on where I should load the pre-trained weights into the policy network?

Next question: What do you think about training GAIL with high-dimensional data? Let's say with images?

uidilr commented 6 years ago

I tried using pre-trained weights once, but training failed. Since I deleted that code, I can only give some advice.

Define a new saver for the trainable variables in the policy net.
Load the pre-trained weights after this line: https://github.com/uidilr/gail_ppo_tf/blob/f8f496f69c2a166e91164d9082d7f9fd5a80b6b5/run_gail.py#L36
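
The two steps above can be sketched roughly as follows. This is only a minimal illustration, assuming TensorFlow 1.x-style graph code (written against `tensorflow.compat.v1` so it can also run under TF 2.x). `build_policy`, `policy_saver`, the scope names `policy` and `discriminator`, and the checkpoint name `bc_policy.ckpt` are hypothetical stand-ins, not code from this repository. It also assumes that the restore has to happen after the global variable initializer has run, since otherwise the initializer would overwrite the loaded weights.

```python
import os
import tempfile

import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()


def build_policy(scope, obs_dim=4, n_actions=2):
    """Hypothetical stand-in for the policy net: one linear layer."""
    with tf.variable_scope(scope):
        obs = tf.placeholder(tf.float32, [None, obs_dim], name="obs")
        w = tf.get_variable("w", [obs_dim, n_actions])
        b = tf.get_variable("b", [n_actions], initializer=tf.zeros_initializer())
        logits = tf.matmul(obs, w) + b
    return obs, logits


def policy_saver(scope):
    """Step 1: a saver that covers only the policy's trainable variables."""
    var_list = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope=scope)
    return tf.train.Saver(var_list=var_list), var_list


def pretrain_and_save(ckpt_path, scope="policy"):
    """Stands in for behavioural cloning; only saves the policy weights."""
    graph = tf.Graph()
    with graph.as_default():
        build_policy(scope)
        saver, var_list = policy_saver(scope)
        with tf.Session(graph=graph) as sess:
            sess.run(tf.global_variables_initializer())
            # ... behavioural cloning updates would run here ...
            saver.save(sess, ckpt_path)
            return sess.run(var_list)


def init_then_restore(ckpt_path, scope="policy"):
    """Step 2: initialise everything, then overwrite the policy weights."""
    graph = tf.Graph()
    with graph.as_default():
        build_policy(scope)
        # Extra variables (e.g. a discriminator) are not in the checkpoint.
        with tf.variable_scope("discriminator"):
            tf.get_variable("w", [4, 1])
        saver, var_list = policy_saver(scope)
        with tf.Session(graph=graph) as sess:
            sess.run(tf.global_variables_initializer())
            saver.restore(sess, ckpt_path)  # must come after the initializer
            return sess.run(var_list)


if __name__ == "__main__":
    ckpt = os.path.join(tempfile.mkdtemp(), "bc_policy.ckpt")
    saved = pretrain_and_save(ckpt)
    restored = init_then_restore(ckpt)
    # True if the restored policy weights match the saved ones.
    print(all(np.array_equal(a, b) for a, b in zip(saved, restored)))
```

Restricting the saver to the policy scope means the checkpoint can be restored into a graph that also contains variables the pre-training run never created, which a default saver over all variables would reject as missing keys.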

> Next question: What do you think about training GAIL with high-dimensional data? Let's say with images?

I think it is difficult to solve. This is all I can say.

shamanez commented 6 years ago

Does "training failed" mean that your code didn't work, or was the problem the training procedure or the accuracy?

uidilr commented 6 years ago

It means that accuracy did not improve.

shamanez commented 6 years ago

Can you explain one thing about the pre-trained GAIL model you provided in the trained models? How long did it take to converge? Also, what are the specs of your machine?