clvrai / skill-chaining

Adversarial Skill Chaining for Long-Horizon Robot Manipulation via Terminal State Regularization (CoRL 2021)
https://clvrai.com/skill-chaining

Question about the paper #2

Closed Jiayuan-Gu closed 2 years ago

Jiayuan-Gu commented 2 years ago

Hi @youngwoon,

Thanks for the nice work. I have a question about the design of your terminal state regularizer. If the initial set discriminator learns to distinguish the states perfectly, then the terminal state of the current skill will be judged as different from the initial states of the next skill, from the discriminator's perspective. In this case, $R_{TSR}$ will be 0. But isn't this the opposite of what is expected (the terminal and initial states should be matched)? Could you help explain the idea behind it? Thanks.
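To make my reading concrete, here is a minimal sketch of how I understand the regularizer. The class/function names and the choice of using the raw discriminator probability as the reward are my own assumptions for illustration, not the actual code in this repo:

```python
import torch
import torch.nn as nn


class InitSetDiscriminator(nn.Module):
    """Hypothetical discriminator: estimates the probability that a state
    lies in the initial (initiation) set of the next skill."""

    def __init__(self, state_dim, hidden_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, state):
        return torch.sigmoid(self.net(state))


def tsr_reward(discriminator, terminal_state):
    """Terminal state regularization reward for the current skill.

    If the discriminator separates the current skill's terminal states from
    the next skill's initial states perfectly, then D(s_T) -> 0 and this
    reward vanishes, which is the scenario I am asking about.
    """
    with torch.no_grad():
        return discriminator(terminal_state).squeeze(-1)
```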

youngwoon commented 2 years ago

Hi Jiayuan,

I think that's a fair concern: the discriminator can become too good at discriminating between the terminal states of one skill and the initial states of the following skill. This is a well-known issue in GAN training (e.g., the discriminator quickly learns to call generated images fake, yet with careful learning-rate scheduling the generator can still improve and eventually produce realistic images), and it leads to unstable training in many cases. The same applies to our situation.

Our method also uses adversarial training, so it sometimes suffers from training instability. However, with careful tuning, we were able to make it work for long skill chaining.
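For context, here is a minimal sketch of the alternating updates we are talking about, assuming a GAIL-style setup where the discriminator separates the next skill's initial states from the current skill's terminal states and the policy receives the discriminator score as an additional reward. The function names and the PPO/SAC placeholder are illustrative, not the exact code in this repository:

```python
import torch
import torch.nn.functional as F


def discriminator_step(disc, disc_opt, init_states, term_states):
    """One adversarial update: push D(s) -> 1 on the next skill's initial
    states and D(s) -> 0 on the current skill's terminal states.

    `disc` is any nn.Module mapping states to unnormalized logits.
    """
    logits_init = disc(init_states)
    logits_term = disc(term_states)
    loss = F.binary_cross_entropy_with_logits(
        logits_init, torch.ones_like(logits_init)
    ) + F.binary_cross_entropy_with_logits(
        logits_term, torch.zeros_like(logits_term)
    )
    disc_opt.zero_grad()
    loss.backward()
    disc_opt.step()
    return loss.item()


# Alternating training loop (policy update kept abstract):
#   rollouts = collect_rollouts(policy)
#   reward = task_reward + beta * torch.sigmoid(disc(rollouts.terminal_states))
#   policy_step(policy, rollouts, reward)        # e.g., a PPO/SAC update
#   discriminator_step(disc, disc_opt,
#                      next_skill_initial_states, rollouts.terminal_states)
#
# The instability arises because the policy's reward is non-stationary:
# every discriminator step changes the reward landscape the policy optimizes.
```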