ven-kyoshiro closed 5 years ago
・DIAYN on notebook
[x] prepare HalfCheetah of pybullet
[x] look into HalfCheetah preprocessing
[ ] train HalfCheetah with SAC
[x] train Reacher-v0 with SAC
l30 self.rewards = [float(self.potential - potential_old), float(electricity_cost), float(stuck_joint_cost)]
to
l30 self.rewards = [float(self.potential), float(electricity_cost), float(stuck_joint_cost)]
[x] define discriminator
[x] incorporate discriminator
[x] make evaluation code
[x] DIAYN on Reacher
[x] save discriminator_loss
[x] implement a function to save intermediate models during training
[x] fix: discriminator didn't work on GPU
[x] combine default reward function and DIAYN
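
For reference, a minimal sketch of the last item, assuming the DIAYN pseudo-reward log q(z|s) - log p(z) with a uniform skill prior. The function names and the `env_weight` parameter are hypothetical and not taken from the branch; plain NumPy stands in for the actual discriminator network.

```python
import numpy as np


def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    return shifted - np.log(np.sum(np.exp(shifted), axis=-1, keepdims=True))


def diayn_reward(logits, skill, num_skills):
    """Pseudo-reward log q(z|s) - log p(z).

    logits: discriminator outputs for one state, shape (num_skills,)
    skill: index of the skill z the policy is conditioned on
    Assumes a uniform prior p(z) = 1 / num_skills.
    """
    log_q = log_softmax(np.asarray(logits, dtype=np.float64))[skill]
    log_p = -np.log(num_skills)
    return float(log_q - log_p)


def combined_reward(env_reward, logits, skill, num_skills, env_weight=1.0):
    """Weighted environment reward plus the DIAYN pseudo-reward.

    env_weight is a hypothetical knob; the weighting actually used
    in this issue is not stated.
    """
    return env_weight * env_reward + diayn_reward(logits, skill, num_skills)
```

When the discriminator is at chance (uniform output), the DIAYN term is zero and only the weighted environment reward remains.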
Paper
Branch