I tried using an L2 reward (ddpg.py, line 102) and disabled the WGAN optimization, but after the same number of iterations the painter is not as good as with the WGAN reward.
Could you explain how you make the L2 reward work?
Hi, as shown in our paper, the L2 reward cannot match the performance of the WGAN reward. Please refer to the supplementary material of the SPIRAL paper for more details: https://arxiv.org/abs/1804.01118
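For anyone comparing the two rewards, here is a minimal sketch of what a per-step L2 reward could look like: the decrease in mean squared error between the canvas and the target after one stroke. This is an illustration under assumptions, not the code at ddpg.py line 102; the function name `l2_reward`, its arguments, and the batch-first array layout are all hypothetical, and it uses NumPy rather than the repository's tensors.

```python
import numpy as np

def l2_reward(canvas_before, canvas_after, target):
    """Hypothetical helper: reward is the drop in per-sample MSE to the target."""
    # Average over every axis except the batch axis (axis 0 is assumed to be the batch).
    axes = tuple(range(1, target.ndim))
    dist_before = ((canvas_before - target) ** 2).mean(axis=axes)
    dist_after = ((canvas_after - target) ** 2).mean(axis=axes)
    # Positive when the stroke moved the canvas closer to the target.
    return dist_before - dist_after

# Usage: a batch of 2 RGB canvases of size 4x4 (shapes chosen for illustration only).
target = np.ones((2, 3, 4, 4))
before = np.zeros((2, 3, 4, 4))
after = np.full((2, 3, 4, 4), 0.5)
reward = l2_reward(before, after, target)
```

A stroke that moves the canvas away from the target yields a negative reward under this definition.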