cbfinn / gps

Guided Policy Search
http://rll.berkeley.edu/gps/
Other
597 stars 241 forks

Different definitions of "Loss of Supervised Learning" between the paper and the code #111

Open wuweijia1994 opened 5 years ago

wuweijia1994 commented 5 years ago

In the paper, the loss for the supervised learning step is the KL divergence between the trajectory distributions and the neural network output: (image)

However, in the code, why does it become || label - output ||2? (three images)

Is this just for convenience of implementation?
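
One possible reading, offered as a sketch rather than a confirmed answer: if both the trajectory distribution and the policy are Gaussian, and the policy covariance does not depend on the network output, then the KL divergence depends on the network output only through a precision-weighted squared error between the means. The snippet below checks this numerically for diagonal covariances. All names and numbers in it (`kl_diag_gaussians`, `weighted_squared_error`, the variances, the outputs) are hypothetical and are not taken from the gps code base.

```python
import math

def kl_diag_gaussians(mu_p, var_p, mu_q, var_q):
    """KL(p || q) for two Gaussians with diagonal covariances (lists of floats)."""
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        total += 0.5 * (vp / vq + (mq - mp) ** 2 / vq - 1.0 + math.log(vq / vp))
    return total

def weighted_squared_error(label, output, precision):
    """0.5 * sum_i precision_i * (label_i - output_i)^2."""
    return 0.5 * sum(w * (l - o) ** 2 for l, o, w in zip(label, output, precision))

# Hypothetical numbers: the trajectory supplies the label (mean) and its variance.
label = [0.3, -1.2]
var_traj = [0.5, 2.0]
# Assumption: the policy variance is held fixed while the network mean is trained.
var_policy = [0.1, 0.3]
precision = [1.0 / v for v in var_traj]

# Two candidate network outputs (policy means).
out_a = [0.0, 0.0]
out_b = [0.25, -1.0]

# KL(policy || trajectory) for each candidate output.
kl_a = kl_diag_gaussians(out_a, var_policy, label, var_traj)
kl_b = kl_diag_gaussians(out_b, var_policy, label, var_traj)

# Precision-weighted squared error for each candidate output.
se_a = weighted_squared_error(label, out_a, precision)
se_b = weighted_squared_error(label, out_b, precision)

# The terms that do not involve the mean cancel, so the two gaps agree.
kl_gap = kl_a - kl_b
se_gap = se_a - se_b
print(kl_gap, se_gap)
```

Under these assumptions the two losses differ only by a constant in the network output, so minimizing one minimizes the other. Whether this is the reason for the form of the loss in the code is not established in this thread.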

dujinyu commented 4 years ago

I have many questions I would like to ask you. Could you add me on QQ (1242642280)? @wuweijia1994