yandexdataschool / Practical_RL

A course in reinforcement learning in the wild
The Unlicense

Can I get an answer to my confusion about week07_seq2seq/practice_torch? #499

Open lancescrazy opened 2 years ago

lancescrazy commented 2 years ago

I am deeply confused about logp_sample and entropy in the function scst_objective_on_batch in week07_seq2seq/practice_torch, and there is no reference solution to check against. I have recently been trying to reproduce seq2seq with RL as in the week07 task, but I am stuck at these two places. Many thanks to all the teachers in the course group. I hope someone can help clear up my confusion.
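For readers stuck at the same two places, here is a minimal sketch of what these quantities usually mean in a policy-gradient setup. It is not the course's reference solution. It assumes logp_sample is the log-probability of the token actually sampled at each step and entropy is H = -sum(p * log p) over the vocabulary at each step. The helper names and the `all_logp` nested-list layout are hypothetical, and plain Python lists stand in for tensors.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over one list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def logp_of_sampled(all_logp, sampled_ids):
    """Pick log pi(a_t | s_t) for the token actually sampled at each step.

    all_logp:    [batch][time][vocab] log-probabilities (hypothetical layout)
    sampled_ids: [batch][time] sampled token ids
    returns:     [batch][time]
    """
    return [[step[tok] for step, tok in zip(seq_logp, seq_ids)]
            for seq_logp, seq_ids in zip(all_logp, sampled_ids)]

def entropy_per_step(all_logp):
    """H = -sum_a p(a) * log p(a), where p = exp(log p). Returns [batch][time]."""
    return [[-sum(math.exp(lp) * lp for lp in step) for step in seq_logp]
            for seq_logp in all_logp]

# Toy example: 1 sequence, 2 time steps, vocabulary of 3 tokens.
all_logp = [[log_softmax([0.0, 0.0, 0.0]), log_softmax([2.0, 0.0, 0.0])]]
sampled_ids = [[1, 0]]

logp_sample = logp_of_sampled(all_logp, sampled_ids)  # one value per step
entropy = entropy_per_step(all_logp)                  # one value per step
```

With PyTorch tensors, the same selection is commonly written with torch.gather along the vocabulary axis, and the entropy as a sum of exp(logp) * logp over the last axis with a minus sign. Both results keep the [batch, time] shape, so they can be masked and averaged afterwards.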