Closed by maz0318 4 years ago
Thanks for your review.
I recently updated my code to fix the mistakes you mentioned. Please check out the latest version.
By the way, in the latest code the torch.mean() function is applied only over the last dimension, which corresponds to the SeqGAN rollouts.
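To make the point about the last dimension concrete, here is a minimal sketch in plain Python rather than the repository's actual code. The `rewards` table and the helper `mean_over_rollouts` are hypothetical names invented for illustration. The sketch assumes one sequence with 3 time steps and 4 rollouts per step, and it mirrors what averaging over only the last axis (as in `torch.mean(x, dim=-1)`) would compute: one Q-value per time step, not one value for the whole sequence.

```python
# Hypothetical discriminator scores (assumed values, not from the repository):
# rewards[t][k] is the score of rollout k started after token t.
rewards = [
    [0.2, 0.4, 0.6, 0.8],  # time step 0, four rollouts
    [0.1, 0.1, 0.3, 0.5],  # time step 1
    [0.9, 0.7, 0.9, 0.7],  # time step 2
]


def mean_over_rollouts(step_rewards):
    """Average over the last (rollout) axis only, keeping one value per step."""
    return [sum(rollouts) / len(rollouts) for rollouts in step_rewards]


# One Q-value estimate per time step; the time axis is preserved.
q_values = mean_over_rollouts(rewards)
print(q_values)
```

Because the time axis is kept, each token still gets its own reward estimate, and only the Monte Carlo noise across rollouts is averaged away.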
OK! Thanks for your reply. I will check it out.
Hello, thanks for sharing your code. I have a question about your SeqGAN implementation: when computing the reward (in utils.rollout), why do you replace every Q-value with the mean Q-value (reward)? I think each Q-value should be multiplied by log(P(y_t|Y_1:Y_{t-1})).
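As a rough illustration of the difference being asked about, the sketch below weights each token's log-probability by its own Q-value and compares that with using a single mean reward for every token. This is plain Python with made-up numbers, not the repository's code. The function `policy_gradient_loss` and its arguments are hypothetical, and `token_probs[t]` stands in for P(y_t|Y_1:Y_{t-1}).

```python
import math


def policy_gradient_loss(token_probs, q_values):
    """Return -sum_t Q_t * log P(y_t | Y_1:Y_{t-1}) for one sequence.

    token_probs: probability the generator assigned to each sampled token.
    q_values:    per-token reward estimates (for example, rollout averages).
    """
    if len(token_probs) != len(q_values):
        raise ValueError("token_probs and q_values must have the same length")
    return -sum(q * math.log(p) for p, q in zip(token_probs, q_values))


# Assumed toy values: two tokens, the first rewarded and the second not.
probs = [0.5, 0.25]
per_step_q = [1.0, 0.0]

# Per-token weighting, as suggested in the question.
loss_per_step = policy_gradient_loss(probs, per_step_q)

# Every token weighted by the same mean reward instead.
mean_q = sum(per_step_q) / len(per_step_q)
loss_mean = policy_gradient_loss(probs, [mean_q] * len(probs))

print(loss_per_step, loss_mean)
```

With these numbers the two losses differ, because the mean reward also pushes on the second token even though its own Q-value is zero.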