Hi torchdrug team, thank you for the awesome project! I am interested in trying to reproduce GCPN by following your tutorial. However, I found that the reward design in your code is different from that of GCPN in the original paper. The reward in the original paper includes both an adversarial loss and a property score, but I only see the property score used as the reward in your code:
```python
if task == "plogp":
    plogp = metrics.penalized_logP(graph)
    metric["Penalized logP"] = plogp.mean()
    metric["Penalized logP (max)"] = plogp.max()
    self.update_best_result(graph, plogp, "Penalized logP")
    reward += (plogp / self.reward_temperature).exp()
```
Did I miss something? Looking forward to your reply!
Hi, basically you didn't miss anything; you are right. We removed the adversarial loss in our implementation to simplify the model, and you are encouraged to customize your own reward.
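For reference, here is a minimal sketch of what a custom reward could look like if you keep the penalized logP term and add a GAN-style adversarial term, similar in spirit to the original GCPN paper. The `GraphDiscriminator` class, `combined_reward` function, and `adv_weight` below are illustrative and not part of torchdrug; it also assumes the encoder follows the usual torchdrug convention of exposing `output_dim` and returning a dict with a `"graph_feature"` entry, and that the graphs carry node features.

```python
import torch
from torch import nn
from torchdrug import metrics


class GraphDiscriminator(nn.Module):
    """Hypothetical GAN-style discriminator that scores how "real" a molecule looks.

    `encoder` is assumed to be a torchdrug graph representation model (e.g. an RGCN)
    whose forward(graph, input) returns a dict containing "graph_feature".
    """

    def __init__(self, encoder):
        super().__init__()
        self.encoder = encoder
        self.classifier = nn.Linear(encoder.output_dim, 1)

    def forward(self, graph):
        # Encode each graph into a single feature vector, then predict a real/fake logit
        feature = self.encoder(graph, graph.node_feature.float())["graph_feature"]
        return self.classifier(feature).squeeze(-1)


def combined_reward(graph, discriminator, reward_temperature=1.0, adv_weight=0.5):
    """Property reward as in the tutorial, plus an adversarial term (a sketch)."""
    # Property term, with the same exponential scaling as the existing code
    plogp = metrics.penalized_logP(graph)
    reward = (plogp / reward_temperature).exp()

    # Adversarial term: higher reward when the discriminator believes the
    # generated molecule comes from the real data distribution
    with torch.no_grad():
        real_prob = torch.sigmoid(discriminator(graph))
    return reward + adv_weight * real_prob
```

To recover the full adversarial setup from the paper, you would also train the discriminator on real molecules from your dataset in alternation with the policy update; that training loop is omitted here.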