Thanks for sharing your great code, which is very helpful to me. It is just that there seems to be a minor issue with the topic diversity regularization term.

In the GSM paper, the objective in the appendix is something that we need to maximize. So, expressed as part of the loss function, the regularization should be

lambda * (var - mean)

which corresponds to the text in the paper:

"During training, the mean angle is encouraged to be larger while the variance is suppressed to be smaller so that all of the topics will be pushed away from each other in the topic semantic space."
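To spell out the sign flip (this is my notation, not the paper's): writing $\bar{\theta}$ for the mean pairwise angle between topics, $\operatorname{var}(\theta)$ for its variance, and $\lambda$ for the regularization weight,

$$
\max \;\bigl(\bar{\theta} - \operatorname{var}(\theta)\bigr)
\quad\Longleftrightarrow\quad
\min \;\lambda\bigl(\operatorname{var}(\theta) - \bar{\theta}\bigr),
$$

so the value returned by the diversity function needs to be negated before it is added to the loss that gets minimized.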
However, it seems that the code forgot to flip the sign of the result obtained from this line
https://github.com/YongfeiYan/Neural-Document-Modeling/blob/763972476f391872eec8de73472cf836f08ee054/models/utils.py#L181
and directly used the result as a penalty term here:
https://github.com/YongfeiYan/Neural-Document-Modeling/blob/763972476f391872eec8de73472cf836f08ee054/models/NTM.py#L50
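For concreteness, here is a minimal sketch of how I would expect the term to enter the loss. This is illustrative only; the function and variable names (`topic_angle_diversity`, `recon_loss`, `kld`, `lam`) are my own, not the ones used in this repo.

```python
import torch

def topic_angle_diversity(topic_emb: torch.Tensor, eps: float = 1e-12) -> torch.Tensor:
    """Mean pairwise angle between topic vectors minus its variance (to be MAXIMIZED)."""
    # topic_emb: (K, H) matrix of K topic embedding vectors (hypothetical name).
    normed = topic_emb / (topic_emb.norm(dim=1, keepdim=True) + eps)
    cosine = (normed @ normed.t()).clamp(-1.0 + eps, 1.0 - eps)
    angles = torch.acos(cosine)
    # Drop the diagonal (each topic's zero angle with itself) before averaging.
    mask = ~torch.eye(angles.size(0), dtype=torch.bool, device=angles.device)
    pairwise = angles[mask]
    return pairwise.mean() - pairwise.var()

# The training loss is minimized, so the diversity objective enters with a flipped sign:
#   loss = recon_loss + kld - lam * topic_angle_diversity(topic_emb)
# which is the same as adding lam * (var - mean) as a penalty.
```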
What do you think?
Looking forward to your reply. Thanks again!