tsinghua-fib-lab / CLSR

The official implementation of "Disentangling Long and Short-Term Interests for Recommendation" (WWW '22)
MIT License

Differences between the code and the paper #16

Closed zhuty16 closed 1 year ago

zhuty16 commented 1 year ago

Hello,

Thank you for providing the code. I wonder if there are some differences between the code and the description in the paper? For example, you use user_short_embedding as the initial state of the interest_evolve GRU and also add a discrepancy loss between the long- and short-term user embeddings, but neither is mentioned in the paper. Moreover, the short-term query in the paper is q_s^{u,t}, whereas in the code you use the concatenation of short_term_intention and the target_item_embedding as the short-term query.

Thank you!

DavyMorgan commented 1 year ago

Hello,

Thanks for your attention to our work. A few details are missing from the paper due to the space limit.

First, as noted in the Remark of Section 3.2.4 in the paper, adding a discrepancy loss is a traditional way to achieve disentanglement. However, it does not fit our case of disentangling long- and short-term user interests, since the two aspects can overlap with each other to some extent. In our code, we implement such a discrepancy loss to investigate the effect of this traditional approach. In experiments with different values of discrepancy_loss_weight, we find that the discrepancy loss brings no benefit: setting discrepancy_loss_weight to a very small value of 0.01 performs roughly the same as setting it to 0, while increasing it to 0.1 causes a performance drop. You can set discrepancy_loss_weight to other values to investigate the effect further.
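For readers following along, a minimal NumPy sketch of what such a discrepancy loss might look like. This is a hypothetical illustration, not the repository's implementation: the actual distance measure may differ, and `weight` here stands in for the discrepancy_loss_weight discussed above.

```python
import numpy as np

def discrepancy_loss(user_long, user_short, weight=0.01):
    """Penalize similarity between long- and short-term user embeddings.

    Cosine similarity is one common choice of discrepancy measure
    (hypothetical here); minimizing this term pushes the two embeddings
    apart, which is the traditional disentanglement approach discussed
    in the thread.
    """
    def normalize(x):
        # Row-wise L2 normalization, guarded against zero vectors.
        return x / (np.linalg.norm(x, axis=-1, keepdims=True) + 1e-8)

    cos_sim = np.sum(normalize(user_long) * normalize(user_short), axis=-1)
    return weight * np.mean(cos_sim)
```

With weight=0, the term vanishes entirely, which matches the baseline setting mentioned above.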

Second, the initial state of the interest_evolve GRU and the construction of the short-term query are not described in the paper because they are not directly related to the main idea of self-supervised disentanglement. Please refer to the provided code; we may update the arXiv version of the paper to include these details.
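The two details in question can be sketched as follows. This is a simplified NumPy illustration under assumed shapes, with random stand-ins for learned parameters; the names user_short_embedding, short_term_intention, and target_item_embedding follow the thread, while the GRU weights and the embedding size `d` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # embedding size (assumed for illustration)

# Random stand-ins for learned GRU parameters.
Wz, Uz = rng.normal(size=(d, d)), rng.normal(size=(d, d))
Wr, Ur = rng.normal(size=(d, d)), rng.normal(size=(d, d))
Wh, Uh = rng.normal(size=(d, d)), rng.normal(size=(d, d))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(h, x):
    """One standard GRU update: h is the hidden state, x the input."""
    z = sigmoid(x @ Wz + h @ Uz)          # update gate
    r = sigmoid(x @ Wr + h @ Ur)          # reset gate
    h_tilde = np.tanh(x @ Wh + (r * h) @ Uh)
    return (1 - z) * h + z * h_tilde

# Detail 1: the interest_evolve GRU starts from user_short_embedding
# rather than a zero initial state.
user_short_embedding = rng.normal(size=(d,))
item_sequence = rng.normal(size=(5, d))   # recent behavior embeddings
h = user_short_embedding
for x in item_sequence:
    h = gru_step(h, x)
short_term_intention = h

# Detail 2: the short-term query is the concatenation of the evolved
# intention and the target item embedding, rather than q_s^{u,t} alone.
target_item_embedding = rng.normal(size=(d,))
short_term_query = np.concatenate([short_term_intention, target_item_embedding])
```

Note that the concatenated query has dimension 2d, so any attention layer consuming it would need projection weights sized accordingly.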

Thank you again!