hw-du / CBiT

Implementation of the paper "Contrastive Learning with Bidirectional Transformers for Sequential Recommendation".
GNU General Public License v3.0

some problems #6

Closed: arnold-em closed this issue 1 year ago

arnold-em commented 1 year ago

In models\bert, line 14:

    self.out = nn.Linear(self.bert_hidden_units, args.num_items + 1)

Doesn't zero represent padding and the maximum index represent the last item, not the mask? An output of args.num_items + 1 covers ids 0 through args.num_items, but the mask token id is args.num_items + 1. Yet in trainers\bert:

    scores[:, 0] = -999.999
    scores[:, -1] = -999.999  # pad token and mask token should not appear in the logits output

I think this setup may be wrong.
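For reference, a minimal sketch of the token id layout under discussion (num_items, the hidden size, and the variable names here are illustrative, not the repo's exact code):

    import torch.nn as nn

    num_items = 100          # items carry ids 1 .. num_items
    PAD_ID = 0               # padding token
    MASK_ID = num_items + 1  # mask token used for BERT-style masked training

    hidden = 256
    # As in models\bert line 14: the head emits num_items + 1 logits,
    # covering ids 0 .. num_items -- there is no logit for MASK_ID.
    out = nn.Linear(hidden, num_items + 1)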

hw-du commented 1 year ago

We re-ran all the experiments (commenting out scores[:, -1] = -999.999) and got exactly the same results. So if you think this line does not make sense, you can comment it out and still get the same result.

arnold-em commented 1 year ago

Thank you for your reply. I understand that the purpose of those two lines is to prevent the model from selecting the padding token and the mask token. My point is that the largest token id fed to the model is args.num_items + 1, so the final dimension of that Linear layer should be args.num_items + 2: padding is 0, items are 1 through args.num_items, and args.num_items + 1 is the mask token. If I'm wrong, please correct me. Thank you!

Line 14:

    self.out = nn.Linear(self.bert_hidden_units, args.num_items + 1)

> We re-ran all the experiments (commenting out scores[:, -1] = -999.999) and got exactly the same results. So if you think this line does not make sense, you can comment it out and still get the same result.
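To spell out the concern in the comment above, a small sketch (shapes are illustrative): with num_items + 1 logits, column -1 is id num_items, i.e. the last real item, so the second masking line suppresses that item rather than a mask logit.

    import torch

    num_items = 100
    scores = torch.randn(4, num_items + 1)  # logits for ids 0 .. num_items

    scores[:, 0] = -999.999   # suppresses the padding id -- as intended
    scores[:, -1] = -999.999  # column -1 is id num_items (the last item),
                              # not the mask token (id num_items + 1)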


hw-du commented 1 year ago

If you decide to predict a logit for the mask token, then the final dimension of that Linear layer should be args.num_items + 2. If you decide not to predict a logit for the mask token, then args.num_items + 1 is sufficient. Both approaches are feasible.
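A hedged sketch of the two feasible layouts described above (sizes are illustrative):

    import torch.nn as nn

    hidden, num_items = 256, 100

    # Option 1: no logit for the mask token.
    # The head covers ids 0 .. num_items; no mask suppression is needed.
    out_v1 = nn.Linear(hidden, num_items + 1)

    # Option 2: include a logit for the mask token (id num_items + 1).
    # The head covers ids 0 .. num_items + 1; scores[:, 0] and scores[:, -1]
    # then suppress the pad and mask logits at evaluation time.
    out_v2 = nn.Linear(hidden, num_items + 2)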