Regarding equation 5 in the paper

chengkai-liu / Mamba4Rec

[RelKD'24] Mamba4Rec: Towards Efficient Sequential Recommendation with Selective State Space Models

https://arxiv.org/abs/2403.03900

MIT License

Regarding equation 5 in the paper #10

Open AlwaysFHao opened 6 months ago

AlwaysFHao commented 6 months ago

Hello, regarding the prediction layer in Equation 5 of the Mamba4Rec paper, $\hat{y} = Softmax(Linear(h)) \in \mathbb{R}^{| \mathcal{V} |}$, I think there is a problem. In RecBole, the prediction scores are not computed the way they usually are in NLP. The probability distribution should come from the similarity (inner product) between the final output embedding and the original embeddings of all candidate items, rather than from a linear layer that performs multi-class classification on the final output embedding. The Mamba4Rec source code also follows this approach, as can be seen in the `calculate_loss` and `full_sort_predict` methods.

AlwaysFHao commented 6 months ago

The actual prediction layer formula should be: $\hat{y} = Softmax(h \cdot e) \in \mathbb{R}^{| \mathcal{V} |}$, where $e$ is the original embedding of the item corresponding to $\hat{y}$.

chengkai-liu commented 6 months ago

Thank you for correcting our mistakes. To be accurate, it should be $\hat y=\text{Softmax}\left(h \boldsymbol{E}^\top \right) \in \mathbb{R}^{|\mathcal{V}|}$, where $h \in \mathbb{R}^D$ is the last item embedding from the Mamba layer. $\hat y \in \mathbb{R}^{|\mathcal{V}|}$ represents the probability distribution over the next item in the item set $\mathcal{V}$.
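To make the corrected formula concrete, here is a minimal, dependency-free sketch of $\text{Softmax}(h \boldsymbol{E}^\top)$. It is not the repository's code: the function name `predict_next_item` and the toy values for `h` and `E` are hypothetical, and it assumes $\boldsymbol{E}$ holds one $D$-dimensional embedding row per item in $\mathcal{V}$.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_next_item(h, E):
    """Return softmax(h E^T): one probability per item in the item set.

    h: list of D floats (last hidden state of the sequence).
    E: list of |V| rows, each a list of D floats (item embedding table).
    """
    # Each logit is the inner product of h with one item embedding row.
    logits = [sum(h_d * e_d for h_d, e_d in zip(h, row)) for row in E]
    return softmax(logits)


# Toy values (hypothetical): D = 2, |V| = 3.
E = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
h = [2.0, 1.0]
y_hat = predict_next_item(h, E)

# The item whose embedding has the largest inner product with h wins.
best_item = max(range(len(y_hat)), key=lambda i: y_hat[i])
```

Unlike a separate `Linear` layer, this scoring has no output weights of its own: the logits reuse the item embedding table, so the output size always matches the number of rows in `E`.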

AlwaysFHao commented 6 months ago

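> Thank you for correcting our mistakes. To be accurate, it should be $\hat y=\text{Softmax}\left(h \boldsymbol{E}^\top \right) \in \mathbb{R}^{|\mathcal{V}|}$, where $h \in \mathbb{R}^D$ is the last item embedding from the Mamba layer. $\hat y \in \mathbb{R}^{|\mathcal{V}|}$ represents the probability distribution over the next item in the item set $\mathcal{V}$.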

Yes, to be precise, the formula should take the form you give now. In addition, I have reimplemented Mamba4Rec outside the RecBole environment, but on the Beauty dataset there is still a performance gap compared to your paper: hit@10 is 0.0813, but ndcg@10 is only 0.404. Could you please run my code and try to reproduce the Mamba4Rec results on Beauty with your actual parameters? If it works, please let me know.

https://github.com/AlwaysFHao/Mamba4RecWithoutRecbole

chengkai-liu commented 5 months ago

The minor gap in performance is acceptable and within expected variations. It's often challenging to reproduce results exactly across different environments and implementations, even with the same parameters.