Dataset question: did you use the 2018 or the 2014 version of the Amazon Book dataset?

GoodTimeNET commented 7 months ago

GuillaumeSalhaGalvan commented 7 months ago

Hello,

In this paper we used the "Amazon Book 2014 - Rating only - Small dataset" dataset.

Best of luck with your project,

Guillaume

GoodTimeNET commented 7 months ago

Thank you for your reply. Your paper has been a great inspiration to me. I wish you all the best with your work.

GoodTimeNET commented 7 months ago

Hello, is it the dataset at this link? https://snap.stanford.edu/data/amazon/productGraph/categoryFiles/ratings_Books.csv

I used the dataset from that link and followed the experiment steps you describe, namely:

To run experiment:
Download datasets and put them in the exp/data directory. For example exp/data/amazon for Amazon book
Change data path and interaction file name in configuration file (for example configs/amazon.json).
Run experiment script (that contains both train and evaluation commands) in scripts directory

When the run finished, I was surprised to find that on Amazon Book NDCG reached 63 and HR reached 86. That is far better than the results in your paper, and I am at a loss as to how to handle it. Is the dataset at the link above not "Amazon Book 2014 - Rating only - Small dataset"? My experimental environment is roughly the same as the one you require.
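One way to check whether a downloaded file matches the dataset used in a paper is to count its users, items and interactions and compare those numbers with the statistics the paper reports. The sketch below does this with the Python standard library only. It assumes the file holds one `user,item,rating,timestamp` tuple per line with no header row; the function name `summarize_ratings` and the sample rows are hypothetical and are not part of the repository.

```python
import csv
import io
from collections import Counter


def summarize_ratings(handle):
    """Count distinct users, distinct items and rows in a ratings CSV.

    Assumes each non-empty row is (user, item, rating, timestamp).
    """
    users, items = Counter(), Counter()
    n_rows = 0
    for row in csv.reader(handle):
        if not row:
            continue  # skip blank lines
        if len(row) != 4:
            raise ValueError(f"expected 4 columns, got {len(row)}: {row!r}")
        user, item, _rating, _timestamp = row
        users[user] += 1
        items[item] += 1
        n_rows += 1
    return {"users": len(users), "items": len(items), "interactions": n_rows}


# Tiny in-memory sample standing in for the real file (values are made up).
sample = io.StringIO(
    "u1,b1,5.0,1388534400\n"
    "u1,b2,4.0,1388620800\n"
    "u2,b1,3.0,1388707200\n"
)
stats = summarize_ratings(sample)
print(stats)
```

For the real file, the same function can be given a handle opened with `open(path, newline="")`, where `path` is wherever the CSV was placed under `exp/data`. A large mismatch with the paper's statistics would suggest a different dataset variant or a different preprocessing step.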

GoodTimeNET commented 7 months ago

The problem has been solved.

GuillaumeSalhaGalvan commented 7 months ago

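Good news, and good luck with your project! If you think it would be useful to future readers, please don't hesitate to share how you solved the problem.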

arnold-em commented 7 months ago

Man!!!!! What can I say?

Reopened #4.

GoodTimeNET commented 7 months ago

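The previous problem was my own mistake. It is rather embarrassing, so I won't share it, hahaha. I have another question, about the LFM-1b dataset: a problem occurs when running the script python data_misc/kcore_interactions.py. The configs/lfm1b.json currently on GitHub is broken; at the very least, I found that its col_names is wrong. But even after I corrected it, running python data_misc/kcore_interactions.py still outputs:

2024-04-11 09:07:01,228:INFO:mojito:Number of users: 0
2024-04-11 09:07:01,228:INFO:mojito:Number of items: 0
2024-04-11 09:07:01,235:INFO:mojito:Number of interactions: 0

So could you please fix configs/lfm1b.json so that it works correctly?

Thank you for your earlier reply; it encouraged me to keep going with this work. I wish you success in your work and good health.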

GuillaumeSalhaGalvan commented 7 months ago

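Without having looked at the specific code, it is hard for me to give a fully accurate answer. However, you can try the following two things:

Check that the data is loaded correctly.
Adjust the "k" value in the "k-core processing"; if it is too large, it may filter out all the data.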
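The second point can be illustrated with a minimal k-core filter. This is a sketch under stated assumptions, not the logic of data_misc/kcore_interactions.py: the function name `kcore_filter` and the toy data are hypothetical, interactions are plain `(user, item)` pairs, and the thresholds reuse the `u_ncore` and `i_ncore` names from the configuration keys mentioned in this thread. Counting interactions before filtering also covers the first point: if the count is already zero at that stage, the loading step (for example the column names) is the more likely cause than the thresholds.

```python
from collections import Counter


def kcore_filter(interactions, u_ncore, i_ncore):
    """Repeatedly drop users with fewer than u_ncore interactions and items
    with fewer than i_ncore interactions until nothing more is removed."""
    current = list(interactions)
    while True:
        user_counts = Counter(u for u, _ in current)
        item_counts = Counter(i for _, i in current)
        kept = [
            (u, i)
            for u, i in current
            if user_counts[u] >= u_ncore and item_counts[i] >= i_ncore
        ]
        if len(kept) == len(current):
            return kept  # stable: every remaining user and item passes
        current = kept


# Toy data: three users who each interacted with three items,
# plus one user with a single interaction.
toy = [(u, i) for u in ("u1", "u2", "u3") for i in ("a", "b", "c")]
toy.append(("u4", "a"))

print(len(toy))                      # interactions before filtering
print(len(kcore_filter(toy, 2, 2)))  # moderate thresholds keep the dense part
print(len(kcore_filter(toy, 4, 4)))  # thresholds above every count remove everything
```

In the toy data no user has four interactions, so thresholds of 4 empty the dataset, which is the failure mode described above.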

GoodTimeNET commented 7 months ago

I have now changed the settings in the configuration file to "u_ncore": 20, "i_ncore": 30, but it still does not work.

GoodTimeNET commented 6 months ago

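This is still about the LFM-1B dataset. I have tried all sorts of approaches, and even applied the source code's processing for the BOOK dataset to LFM-1B. The code runs, but the results obtained on the resulting dataset are wrong. I am also certain that the LFM-1B preprocessing is incomplete, so could you please tidy up the LFM-1b preprocessing code? Or else the LFM-1b loading code, which loads many files that the preprocessing does not cover.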

GoodTimeNET commented 6 months ago

This is the only dataset I still have left to run. Could you please complete the preprocessing for this last dataset, LFM-1B?

bahareharandizade commented 4 months ago

Hello, thanks for sharing this amazing code; I learned a lot by reviewing it. I have a question about lambda_glob. In the paper you define L = lambda (L_short) + (1-lambda) (L_long), but in the code, in the _create_loss() function in models.net, you condition on lambda > 0 and then define the loss as: self.loss = loc_loss + self.lambda_glob * glob_loss

Shouldn't it be:

self.loss = (1 - self.lambda_glob) * loc_loss + self.lambda_glob * glob_loss instead?

In other words, if I only want to use the long-term relevance score and completely ignore the short-term relevance score, how can I change this?

Many thanks for your time and this amazing work
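The difference between the two formulations in the question above can be seen with plain numbers. The sketch below is only an illustration of the arithmetic: `additive_loss` mirrors the expression quoted from the code, `convex_loss` mirrors the proposed alternative, both function names are hypothetical, and the loss values are made up. It makes no claim about which term corresponds to the short-term or long-term score in the repository.

```python
def additive_loss(loc_loss, glob_loss, lambda_glob):
    """Form quoted from the code: loc_loss always has weight 1."""
    return loc_loss + lambda_glob * glob_loss


def convex_loss(loc_loss, glob_loss, lambda_glob):
    """Proposed form: the two weights sum to 1."""
    if not 0.0 <= lambda_glob <= 1.0:
        raise ValueError("lambda_glob must lie in [0, 1]")
    return (1.0 - lambda_glob) * loc_loss + lambda_glob * glob_loss


loc, glob = 2.0, 0.5  # made-up scalar loss values

# With the additive form, loc_loss is still fully present at lambda_glob = 1.
print(additive_loss(loc, glob, 1.0))  # 2.5

# With the convex form, lambda_glob = 1 keeps only glob_loss,
# and lambda_glob = 0 keeps only loc_loss.
print(convex_loss(loc, glob, 1.0))    # 0.5
print(convex_loss(loc, glob, 0.0))    # 2.0
```

Under the additive form no value of lambda_glob removes loc_loss, so dropping one term entirely requires either the convex form or a separate weight on loc_loss.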

deezer / sigir23-mojito
