How to obtain the Instruments.index codebook files? - Githubissues

RUCAIBox / LC-Rec

[ICDE'24] Code of "Adapting Large Language Models by Integrating Collaborative Semantics for Recommendation."


How to obtain the Instruments.index codebook files? #14

Open mitao-cat opened 2 months ago

mitao-cat commented 2 months ago

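Hello authors, how were the .index files (the item encoding files) in your Google Drive obtained? I tried to reproduce the item encoding by following the experimental settings section of the paper and the structure of this repository, using these steps: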

Set plm_checkpoint on line 115 of amazon_text_emb.py to the Hugging Face model huggyllama/llama-7B and run it, which generates dataset.emb-llama-td.npy
Run run.sh, which generates the RQ-VAE checkpoints (including best_loss_model.pth and best_collision_model.pth)
Set line 45 of generate_indices.py to best_loss_model.pth and run it, which generates dataset.index

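Unless otherwise noted, all of the steps above use the default parameters. The distribution of the item codes generated this way differs from the distribution of the codes you provide, and the final recommendation performance is lower. Which of the steps above should I change to obtain a codebook similar to the one on Google Drive? Thank you very much!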
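A minimal sketch of one way to put numbers on the distribution gap. It assumes each .index file is JSON that maps an item ID to a list of code tokens, one per RQ-VAE level. That format is an assumption here, not something stated in this thread. Because code IDs from two separately trained codebooks are arbitrary labels, the sketch compares label-independent statistics (number of codes used and usage entropy per level) rather than matching tokens one by one. The helper names are hypothetical.

```python
import json
import math
from collections import Counter


def load_index(path):
    """Load an index file, ASSUMED to be JSON: {item_id: [token_level0, token_level1, ...]}."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def level_usage(index):
    """Return one Counter per code level, counting how often each token is used."""
    if not index:
        return []
    num_levels = max(len(codes) for codes in index.values())
    usage = [Counter() for _ in range(num_levels)]
    for codes in index.values():
        for level, token in enumerate(codes):
            usage[level][token] += 1
    return usage


def level_stats(index):
    """Per level: number of distinct codes used and the entropy (in bits) of their usage."""
    stats = []
    for counter in level_usage(index):
        total = sum(counter.values())
        entropy = -sum((c / total) * math.log2(c / total) for c in counter.values())
        stats.append({"used_codes": len(counter), "entropy_bits": entropy})
    return stats
```

Calling `level_stats(load_index(path))` on the reproduced file and on the downloaded one, then comparing the two lists level by level, shows whether the reproduced codebook uses fewer codes or uses them less evenly.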

zhengbw0324 commented 2 months ago

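@mitao-cat Hello! In our experiments we did not strictly use best_loss_model.pth or best_collision_model.pth. Instead, we chose one of the last few checkpoints for index generation, taking both loss and collision into account. In addition, the current RQ-VAE implementation has changed somewhat from the original version (for example, it now uses an lr_scheduler during training), so the resulting codebook indeed cannot be exactly the same as the one on Google Drive.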
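The reply does not say how loss and collision were weighed against each other, so the following is only a sketch of one possible selection rule, not the authors' procedure. It assumes the collision rate is the fraction of items that do not receive a distinct full code sequence, and that loss and collision have been recorded for each saved checkpoint. The function names, the record layout, the `last_k` window, and the equal weighting are all hypothetical.

```python
def collision_rate(codes):
    """ASSUMED definition: 1 - (number of distinct code sequences) / (number of items)."""
    sequences = [tuple(c) for c in codes]
    return 1.0 - len(set(sequences)) / len(sequences)


def pick_checkpoint(records, last_k=5, loss_weight=0.5):
    """Pick one of the last `last_k` checkpoints by a weighted sum of
    min-max normalized loss and collision rate (lower is better).

    `records` is a hypothetical list of dicts ordered by epoch, each with
    the keys 'name', 'loss', and 'collision'.
    """
    candidates = records[-last_k:]

    def normalize(values):
        lo, hi = min(values), max(values)
        return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]

    losses = normalize([r["loss"] for r in candidates])
    collisions = normalize([r["collision"] for r in candidates])
    scores = [
        loss_weight * l + (1.0 - loss_weight) * c
        for l, c in zip(losses, collisions)
    ]
    best = min(range(len(candidates)), key=scores.__getitem__)
    return candidates[best]["name"]
```

Changing `loss_weight` shifts the trade-off. A manual inspection of the last few checkpoints, as the reply describes, may well land on a different choice than this rule.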