HKUDS / XRec

[EMNLP'2024] "XRec: Large Language Models for Explainable Recommendation"
http://arxiv.org/abs/2406.02377
Apache License 2.0
101 stars 9 forks

How can I add all of a user's tags to the user data? #4

Closed qiaohanqing closed 3 weeks ago

qiaohanqing commented 2 months ago

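While preparing the data, I noticed that the user data only contains information about the shops a user has purchased from. How can I inject basic user attributes and other tags I have already compiled into the user data?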

Martin-qyma commented 2 months ago

Thank you for your interest in XRec! Here are two methods to enhance user information:

  1. You can modify the generation/user_profile/user_system_prompt.txt file to provide the LLM with additional user data, which will help refine the user profile. This method enables the model to integrate new information seamlessly and improves the user profile without the need for extensive changes to the underlying architecture.
  2. Alternatively, you can add your external data directly to user_message in explainer/utils/data_handler.py and then fine-tune XRec. The advantage here is the flexibility to inject highly relevant, contextual information that the generated user profile might not cover.
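
As a rough illustration of the second method, the sketch below appends externally collected tags to a profile string before it is used as the user message. The helper name build_user_message, the tag keys, and the output wording are all hypothetical; the actual structure of user_message in explainer/utils/data_handler.py may differ, so treat this as a starting point rather than the repository's code.

```python
def build_user_message(user_profile: str, extra_tags: dict) -> str:
    """Append external tags to a user profile string (hypothetical helper, not XRec's API)."""
    if not extra_tags:
        # Nothing to add: return the profile unchanged.
        return user_profile
    # Sort keys so the same tags always produce the same prompt text.
    tag_text = "; ".join(f"{key}: {value}" for key, value in sorted(extra_tags.items()))
    return f"{user_profile}\nAdditional user attributes: {tag_text}"


# Example inputs (made up for illustration).
profile = "Prefers casual restaurants and coffee shops."
tags = {"city": "Shanghai", "age_group": "25-34"}

message = build_user_message(profile, tags)
print(message)
```

Keeping the tags in a fixed order and a plain "key: value" form makes the prompts reproducible across runs, which tends to help when comparing fine-tuning results.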

We hope you find this helpful.

qiaohanqing commented 2 months ago

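Thanks. While running the code, I found references to JSON files such as ./source/business.json and ./business_id_mapping.json, but these files are not in the repository folders. Is there somewhere I can find them?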

Martin-qyma commented 2 months ago

The files mentioned are part of the preprocessing step, which you can skip since the processed files are already provided. However, if you prefer to handle this step yourself, you can follow the instructions in process/README.md. This guide will walk you through generating the necessary files.

Specifically, business.json and review.json are the only source files you need to prepare. These can be directly downloaded from any open-source dataset. The business_id_mapping.json and other JSON files are generated from these two source files. Detailed instructions can be found in process/README.md.
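
To give a sense of how a file like business_id_mapping.json could be derived from a source file, here is a minimal sketch. It assumes the source is in JSON Lines form (one JSON object per line) with a "business_id" field, and that the mapping assigns consecutive integer indices in order of first appearance. Both are assumptions; the function build_id_mapping is hypothetical, and the authoritative procedure is the one in process/README.md.

```python
import json
from typing import Dict, Iterable


def build_id_mapping(lines: Iterable[str], id_field: str = "business_id") -> Dict[str, int]:
    """Map each raw ID to a consecutive integer index (hypothetical helper)."""
    mapping: Dict[str, int] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        raw_id = record[id_field]
        # Assign the next free index the first time an ID is seen.
        if raw_id not in mapping:
            mapping[raw_id] = len(mapping)
    return mapping


# Made-up sample records standing in for lines read from a source file.
sample_lines = [
    '{"business_id": "b_x", "name": "Cafe A"}',
    '{"business_id": "b_y", "name": "Diner B"}',
    '{"business_id": "b_x", "name": "Cafe A"}',
]

mapping = build_id_mapping(sample_lines)
print(json.dumps(mapping))
```

In practice the lines would come from an open file handle, and the resulting dictionary could be written out with json.dump.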

We hope this clarifies your concern.