Open sugarandgugu opened 2 months ago
Thank you for your interest in XRec! The dataset in Section 4.4 is organized based on the frequency of user appearances in the training data. Here are detailed steps to separate the data:
`tst1` through `tst5`. Note that users who appear only in the test or validation datasets, and never in the training data, are considered zero-shot users. We hope this helps clarify your concerns.
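As a rough illustration of the idea above (not the actual XRec preprocessing code), the sketch below counts how often each user appears in the training data, ranks the evaluation users by that count, and slices them into five groups. The function name `split_by_train_frequency`, the equal-size rank-based slicing, and the choice that `tst1` holds the least frequent users are all assumptions made for this example.

```python
from collections import Counter


def split_by_train_frequency(train_users, eval_users, n_groups=5):
    """Group evaluation users by how often they appear in training.

    train_users: iterable of user ids, one entry per training interaction.
    eval_users: iterable of user ids found in the test/validation data.
    Returns (groups, zero_shot): groups maps "tst1".."tstN" to user lists,
    zero_shot lists users that never appear in training.
    """
    train_counts = Counter(train_users)

    # Users never seen in training are treated as zero-shot.
    zero_shot = sorted({u for u in eval_users if u not in train_counts})

    # Rank the remaining users by training frequency (ties broken by id).
    seen = sorted(
        {u for u in eval_users if u in train_counts},
        key=lambda u: (train_counts[u], u),
    )

    # Assumption: equal-size slices by rank, tst1 = least frequent users.
    groups = {f"tst{i + 1}": [] for i in range(n_groups)}
    for rank, user in enumerate(seen):
        idx = rank * n_groups // len(seen)
        groups[f"tst{idx + 1}"].append(user)
    return groups, zero_shot
```

If the real split uses fixed frequency thresholds rather than rank-based slices, only the `idx` computation would need to change.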
Thank you for your reply. If I use the TripAdvisor dataset provided by PEPLER, what steps should I follow to convert it into the data format used in your paper?
TripAdvisor lacks item descriptions, which sets it apart from our datasets. However, you can create descriptions yourself using a process similar to how we construct user descriptions. The approach involves feeding an LLM (e.g. gpt-3.5-turbo) selected reviews that the item has received. The LLM then summarizes these interactions to determine the nature of the item and generates a concise one-sentence description. The same method applies to generating user descriptions. We hope this helps.
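A minimal sketch of that description step might look like the following. The prompt wording, the `max_reviews` cutoff, and the helper names `build_item_prompt` and `describe_item` are hypothetical, and the `llm` argument stands in for whatever function you use to send a prompt to a model such as gpt-3.5-turbo and get text back. This is not the prompt used in the paper.

```python
from typing import Callable, Sequence


def build_item_prompt(reviews: Sequence[str], max_reviews: int = 5) -> str:
    """Build a summarization prompt from a few of the item's reviews."""
    # Assumption: simply take the first few reviews as the "selected" ones.
    selected = list(reviews)[:max_reviews]
    bullet_list = "\n".join(f"- {review.strip()}" for review in selected)
    return (
        "Below are reviews that a single item has received.\n"
        f"{bullet_list}\n"
        "Summarize what kind of item this is in one concise sentence."
    )


def describe_item(
    reviews: Sequence[str],
    llm: Callable[[str], str],
    max_reviews: int = 5,
) -> str:
    """Return a one-sentence item description produced by `llm`.

    `llm` is a caller-supplied function (hypothetical here) that takes a
    prompt string and returns the model's text response.
    """
    prompt = build_item_prompt(reviews, max_reviews)
    return llm(prompt).strip()
```

For user descriptions, the same helpers could be reused by passing the reviews a user has written and adjusting the prompt wording accordingly.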
Congratulations on your work being accepted to EMNLP. How was the dataset in Section 4.4 split? Is specific code or the dataset itself available?