Data preprocess - Githubissues

HKUDS / RLMRec

[WWW'2024] "RLMRec: Representation Learning with Large Language Models for Recommendation"

Apache License 2.0

343 stars, 44 forks

Hi 👋!

Thanks for your interest in RLMRec! Because the pre-processing code is fairly complex and spread across multiple files, a straightforward overview of the basic workflow is probably more helpful than the code itself. The steps are outlined below:

Score Filtering: Begin by filtering out low-score interactions (implemented using a for loop).
User Sampling (discussed in Issue 9): Next, uniformly sample a ratio of users, then remove any items left with no interactions once the unsampled users are gone. This helps reduce the dataset size (implemented with boolean vectors).
K-Core Filtering: Finally, apply k-core filtering using the NetworkX library.
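
For readers who want something concrete, the three steps above can be sketched roughly as follows. This is not the actual RLMRec pre-processing code, only a minimal illustration: the function names, the `(user, item, score)` tuple layout, and the `min_score`, `ratio`, `seed`, and `k` parameters are all assumptions made for the example.

```python
import numpy as np
import networkx as nx


def filter_by_score(interactions, min_score):
    """Step 1: keep (user, item) pairs whose score is at least min_score."""
    kept = []
    for user, item, score in interactions:  # plain for loop, as described
        if score >= min_score:
            kept.append((user, item))
    return kept


def sample_users(pairs, ratio, seed=0):
    """Step 2: uniformly sample a ratio of users with a boolean mask."""
    users = sorted({u for u, _ in pairs})
    if not users:
        return []
    rng = np.random.default_rng(seed)
    n_keep = max(1, int(round(len(users) * ratio)))
    keep_mask = np.zeros(len(users), dtype=bool)
    keep_mask[rng.choice(len(users), size=n_keep, replace=False)] = True
    index = {u: i for i, u in enumerate(users)}
    # Items with no remaining interactions disappear on their own,
    # because only the surviving (user, item) pairs are returned.
    return [(u, i) for u, i in pairs if keep_mask[index[u]]]


def k_core_filter(pairs, k):
    """Step 3: keep pairs whose user and item both lie in the k-core."""
    graph = nx.Graph()
    # Tag nodes as user ("u") or item ("i") so raw IDs cannot collide.
    graph.add_edges_from((("u", u), ("i", i)) for u, i in pairs)
    core = nx.k_core(graph, k)
    result = []
    for a, b in core.edges():
        # Edge endpoints of an undirected graph may come in either order.
        user, item = (a, b) if a[0] == "u" else (b, a)
        result.append((user[1], item[1]))
    return sorted(result)
```

Chaining the three functions in the order listed (score filter, then user sampling, then k-core) reproduces the workflow; the thresholds to use for a given dataset are not stated in this thread.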

I hope the above answer is helpful to you :)

Best regards, Xubin

Data preprocess #14