Files generation on a new dataset

dongyuanjushi / LightLM


Files generation on a new dataset #1

Closed (HanbingWang2001 closed this issue 10 months ago)

HanbingWang2001 commented 10 months ago

Hi,

Thanks for the great work! I wanted to run the code on another dataset, but it reports that '../data/ml-1m/co_CF_indices/item_deduplicated_c50_100_CF_index.json' is missing.

I guess it is generated by 'indexing/co_CID_generation.py'. However, running 'co_CID_generation.py' also raises the error: FileNotFoundError: [Errno 2] No such file or directory: '../data/ml-1m/user_CF_indices/data.txt'

Could you tell me how to generate these relevant files correctly? Thanks!

dongyuanjushi commented 10 months ago

Thank you for your interest in our work. To run the code on another dataset (e.g., ml-1m), you currently need to run preprocess/sequential_generation.ipynb first; it generates data.txt and the other relevant files under the user_CF_indices directory. After that, you can run indexing/co_CID_generation.py to get the collaborative IDs. I will update the instructions for running on other datasets to make them clearer. Thanks!
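The order of the two steps above can be checked with a small script before launching training. This is only a sketch, not part of the repository: the helper name `next_step` is hypothetical, the paths are the ones quoted in the error messages in this thread, and the collaborative index filename may differ if the indexing script is run with other settings.

```python
from pathlib import Path


def next_step(data_root: str, dataset: str) -> str:
    """Report which step still needs to run for `dataset` (hypothetical helper)."""
    base = Path(data_root) / dataset

    # Produced by preprocess/sequential_generation.ipynb, per the reply above.
    sequential_data = base / "user_CF_indices" / "data.txt"

    # Produced by indexing/co_CID_generation.py. The filename is copied from the
    # error message in this thread and is an assumption for other configurations.
    co_index = base / "co_CF_indices" / "item_deduplicated_c50_100_CF_index.json"

    if not sequential_data.is_file():
        return "run preprocess/sequential_generation.ipynb"
    if not co_index.is_file():
        return "run indexing/co_CID_generation.py"
    return "ready"


if __name__ == "__main__":
    # Relative path as used in the thread; adjust to your working directory.
    print(next_step("../data", "ml-1m"))
```

If the script reports that the notebook still needs to run, running co_CID_generation.py first would presumably fail with the same FileNotFoundError as in the original question.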

HanbingWang2001 commented 10 months ago

Got it, thanks again!