Could you please provide the preprocessing procedure for datasets other than ml-1m?

noveens / distill_cf

[ NeurIPS '22 ] Data distillation for recommender systems. Shows equivalent performance with 2-3 orders of magnitude less data.

MIT License

Could you please provide the preprocessing procedure for datasets other than ml-1m? #2

Closed WanliYoung closed 1 year ago

WanliYoung commented 1 year ago

Hello @noveens,

Thank you for your excellent work! It appears that the preprocessing procedure in the repository can only handle the ml-1m dataset. Could you kindly share the code for preprocessing the other datasets as well? Thank you very much!

WanliYoung commented 1 year ago

Furthermore, could you share the rating files of the Amazon Magazine and Douban datasets that you used? I'm concerned that the data files I found may differ from your version, which could lead to inconsistent results. Thank you very much!

noveens commented 1 year ago

Hello @wending0417,

My apologies that this notification got buried!

I'll be happy to share (i) the preprocessing scripts and (ii) the preprocessed data for the datasets you mentioned. I'm busy with a few things, but I'll make sure to share these within 1-2 days.

Do let me know if you have any other questions/concerns!

Best, Noveen

noveens commented 1 year ago

Hey @wending0417,

I added the preprocessed data directly to the GitHub repository, as you requested. Please feel free to let me know if there's anything else I can help with.

I'm closing the issue now but feel free to re-open/comment!

Best, Noveen