dawenl / vae_cf

Variational autoencoders for collaborative filtering
Apache License 2.0

confused about split data #17

Open junkangwu opened 3 years ago

junkangwu commented 3 years ago

Hi, nice work on variational autoencoders for recommendation. However, I am confused about how the data is split. In preprocessing.py:

unique_uid = user_activity.index

unique_uid is the index of user_activity rather than the actual user IDs (unique_uid['userId']). Because of the earlier filtering step, some userId values have been removed. If we then use the index of user_activity instead of the actual uid, some valid userId values at the end will not be considered. Is this an error, or is there some other intention behind it?
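To make the concern concrete, here is a minimal sketch with toy data. It assumes user_activity is a DataFrame with a 'userId' column and a default integer index; the toy ratings, the threshold of 3, and the names index_based and id_based are made up for illustration and are not taken from the repository.

```python
import pandas as pd

# Hypothetical toy ratings; user IDs are deliberately non-contiguous.
raw_data = pd.DataFrame({
    "userId":  [10, 10, 10, 42, 42, 77, 77, 77, 99],
    "movieId": [1, 2, 3, 1, 4, 2, 3, 5, 1],
})

# Assumption: user_activity has a 'userId' column and a default integer index.
user_activity = raw_data.groupby("userId").size().reset_index(name="size")

# Keep users with at least 3 interactions (hypothetical threshold).
user_activity = user_activity[user_activity["size"] >= 3]

# Row labels of the surviving rows vs. the actual user IDs.
index_based = user_activity.index.to_numpy()
id_based = user_activity["userId"].to_numpy()

# Selecting ratings by each candidate "unique_uid" gives different results.
rows_by_index = raw_data[raw_data["userId"].isin(index_based)]
rows_by_id = raw_data[raw_data["userId"].isin(id_based)]

print(index_based, id_based, len(rows_by_index), len(rows_by_id))
```

Under this assumption the index holds row positions, not user IDs, so selecting by it picks the wrong users or none at all. If user_activity were instead a Series returned directly by groupby("userId").size(), its index would already be the user IDs and the two would coincide.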

Looking forward to your reply. Thanks!