arghosh / AKT

MIT License

statics data: some users occur more than once in the same dataset, or occur in both training and validation/testing data #9

Open may248110 opened 2 years ago

may248110 commented 2 years ago

Examples:

  1. Users 976 and 909 occur twice in statics_train1.csv.
  2. Users 1024 and 874 occur in both statics_valid1.csv and statics_train1.csv.
  3. Users 864 and 357 occur in both statics_valid1.csv and statics_test1.csv.

The paper states that the train/validation/test split is made by user. If so, none of the cases above should happen, right?
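
For anyone who wants to reproduce this kind of check, here is a minimal sketch in Python. It assumes the user IDs of each split have already been read into plain lists; the parsing of the statics_*.csv files is deliberately left out because their layout is not described in this thread. The function names `duplicated_users` and `shared_users` are hypothetical helpers, and the lists are toy data built to mirror the three cases above, not the actual file contents.

```python
from collections import Counter
from itertools import combinations


def duplicated_users(user_ids):
    """Return the sorted user IDs that appear more than once in one split."""
    counts = Counter(user_ids)
    return sorted(user for user, n in counts.items() if n > 1)


def shared_users(splits):
    """Return {(split_a, split_b): sorted shared IDs} for each overlapping pair."""
    overlaps = {}
    for (name_a, ids_a), (name_b, ids_b) in combinations(splits.items(), 2):
        common = set(ids_a) & set(ids_b)
        if common:
            overlaps[(name_a, name_b)] = sorted(common)
    return overlaps


# Toy data (an assumption for illustration, not read from the real files).
splits = {
    "train": [976, 909, 976, 909, 1024, 874],
    "valid": [1024, 874, 864, 357],
    "test": [864, 357],
}

# Users repeated inside a single split.
dups = {name: duplicated_users(ids) for name, ids in splits.items()}

# Users shared between any two splits.
overlaps = shared_users(splits)

print(dups)
print(overlaps)
```

If the split really is by user, both checks should come back empty on the real data: no split should contain a repeated ID, and no pair of splits should share one.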