Leavingseason / xDeepFM


some questions #4

Open sshzhang opened 6 years ago

sshzhang commented 6 years ago

Can you explain the meaning of the dataset? I am a little confused.

Leavingseason commented 6 years ago

Sorry for the late response. Did you mean the data format? It is a field-wise format, where each feature is written as FieldID:featureID:featureValue. A field is a group of features, such as gender, location, or occupation.
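
For readers new to this format, here is a minimal parsing sketch in Python. It assumes each line starts with a label and is followed by whitespace-separated FieldID:featureID:featureValue triples. The leading label, the sample line, and the function name `parse_line` are assumptions for illustration, not something confirmed in this thread.

```python
from typing import List, Tuple

Triple = Tuple[int, int, float]  # (field_id, feature_id, feature_value)


def parse_line(line: str) -> Tuple[float, List[Triple]]:
    """Parse one line of field-wise data.

    Assumed layout (hypothetical): "<label> f:i:v f:i:v ...".
    """
    tokens = line.split()
    label = float(tokens[0])  # assumption: first token is the label
    triples: List[Triple] = []
    for token in tokens[1:]:
        field_id, feature_id, value = token.split(":")
        triples.append((int(field_id), int(feature_id), float(value)))
    return label, triples


# Made-up example line: three features, one per field.
label, feats = parse_line("1 1:3:1 2:17:1 3:42:0.5")
```

Here field 1 might be gender and field 2 location; several feature IDs can share one field ID.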

sshzhang commented 6 years ago

Thank you very much!

sshzhang commented 6 years ago

After reading the article in detail, I have some more questions. Can you explain how you preprocessed the Criteo dataset for your experiments? I want to run the CIN model on the Criteo dataset, but I don't know how to preprocess it. Another question: is the dataset in this repository artificial? Thank you!

Leavingseason commented 6 years ago

The Criteo dataset is frequently used by research groups. I don't think there are many ways to preprocess it: just transform the numerical values into categorical values (by log 2) and filter out low-frequency categorical values. You can leave an email address and I will send you our scripts. The sample dataset on GitHub is a real-world dataset.
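
The two steps described above could be sketched roughly as follows. This is not the authors' script, only an illustration under assumptions: the bucket names, the handling of missing values and values below 1, the `<rare>` token, and the `min_count` threshold are all hypothetical choices.

```python
import math
from collections import Counter
from typing import List, Optional


def bucketize(value: Optional[float]) -> str:
    """Turn a numerical value into a categorical bucket via floor(log2)."""
    if value is None:
        return "missing"  # assumption: missing values get their own bucket
    if value < 1:
        return "lt1"      # assumption: log2 is only applied to values >= 1
    return str(math.floor(math.log2(value)))


def filter_rare(values: List[str], min_count: int = 10) -> List[str]:
    """Replace categorical values seen fewer than min_count times."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else "<rare>" for v in values]


buckets = [bucketize(v) for v in [8, 1000, 0, None]]
filtered = filter_rare(["a", "a", "b"], min_count=2)
```

Merging rare values into one shared token keeps the feature vocabulary small; the right threshold depends on the dataset size.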

sshzhang commented 6 years ago

Thanks! Here is my email: 1564752861@qq.com

cowry5 commented 6 years ago

Hi, I have the same problem now. Would you email me the scripts? Thanks! Here is my email: 605024106@qq.com

anzhizh commented 6 years ago

I also need the scripts. Also, is the Criteo dataset used in your experiments 45 MB in size? Thank you very much! Here is my email: 1300157732@qq.com

Leavingseason commented 6 years ago

Since the missing Criteo script seems to be blocking a few readers, I have uploaded it to the codebase.

Decalogue commented 5 years ago

Happy Lantern Festival! I also need the scripts, and I would like to know whether TensorFlow 1.12.0 is OK. Thanks! Here is my email: 1044908508@qq.com

wanesta commented 5 years ago

Thank you very much! gaoshuming121@163.com

wenjuanxu commented 5 years ago

Hi, I have the same problem now. Would you email me the scripts? Thanks! Here is my email: 1067887327@qq.com

ccfccl commented 5 years ago

Hello, I would like to know how the MovieLens dataset is processed to meet the required format. Thank you.