facebookresearch / dlrm

An implementation of a deep learning recommendation model (DLRM)
MIT License

Saving convertDicts on data_utils.py seems necessary #23

Closed: 2seungeun closed this issue 5 years ago

2seungeun commented 5 years ago

I think the dictionaries used to convert categories need to be saved so that they can be applied to test.txt (kaggle_data). It would also help to have an example of how to use Kaggle's test.txt after training on Kaggle's train.txt, even though the current example uses only train.txt and splits it into train, val, and test datasets.

hjmshi commented 5 years ago

We have not included an implementation for loading test.txt because the test set provided by Kaggle does not include labels. As far as I know, most papers on recommendation and personalization models simply split train.txt to run their experiments.

If you would like to load the test.txt file, you will have to modify the loading code in data_utils.py so that it reads that file with the same dictionaries that were built from train.txt. Note that test.txt contains no labels, so you will also have to remove the part that reads the label from each line.
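As a rough illustration of that suggestion, here is a minimal sketch. It is not the code in data_utils.py. The function names (`build_dicts`, `save_dicts`, `load_dicts`, `encode_test_line`) are hypothetical, and the layout is an assumption: tab-separated lines with 13 integer and 26 categorical fields, preceded by a label in train.txt only. Mapping unseen test categories to -1 and empty integer fields to 0 are also placeholder choices for this sketch.

```python
import pickle

NUM_INT = 13  # assumed number of integer features per line
NUM_CAT = 26  # assumed number of categorical features per line


def build_dicts(train_lines):
    """Build one dict per categorical column: raw string -> dense index."""
    dicts = [{} for _ in range(NUM_CAT)]
    for line in train_lines:
        fields = line.rstrip("\n").split("\t")
        cats = fields[1 + NUM_INT:]  # skip the label and the integer features
        for col, raw in enumerate(cats):
            # Assign the next free index the first time a value is seen.
            dicts[col].setdefault(raw, len(dicts[col]))
    return dicts


def save_dicts(dicts, path):
    """Persist the dictionaries so test.txt can be encoded later."""
    with open(path, "wb") as f:
        pickle.dump(dicts, f)


def load_dicts(path):
    """Load dictionaries previously written by save_dicts."""
    with open(path, "rb") as f:
        return pickle.load(f)


def encode_test_line(line, dicts, unknown=-1):
    """Encode one unlabeled line; there is no label field to skip."""
    fields = line.rstrip("\n").split("\t")
    # Assumption: an empty integer field is treated as 0.
    ints = [int(v) if v else 0 for v in fields[:NUM_INT]]
    raw_cats = fields[NUM_INT:NUM_INT + NUM_CAT]
    # Categories never seen in train.txt get the placeholder index.
    cats = [dicts[col].get(raw, unknown) for col, raw in enumerate(raw_cats)]
    return ints, cats
```

In this sketch, `build_dicts` and `save_dicts` would run once while processing train.txt, and `load_dicts` plus `encode_test_line` would run when reading test.txt. How to treat the placeholder index at inference time (for example, a reserved embedding row) is left open here.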