I found that dlrm_main.py supports two datasets, and criteo_kaggle is the smaller of the two.
    parser.add_argument(
        "--dataset_name",
        type=str,
        default="criteo_1t",
        help="dataset for experiment, current support criteo_1tb, criteo_kaggle",
    )
I downloaded the criteo_kaggle dataset from https://www.kaggle.com/datasets/mrkmakr/criteo-dataset, but it only contains two raw files: train.txt and test.txt. I am not sure how to preprocess them so that the DLRM model can run on them. Could someone give me a hint?
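For context, here is how I currently understand the raw format. This is an assumption on my part, not something I found in dlrm_main.py: each line of train.txt is tab-separated, with a click label, 13 integer features, and 26 hex-encoded categorical features, and missing values show up as empty fields. The helper name `parse_train_line` and the choice to map missing values to 0 are mine, just for illustration. This only reads one raw line; it is not the preprocessing that dlrm_main.py expects.

```python
from typing import List, Tuple

NUM_DENSE = 13   # assumed number of integer features per line
NUM_SPARSE = 26  # assumed number of categorical features per line


def parse_train_line(line: str) -> Tuple[int, List[int], List[int]]:
    """Split one raw train.txt line into (label, dense, sparse).

    Hypothetical helper: assumes the tab-separated layout described above.
    """
    fields = line.rstrip("\n").split("\t")
    expected = 1 + NUM_DENSE + NUM_SPARSE
    if len(fields) != expected:
        raise ValueError(f"expected {expected} fields, got {len(fields)}")
    label = int(fields[0])
    # Missing values are empty strings; map them to 0 (an arbitrary choice).
    dense = [int(x) if x else 0 for x in fields[1 : 1 + NUM_DENSE]]
    # Categorical values are hex strings; convert them to integers.
    sparse = [int(x, 16) if x else 0 for x in fields[1 + NUM_DENSE :]]
    return label, dense, sparse
```

If this reading of the format is right, what I am missing is the step that turns these parsed values into whatever input files dlrm_main.py loads for criteo_kaggle.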