mlcommons / inference_results_v3.0

This repository contains the results and code for the MLPerf™ Inference v3.0 benchmark.
https://mlcommons.org/en/inference-datacenter-30/
Apache License 2.0

Issues while preprocessing the DLRM dataset #7

Open lvaidya2910 opened 1 year ago

lvaidya2910 commented 1 year ago

I am running into issues preprocessing the Criteo Terabyte dataset for the inference runs on Intel CPU. The README gives the following instructions for preparing the DLRM dataset:

   Create a directory (such as /data/mlperf_data/dlrm/) which contain:
     day_fea_count.npz
     terabyte_processed_test.bin

   About how to get the dataset, please refer to
      https://github.com/facebookresearch/dlrm
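A quick way to confirm that the directory described in the README excerpt is complete is a small existence check. This is only a minimal sketch: the helper name `missing_dlrm_files` is hypothetical, the default path is just the example path from the README, and the check verifies that the two files are present, not that their contents are valid.

```python
from pathlib import Path
import sys

# File names taken from the README excerpt above.
REQUIRED_FILES = ("day_fea_count.npz", "terabyte_processed_test.bin")


def missing_dlrm_files(data_dir):
    """Return the required file names that are absent from data_dir (hypothetical helper)."""
    root = Path(data_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # Assumption: default to the example directory mentioned in the README.
    data_dir = sys.argv[1] if len(sys.argv) > 1 else "/data/mlperf_data/dlrm/"
    missing = missing_dlrm_files(data_dir)
    if missing:
        print("Missing from " + data_dir + ": " + ", ".join(missing))
    else:
        print("All required DLRM files are present in " + data_dir)
```

Running the script with the dataset directory as its only argument lists whichever of the two files has not been produced yet, which may help narrow down the preprocessing step that failed.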

I tried to follow the instructions in the https://github.com/facebookresearch/dlrm repo, but I ran into issues after running this command:

   ./bench/dlrm_s_criteo_terabyte.sh ["--test-freq=10240 --memory-map --data-sub-sample-rate=0.875"]

What are the best commands to preprocess the dataset and get the inference runs started?

nv-ananjappa commented 1 year ago

@rnaidu02 Could you help?

rnaidu02 commented 1 year ago

@lvaidya2910 Please share the error log so we can isolate the step where the failure occurred. Were you able to download the Criteo dataset?