mlcommons / inference

Reference implementations of MLPerf™ inference benchmarks
https://mlcommons.org/en/groups/inference
Apache License 2.0

Error "unrecognized arguments: rocm" when running MLPerf inference on Ubuntu with ROCm #1729

Open jerryzhaoc opened 2 weeks ago

jerryzhaoc commented 2 weeks ago

The command used to run MLPerf inference for the resnet50 model on Ubuntu with ROCm is below:

    cm run script --tags=run-mlperf,inference \
        --model=resnet50 \
        --implementation=reference \
        --framework=tensorflow \
        --category=edge \
        --scenario=Offline \
        --execution-mode=valid \
        --device=rocm \
        --quiet

The error message and log information are below:

CM script::benchmark-program/run.sh

Run Directory: /disk1/jerry.zhao/CM/repos/local/cache/5cdd269f823046d5/inference/vision/classification_and_detection

CMD: ./run_local.sh onnxruntime resnet50 rocm --scenario Offline --mlperf_conf '/disk1/jerry.zhao/CM/repos/local/cache/5cdd269f823046d5/inference/mlperf.conf' --threads 16 --user_conf '/disk1/jerry.zhao/CM/repos/mlcommons@cm4mlops/script/generate-mlperf-inference-user-conf/tmp/a40cae1d70c649feac756b3543fc8808.conf' --use_preprocessed_dataset --cache_dir /disk1/jerry.zhao/CM/repos/local/cache/3f75b8874c5d4ed1 --dataset-list /disk1/jerry.zhao/CM/repos/local/cache/66248d197b4c4232/data/val.txt 2>&1 | tee /disk1/jerry.zhao/CM/valid_results/yz_adm5-reference-rocm-onnxruntime-vdefault-default_config/resnet50/offline/performance/run_1/console.out

     ! cd /disk1/jerry.zhao/CM
     ! call /disk1/jerry.zhao/CM/repos/mlcommons@cm4mlops/script/benchmark-program/run-ubuntu.sh from tmp-run.sh

python3 python/main.py --profile resnet50-onnxruntime --mlperf_conf ../../mlperf.conf --model "/disk1/jerry.zhao/CM/repos/local/cache/71e2a9b6a8504033/resnet50_v1.onnx" --dataset-path /disk1/jerry.zhao/CM/repos/local/cache/3f75b8874c5d4ed1 --output "/disk1/jerry.zhao/CM/valid_results/yz_adm5-reference-rocm-onnxruntime-vdefault-default_config/resnet50/offline/performance/run_1" rocm --scenario Offline --mlperf_conf /disk1/jerry.zhao/CM/repos/local/cache/5cdd269f823046d5/inference/mlperf.conf --threads 16 --user_conf /disk1/jerry.zhao/CM/repos/mlcommons@cm4mlops/script/generate-mlperf-inference-user-conf/tmp/a40cae1d70c649feac756b3543fc8808.conf --use_preprocessed_dataset --cache_dir /disk1/jerry.zhao/CM/repos/local/cache/3f75b8874c5d4ed1 --dataset-list /disk1/jerry.zhao/CM/repos/local/cache/66248d197b4c4232/data/val.txt

usage: main.py [-h] [--dataset {imagenet,imagenet_mobilenet,imagenet_pytorch,coco-300,coco-300-pt,openimages-300-retinanet,openimages-800-retinanet,openimages-1200-retinanet,openimages-800-retinanet-onnx,coco-1200,coco-1200-onnx,coco-1200-pt,coco-1200-tf}] --dataset-path DATASET_PATH [--dataset-list DATASET_LIST] [--data-format {NCHW,NHWC}] [--profile {defaults,resnet50-tf,resnet50-pytorch,resnet50-onnxruntime,resnet50-ncnn,mobilenet-tf,mobilenet-onnxruntime,ssd-mobilenet-tf,ssd-mobilenet-pytorch,ssd-mobilenet-onnxruntime,ssd-resnet34-tf,ssd-resnet34-pytorch,ssd-resnet34-onnxruntime,ssd-resnet34-onnxruntime-tf,retinanet-pytorch,retinanet-onnxruntime}] [--scenario SCENARIO] [--max-batchsize MAX_BATCHSIZE] --model MODEL [--output OUTPUT] [--inputs INPUTS] [--outputs OUTPUTS] [--backend BACKEND] [--model-name MODEL_NAME] [--threads THREADS] [--qps QPS] [--cache CACHE] [--cache_dir CACHE_DIR] [--preprocessed_dir PREPROCESSED_DIR] [--use_preprocessed_dataset] [--accuracy] [--find-peak-performance] [--debug] [--mlperf_conf MLPERF_CONF] [--user_conf USER_CONF] [--audit_conf AUDIT_CONF] [--time TIME] [--count COUNT] [--performance-sample-count PERFORMANCE_SAMPLE_COUNT] [--max-latency MAX_LATENCY] [--samples-per-query SAMPLES_PER_QUERY]

main.py: error: unrecognized arguments: rocm

     ! call "postprocess" from /disk1/jerry.zhao/CM/repos/mlcommons@cm4mlops/script/benchmark-program/customize.py
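The key line is main.py: error: unrecognized arguments: rocm. The device value rocm reaches python/main.py as a bare positional token (it appears in the generated command right after --output "..."), but, as the usage message shows, main.py only defines optional flags and no positional parameters, so argparse rejects the leftover token. Below is a minimal sketch of that behaviour using a hypothetical options-only parser, not the real argument list from main.py:

    # Minimal repro of the argparse failure above; an illustrative
    # options-only parser, not the actual one from python/main.py.
    import argparse

    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--profile")
    parser.add_argument("--model")
    parser.add_argument("--scenario")
    # main.py defines many more optional flags, but no positional arguments.

    # A stray positional token such as "rocm" is left over after parsing,
    # so argparse exits with "main.py: error: unrecognized arguments: rocm".
    parser.parse_args(["--profile", "resnet50-onnxruntime",
                       "--model", "resnet50_v1.onnx",
                       "rocm",
                       "--scenario", "Offline"])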

arjunsuresh commented 1 week ago

Hopefully this PR will fix it. Unfortunately, we don't have access to an AMD GPU to test it.
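Until the fix lands, a quick local check of whether the installed onnxruntime build can use ROCm at all may save a full benchmark run. This is only an illustrative sketch, not part of the PR referenced above; it assumes a ROCm-enabled onnxruntime build (which exposes ROCMExecutionProvider) and uses a placeholder path in place of the cached resnet50_v1.onnx from the log:

    # Hypothetical diagnostic, independent of the MLPerf harness: check that
    # the installed onnxruntime actually exposes the ROCm execution provider.
    import onnxruntime as ort

    available = ort.get_available_providers()
    print("Available providers:", available)

    # "ROCMExecutionProvider" only appears in ROCm-enabled onnxruntime builds;
    # a stock CPU or CUDA wheel will not list it.
    if "ROCMExecutionProvider" in available:
        # Placeholder path; point it at the cached resnet50_v1.onnx model.
        sess = ort.InferenceSession("resnet50_v1.onnx",
                                    providers=["ROCMExecutionProvider",
                                               "CPUExecutionProvider"])
        print("Session is running on:", sess.get_providers())
    else:
        print("No ROCm execution provider in this onnxruntime build.")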