mlcommons / inference

Reference implementations of MLPerf™ inference benchmarks
https://mlcommons.org/en/groups/inference
Apache License 2.0

CM error: no scripts were found with above tags and variations #1773

Closed kraza8 closed 1 month ago

kraza8 commented 1 month ago

Hi, I am fairly new to MLPerf and am trying to run the MLPerf Inference v4.0 NVIDIA-optimized implementations on a GH200.

Steps:

python3 -m pip install cmind -U
cm test core
cm pull repo cknowledge@cm4mlops --branch=mlperf-inference
cm rm cache -f

cm run script --tags=run-mlperf,inference,_performance-only,_short --division=closed --category=datacenter --device=cuda --model=gptj-99 --implementation=nvidia --backend=tensorrt --scenario=Offline --execution_mode=test --power=no --adr.python.version_min=3.8 --clean --compliance=no --quiet --time

Keep getting this error:

CM error: no scripts were found with above tags and variations

variation tags ['run_harness', 'cuda', 'gptj-99', 'offline', 'tensorrt', 'gpu_memory.96'] are not matching for the found script app-mlperf-inference-nvidia with variations dictkeys(['v4.0', 'v3.1', 'cpu', 'cuda', 'tensorrt', 'resnet50', 'retinanet', 'sdxl', 'bert', 'bert-99', 'bert-99.9', '3d-unet', '3d-unet-99', '3d-unet-99.9', 'rnnt', 'dlrm', 'dlrm-v2-99', 'dlrm-v2-99.9', 'gptj', 'gptj,build', 'gptj_,build_engine', 'gptj-99', 'gptj-99.9', 'batch_size.#', 'dla_batch_size.#', 'use_triton', 'prebuild', 'build', 'maxq', 'maxn', 'preprocess-data', 'preprocess_data', 'download-model', 'download_model', 'calibrate', 'build-engine', 'build_engine', 'singlestream', 'multistream', 'offline', 'server', 'run-harness', 'run_harness', 'build_engine_options.#', 'gpu_memory.16', 'gpu_memory.24', 'gpu_memory.8', 'gpu_memory.32', 'gpu_memory.40', 'gpu_memory.48', 'gpu_memory.80', 'singlestream,resnet50', 'multistream,resnet50', 'singlestream,runharness', 'gptj,run_harness', 'gpumemory.16,gptj,offline,run_harness', 'gpumemory.24,gptj,offline,run_harness', 'gpumemory.32,gptj,offline,run_harness', 'gpumemory.48,gptj,offline,run_harness', 'gpumemory.40,gptj,offline,run_harness', 'gpumemory.80,gptj,offline,run_harness', 'gpu_memory.16,sdxl,offline,run_harness', 'gpu_memory.24,sdxl,offline,run_harness', 'gpu_memory.32,sdxl,offline,run_harness', 'gpu_memory.80,sdxl,offline,run_harness', 'gpu_memory.96,sdxl,offline,run_harness', 'gpu_memory.96,sdxl,server,run_harness', 'gpu_memory.80,sdxl,server,run_harness', 'gpu_memory.140,sdxl,offline,run_harness', 'gpumemory.16,bert,offline,run_harness', 'gpumemory.24,bert,offline,run_harness', 'gpumemory.32,bert,offline,run_harness', 'gpumemory.48,bert,offline,run_harness', 'gpumemory.40,bert,offline,run_harness', 'gpumemory.80,bert,server,run_harness', 'gpu_memory.16,resnet50,offline,run_harness', 'gpu_memory.40,resnet50,offline,run_harness', 'gpu_memory.24,resnet50,offline,run_harness', 'gpu_memory.32,resnet50,offline,run_harness', 
'gpu_memory.48,resnet50,offline,run_harness', 'gpu_memory.80,resnet50,offline,run_harness', 'num-gpus.#', 'num-gpus.1', 'resnet50,server,run_harness', 'resnet50,multistream,run_harness,num-gpus.1', 'resnet50,multistream,run_harness,num-gpus.2', 'retinanet,multistream,run_harness', 'gpu_memory.16,retinanet,offline,run_harness', 'gpu_memory.40,retinanet,offline,run_harness', 'gpu_memory.32,retinanet,offline,run_harness', 'gpu_memory.48,retinanet,offline,run_harness', 'gpu_memory.24,retinanet,offline,run_harness', 'gpu_memory.80,retinanet,offline,run_harness', 'retinanet,server,run_harness', 'gpu_memory.16,rnnt,offline,run_harness', 'gpu_memory.40,rnnt,offline,run_harness', 'gpu_memory.24,rnnt,offline,run_harness', 'gpu_memory.32,rnnt,offline,run_harness', 'gpu_memory.48,rnnt,offline,run_harness', 'gpu_memory.80,rnnt,offline,run_harness', 'gpumemory.16,3d-unet,offline,run_harness', 'gpumemory.40,3d-unet,offline,run_harness', 'gpumemory.24,3d-unet,offline,run_harness', 'gpumemory.80,3d-unet,offline,run_harness', 'gpumemory.32,3d-unet,offline,run_harness', 'gpumemory.48,3d-unet,offline,run_harness', 'gpumemory.16,dlrm,offline,run_harness', 'gpumemory.40,dlrm,offline,run_harness', 'gpumemory.24,dlrm,offline,run_harness', 'gpumemory.32,dlrm,offline,run_harness', 'gpumemory.48,dlrm,offline,run_harness', 'gpumemory.80,dlrm,offline,run_harness', 'orin', 'orin,rnnt,singlestream,run_harness', 'orin,sdxl,offline,run_harness', 'rtx_4090', 'rtx_4090,sdxl,offline,run_harness', 'rtx_4090,sdxl,server,run_harness', 'rtx_4090,resnet50,offline,run_harness', 'rtx_4090,resnet50,server,run_harness', 'rtx_4090,retinanet,offline,run_harness', 'rtx_4090,retinanet,server,run_harness', 'rtx4090,bert,offline,run_harness', 'rtx4090,bert,server,run_harness', 'rtx4090,3d-unet,offline,run_harness', 'rtx4090,3d-unet,server,run_harness', 'rtx_4090,rnnt,offline,run_harness', 'rtx_4090,rnnt,server,run_harness', 'rtx4090,gptj,offline,run_harness', 'rtx4090,gptj,server,run_harness', 
'rtx4090,dlrm,offline,run_harness', 'a6000', 'rtx_a6000,resnet50,offline,run_harness', 'rtx_a6000,resnet50,server,run_harness', 'rtx_a6000,retinanet,offline,run_harness', 'rtx_a6000,retinanet,server,run_harness', 'rtxa6000,bert,offline,run_harness', 'rtxa6000,bert,server,run_harness', 'rtxa6000,3d-unet,offline,run_harness', 'rtxa6000,3d-unet,server,run_harness', 'rtx_a6000,rnnt,offline,run_harness', 'rtx_a6000,rnnt,server,run_harness', 'rtxa6000,dlrm,offline,run_harness', 'rtx_6000_ada', 'rtx_6000_ada,resnet50,offline,run_harness', 'rtx_6000_ada,resnet50,server,run_harness', 'rtx_6000_ada,retinanet,offline,run_harness', 'rtx_6000_ada,retinanet,server,run_harness', 'rtx_6000ada,bert,offline,run_harness', 'rtx_6000ada,bert,server,run_harness', 'rtx_6000ada,3d-unet,offline,run_harness', 'rtx_6000ada,3d-unet,server,run_harness', 'rtx_6000_ada,rnnt,offline,run_harness', 'rtx_6000_ada,rnnt,server,run_harness', 'rtx_6000ada,dlrm,offline,run_harness', 'l4', 'l4,sdxl,offline,run_harness', 'l4,sdxl,offline,run_harness,num-gpu.8', 'l4,sdxl,server,run_harness,num-gpu.1', 'l4,sdxl,server,run_harness,num-gpu.8', 'l4,resnet50', 'l4,resnet50,offline,run_harness', 'l4,resnet50,server,run_harness', 'l4,retinanet,offline,run_harness', 'l4,retinanet,server,runharness', 'l4,bert,offline,runharness', 'l4,bert,server,runharness', 'l4,3d-unet,offline,run_harness', 'l4,rnnt,offline,run_harness', 'l4,rnnt,server,runharness', 'l4,dlrm,offline,run_harness', 't4', 't4,resnet50', 't4,resnet50,offline,run_harness', 't4,resnet50,server,run_harness', 't4,retinanet,offline,run_harness', 't4,retinanet,server,runharness', 't4,bert,offline,runharness', 't4,bert,server,runharness', 't4,3d-unet,offline,run_harness', 't4,rnnt,offline,run_harness', 't4,rnnt,server,runharness', 't4,dlrm,offline,run_harness', 'pcie', 'sxm', 'custom', 'a100', 'a100,sxm,resnet50,offline,run_harness', 'a100,sxm,retinanet,offline,runharness', 'a100,sxm,bert,offline,runharness', 'a100,sxm,3d-unet,offline,run_harness', 
'a100,sxm,rnnt,offline,runharness', 'a100,sxm,dlrm,offline,run_harness'])
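The dump above shows the cause of the failure: the script declares a gpu_memory.96 combination for sdxl, but the gptj offline run_harness combinations stop at gpumemory.80, so the 96 GB GH200 request has nothing to match. A minimal sketch of that check (the two declared entries are copied from the error output above; the matching logic itself is only illustrative, not CM's actual implementation):

```shell
# Excerpt of combined variations declared by app-mlperf-inference-nvidia,
# taken from the error output: sdxl has a 96 GB entry, gptj tops out at 80 GB.
declared="gpu_memory.96,sdxl,offline,run_harness
gpumemory.80,gptj,offline,run_harness"

# What the run request resolves to on a 96 GB GPU:
requested="gpu_memory.96,gptj,offline,run_harness"

# Sketched matching: fail when no declared combination equals the request.
if printf '%s\n' "$declared" | grep -qxF "$requested"; then
    echo "variation matched"
else
    echo "CM error: no scripts were found with above tags and variations"
fi
```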

Thanks for the help.

arjunsuresh commented 1 month ago

Hi @kraza8, can you please retry this now? We have just added support for 96 GB NVIDIA GPUs.

Please run cm pull repo to get the updated changes, and add --docker_cache=no to the cm run command.
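Putting the two suggestions together, the retry would look like the following (the flags are exactly those from the original command in this issue, plus --docker_cache=no; nothing else is changed):

```
cm pull repo cknowledge@cm4mlops --branch=mlperf-inference
cm run script --tags=run-mlperf,inference,_performance-only,_short \
    --division=closed --category=datacenter --device=cuda --model=gptj-99 \
    --implementation=nvidia --backend=tensorrt --scenario=Offline \
    --execution_mode=test --power=no --adr.python.version_min=3.8 \
    --clean --compliance=no --quiet --time --docker_cache=no
```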

kraza8 commented 1 month ago

Thanks Arjun, will give that a try.