mlcommons / cm4mlops

A collection of portable, reusable and cross-platform automation recipes (CM scripts) to make it easier to build and benchmark AI systems across diverse models, data sets, software and hardware
http://docs.mlcommons.org/cm4mlops/
Apache License 2.0

limit mlperf inference samples using cm #531

Closed Arman5592 closed 2 weeks ago

Arman5592 commented 2 weeks ago

Hello, I hope you are doing well.

I intend to run resnet50 in the server scenario (datacenter) using the script in the docs:

cm run script --tags=run-mlperf,inference,_r4.1-dev \
   --model=resnet50 \
   --implementation=reference \
   --framework=onnxruntime \
   --category=datacenter \
   --scenario=Server \
   --server_target_qps=8 \
   --execution_mode=valid \
   --device=cpu \
   --quiet

I also add `--samples=<some number>` to limit the number of samples and speed up the run, but it has no evident effect.

Is there a way to limit the number of samples when running with cm?

Thanks!

arjunsuresh commented 2 weeks ago

Hi @Arman5592, you can do this by switching to test mode as follows:

--execution_mode=test --test_query_count=100

In `--execution_mode=valid`, LoadGen determines the number of samples from the provided `target_qps`.
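
For example, your original command with those two flags swapped in would look roughly like this (a sketch; the query count of 100 is just illustrative):

```
cm run script --tags=run-mlperf,inference,_r4.1-dev \
   --model=resnet50 \
   --implementation=reference \
   --framework=onnxruntime \
   --category=datacenter \
   --scenario=Server \
   --server_target_qps=8 \
   --execution_mode=test \
   --test_query_count=100 \
   --device=cpu \
   --quiet
```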

Arman5592 commented 2 weeks ago

Thank you very much @arjunsuresh .

Is there any way to reduce the runtime of the Server scenario for a given QPS, e.g. by running it for a shorter duration?

(This is not for an official submission / research manuscript, but a quick comparison of two private machines)

arjunsuresh commented 2 weeks ago

Hi @Arman5592, yes. You can use `--execution_mode=test --test_query_count=100 --server_target_qps=8 --env.CM_MLPERF_MAX_DURATION_TEST=100` if you want to run for 100 seconds. The test query count is redundant here - we will try to remove it soon.
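
Combined with the original command from above, a duration-capped run might look like this (a sketch; the 100-second duration and 100-query count are illustrative):

```
cm run script --tags=run-mlperf,inference,_r4.1-dev \
   --model=resnet50 \
   --implementation=reference \
   --framework=onnxruntime \
   --category=datacenter \
   --scenario=Server \
   --server_target_qps=8 \
   --execution_mode=test \
   --test_query_count=100 \
   --env.CM_MLPERF_MAX_DURATION_TEST=100 \
   --device=cpu \
   --quiet
```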

Arman5592 commented 2 weeks ago

Thank you very much for your help!