rajesh-s opened 3 weeks ago
@rajesh-s most of the inference submissions are done using the Nvidia implementation. In CM we have tried to match the typical batch sizes used in the Nvidia submissions, but we haven't tested all of the systems. In the CM run command you can specify `--batch_size=<N>` to use a custom batch size for the Nvidia implementation.
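For example, an invocation might look like the sketch below. Only `--batch_size` is the point here; the tags, model, device, and scenario values are illustrative assumptions and depend on your benchmark and system:

```bash
# Illustrative only: a typical CM-style MLPerf inference run with the
# batch size overridden. The tag set, model, device, and scenario below
# are assumptions; substitute the ones matching your setup.
cm run script --tags=run-mlperf,inference \
    --model=resnet50 \
    --implementation=nvidia \
    --device=cuda \
    --scenario=Offline \
    --batch_size=64
```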
For the reference implementation I'm not sure if different batch sizes work, as many things are hardwired and no one has done a submission using it.
It would help if the batch sizes were listed at least on the submissions; I could not find them in the results.
The CM run command seems to default to a batch size of 1, as I indicated above, which might be good to note in the documentation. The results vary significantly with batch size, so it may be imperative to document them.
Hi @rajesh-s, sorry for replying late. We have noted the required addition.
@arjunsuresh, would it be apt to include it in a collapsible section, or should we give it as a tip, since there is a chance of users ignoring the collapsible option?
Hi @rajesh-s, we have added the changes in our forks, but they are yet to be merged into the official MLCommons inference repo. You can find the changes here.
I could not find information either in the documentation or in the cm scripts on the batch size that is being used to report the results in the MLCommons database. The default batch size appears to be 1. Is the cm automation specifying a value different from this? If I use the same batch size as the submissions, should I see nearly the same performance?
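For the last question, one way to sanity-check is to compare the achieved throughput in LoadGen's summary log against the published result for the same system and batch size. A minimal sketch, assuming a run that produced the standard `mlperf_log_summary.txt` (the search root below is an assumption; point it at your actual results directory):

```bash
# Illustrative sketch: locate LoadGen summary logs and print the reported
# throughput lines so they can be compared with published numbers.
find . -name mlperf_log_summary.txt | while read -r f; do
    echo "== $f =="
    # Offline runs report "Samples per second"; Server runs report
    # scheduled/completed samples per second.
    grep -iE "samples per second" "$f"
done
```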