Closed mikowals closed 2 months ago
@mikowals, thanks for the bug report. Starting with the MAX 24.2 release, we default to the number of P cores on Linux x86. For Ubuntu Docker on Mac, you have to manually set the default core count to the number of performance cores. MAX support for Mac is coming soon, at which point the default core count will match the number of P cores. I am going to close this issue now; please feel free to reopen it if you have further questions.
@goldiegadde, I am trying out MAX on macOS Apple Silicon with the nightly build and I still need to set the performance cores manually. Doing so takes performance from "Oh, darn, that's only 0.64x stock performance" to "That's about 1.06x faster" on roberta.
Bug description
Manually adjusting run_max.py to use
engine.InferenceSession(10)
gave a speedup from 18.28 to 25.82 QPS on roberta and from 29.15 to 54.14 QPS on clip. My machine has 14 cores but only 10 performance cores, so I think that accounts for the speedup. This could be done in run_max.py as I have done, but the better option is for InferenceSession to default to the number of performance cores when no argument is provided.
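For anyone who wants to avoid hard-coding the 10, here is a rough sketch of one way to look up the performance-core count at runtime. It is not how MAX does it internally. The helper name `performance_core_count` is made up for illustration, and it assumes native macOS on Apple Silicon, where `sysctl -n hw.perflevel0.physicalcpu` reports the P-core count. Everywhere else (including inside a Linux Docker container, which cannot see core types this way) it just falls back to `os.cpu_count()`.

```python
import os
import platform
import subprocess


def performance_core_count() -> int:
    """Best-effort P-core count; falls back to all logical cores.

    Hypothetical helper, not part of MAX.
    """
    fallback = os.cpu_count() or 1
    if platform.system() != "Darwin":
        # Assumption: only native macOS exposes the perflevel sysctl keys.
        return fallback
    try:
        out = subprocess.run(
            ["sysctl", "-n", "hw.perflevel0.physicalcpu"],
            capture_output=True,
            text=True,
            check=True,
        ).stdout.strip()
        count = int(out)
    except (OSError, subprocess.CalledProcessError, ValueError):
        # Key missing (e.g. Intel Mac) or unexpected output.
        return fallback
    return count if count > 0 else fallback


if __name__ == "__main__":
    print(performance_core_count())
```

The result could then be passed where run_max.py currently has the literal, i.e. `engine.InferenceSession(performance_core_count())`, assuming the positional argument is the thread count as in the call above.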
Steps to reproduce
As above.
System information