modularml / max

A collection of sample programs, notebooks, and tools which highlight the power of the MAX Platform
https://www.modular.com

Replying to your request ... #148

Closed delhub closed 2 months ago

delhub commented 2 months ago

Responding to: "Hold on a tick... We normally see speedups of roughly 1.20x on PyTorch for roberta on X86_64. Honestly, we would love to hear from you to learn more about the system you're running on!". Let me know if you need anything more than this:

```
python3 run.py -m roberta
Doing some one time setup. This takes 5 minutes or so, depending on the model.
Get a cup of coffee and we'll see you in a minute!

Done! [100%]

Starting inference throughput comparison

----------------------------------------System Info----------------------------------------
CPU: Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz
Arch: X86_64
Clock speed: 2.6000 GHz
Cores: 8

Running with TensorFlow
.......................................................................................... QPS: 4.54

Running with PyTorch
.......................................................................................... QPS: 5.14

Running with MAX Engine
Compiling model. Done!
.......................................................................................... QPS: 6.15

====== Speedup Summary ======

MAX Engine vs TensorFlow: That's about 1.35x faster.
MAX Engine vs PyTorch: That's about 1.20x faster.
```
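For reference, the speedup figures in the summary appear to be simply the MAX Engine QPS divided by each baseline framework's QPS. A minimal sketch of that arithmetic (the `qps` values are transcribed from the run output above; the dict and loop are illustrative, not the benchmark's actual code):

```python
# QPS numbers copied from the benchmark output above.
qps = {"TensorFlow": 4.54, "PyTorch": 5.14, "MAX Engine": 6.15}

# Speedup = MAX Engine throughput / baseline throughput.
for framework in ("TensorFlow", "PyTorch"):
    speedup = qps["MAX Engine"] / qps[framework]
    print(f"MAX Engine vs {framework}: That's about {speedup:.2f}x faster.")
# → MAX Engine vs TensorFlow: That's about 1.35x faster.
# → MAX Engine vs PyTorch: That's about 1.20x faster.
```

So the 1.20x figure for PyTorch matches the "roughly 1.20x ... for roberta on X86_64" that the original message quoted.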

ehsanmok commented 2 months ago

Thanks! Good to hear. Is there any specific question or feedback on this?

delhub commented 2 months ago

Not particularly at this time. Perhaps I was reading too much into it. The statement implied, to me, that something might be wrong or that the performance number could be better, even though my result matched the expected speedup exactly. No worries. You can close this issue.