Open quenio opened 1 month ago
Thanks for reporting! I can't reproduce it on GitHub Codespaces (with the default 2 cores), which you seem to be using. Here's what I get after running python3 -m pip install -r requirements.txt, using max 24.3.0 (9882e19d), Modular version 24.3.0-9882e19d-release:
@ehsanmok ➜ /workspaces/max/examples/performance-showcase (main) $ python3 run.py -m roberta
Doing some one time setup. This takes 5 minutes or so, depending on the model.
Get a cup of coffee and we'll see you in a minute!
Done! [100%]
Starting inference throughput comparison
----------------------------------------System Info----------------------------------------
CPU: Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
Arch: X86_64
Clock speed: 2.8000 GHz
Cores: 2
Running with PyTorch
config.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.20k/1.20k [00:00<00:00, 6.79MB/s]
pytorch_model.bin: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 499M/499M [00:01<00:00, 326MB/s]
.......................................................................................... QPS: 4.81
Running with MAX Engine
Compiling model...
Done!
.......................................................................................... QPS: 5.49
====== Speedup Summary ======
MAX Engine vs PyTorch: That's about 1.14x faster.
I suggest trying again, perhaps on another Linux machine. Also note that a good speedup requires a more powerful machine with more cores. More details here.
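For reference, the 1.14x figure in the summary above is consistent with a plain ratio of the two QPS numbers from the log. The snippet below is only a sketch of that arithmetic, not the actual code in run.py; the `speedup` helper is a hypothetical name used for illustration. It also prints the core count Python sees, which is worth checking before comparing results across machines.

```python
import os


def speedup(baseline_qps: float, candidate_qps: float) -> float:
    """Return how many times faster the candidate is than the baseline.

    Hypothetical helper for illustration; run.py may compute this differently.
    """
    if baseline_qps <= 0:
        raise ValueError("baseline QPS must be positive")
    return candidate_qps / baseline_qps


# QPS values copied from the log above.
pytorch_qps = 4.81
max_qps = 5.49

ratio = speedup(pytorch_qps, max_qps)
print(f"MAX Engine vs PyTorch: about {ratio:.2f}x faster")

# Number of logical cores visible to Python (may be None on some platforms).
print(f"Logical cores: {os.cpu_count()}")
```

With only 2 cores, a ratio this close to 1 is in line with the note above that larger speedups need a machine with more cores.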
Bug description
When running the performance-showcase example, I got the following after model compilation:
Steps to reproduce
Running: python3 run.py -m roberta
System information