hidet-org / hidet

An open-source efficient deep learning framework/compiler, written in python.
https://hidet.org
Apache License 2.0

[Tracking Issue] Benchmarks #154

Open yaoyaoding opened 1 year ago

yaoyaoding commented 1 year ago

This issue tracks the performance benchmarks of hidet against the other dynamo backends in PyTorch.

The benchmark scripts that produce these reports are located at hidet/scripts/bench.

yaoyaoding commented 1 year ago

2023-06-24

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.223 | 1.069 | 1.107 | 0.734 |
| resnet50 | f16[1,3,224,224] | 1.408 | 1.092 | 1.112 | 0.482 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.025 | 1.863 | 1.698 | 1.164 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.951 | 0.712 | 0.798 | 0.894 |

Time: 2.18 hours
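To make the columns easier to compare, here is a minimal sketch that turns one day's numbers into ratios relative to eager. It assumes the columns are latencies (lower is better), which the report itself does not state, and the helper name `speedup_over_eager` is hypothetical rather than part of the scripts in hidet/scripts/bench. The values are copied from the 2023-06-24 report.

```python
# Column order as it appears in the report.
BACKENDS = ["eager", "reduce-overhead", "max-autotune", "hidet(2)"]

# Values copied from the 2023-06-24 report.
# Assumption: these are latencies, so a smaller number is better.
ROWS = {
    ("resnet50", "f32[1,3,224,224]"): [1.223, 1.069, 1.107, 0.734],
    ("resnet50", "f16[1,3,224,224]"): [1.408, 1.092, 1.112, 0.482],
    ("model/bert-base-uncased", "f32, bs=1, seq=128"): [2.025, 1.863, 1.698, 1.164],
    ("model/bert-base-uncased", "f16, bs=1, seq=128"): [1.951, 0.712, 0.798, 0.894],
}


def speedup_over_eager(latencies):
    """Divide the eager value (first column) by each column's value."""
    eager = latencies[0]
    return [eager / value for value in latencies]


if __name__ == "__main__":
    for (model, inputs), latencies in ROWS.items():
        ratios = speedup_over_eager(latencies)
        cells = ", ".join(f"{name}={ratio:.2f}x" for name, ratio in zip(BACKENDS, ratios))
        print(f"{model} [{inputs}]: {cells}")
```

A ratio above 1 means the backend reported a smaller number than eager for that row; the eager column is always 1.00x by construction.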

yaoyaoding commented 1 year ago

2023-06-25

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.247 | 1.066 | 1.077 | 0.729 |
| resnet50 | f16[1,3,224,224] | 1.442 | 1.096 | 1.109 | 0.476 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 1.958 | 1.867 | 1.700 | 1.163 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.927 | 0.713 | 0.799 | 0.891 |

Time: 2.19 hours

yaoyaoding commented 1 year ago

2023-06-26

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.245 | 1.103 | 1.081 | 0.738 |
| resnet50 | f16[1,3,224,224] | 1.410 | 1.079 | 1.078 | 0.477 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.054 | 1.865 | 1.704 | 1.160 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.932 | 0.713 | 0.800 | 0.895 |

Time: 2.19 hours

yaoyaoding commented 1 year ago

2023-06-27

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.230 | 1.088 | 1.078 | 0.731 |
| resnet50 | f16[1,3,224,224] | 1.414 | 1.093 | 1.103 | 0.475 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.045 | 1.867 | 1.697 | 1.165 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.904 | 0.711 | 0.797 | 0.891 |

Time: 2.18 hours

yaoyaoding commented 1 year ago

2023-06-28

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.211 | 1.062 | 1.092 | 0.732 |
| resnet50 | f16[1,3,224,224] | 1.420 | 1.095 | 1.115 | 0.477 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.057 | 1.863 | 1.682 | 1.289 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.934 | 0.711 | 0.798 | 0.896 |

Time: 2.16 hours

yaoyaoding commented 1 year ago

2023-06-29

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.219 | 1.117 | 1.087 | 0.732 |
| resnet50 | f16[1,3,224,224] | 1.421 | 1.077 | 1.095 | 0.526 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.053 | 1.862 | 1.696 | 1.165 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.909 | 0.712 | 0.799 | 0.886 |

Time: 2.24 hours

yaoyaoding commented 1 year ago

2023-06-30

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.221 | 1.055 | 1.076 | 0.735 |
| resnet50 | f16[1,3,224,224] | 1.428 | 1.084 | 1.096 | 0.526 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.052 | 1.870 | 1.705 | 1.162 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.894 | 0.712 | 0.799 | 0.883 |

Time: 2.25 hours

yaoyaoding commented 1 year ago

2023-07-01

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.227 | 1.044 | 1.084 | 0.737 |
| resnet50 | f16[1,3,224,224] | 1.407 | 1.082 | 1.101 | 0.528 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.054 | 1.867 | 1.699 | 1.285 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.913 | 0.713 | 0.799 | 0.885 |

Time: 2.26 hours

yaoyaoding commented 1 year ago

2023-07-02

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.212 | 1.057 | 1.101 | 0.739 |
| resnet50 | f16[1,3,224,224] | 1.432 | 1.089 | 1.110 | 0.523 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.054 | 1.863 | 1.707 | 1.277 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.891 | 0.712 | 0.798 | 0.880 |

Time: 2.27 hours

yaoyaoding commented 1 year ago

2023-07-03

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.227 | 1.076 | 1.110 | 0.734 |
| resnet50 | f16[1,3,224,224] | 1.401 | 1.080 | 1.087 | 0.527 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.051 | 1.773 | 1.688 | 1.164 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.921 | 0.715 | 0.799 | 0.881 |

Time: 2.26 hours

yaoyaoding commented 1 year ago

2023-07-04

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.236 | 1.065 | 1.088 | 0.736 |
| resnet50 | f16[1,3,224,224] | 1.398 | 1.092 | 1.085 | 0.524 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 1.978 | 1.789 | 1.701 | nan |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.899 | 0.711 | 0.798 | nan |

Time: 1.97 hours
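The 2023-07-04 report has `nan` in the hidet(2) column for both bert rows, so anything that post-processes these reports needs to tolerate missing values. A minimal sketch, assuming each row is available as whitespace-separated text with the four backend values last; `parse_row` and `best_backend` are hypothetical helpers, not functions from the benchmark scripts, and "best" assumes lower is better.

```python
import math

BACKENDS = ["eager", "reduce-overhead", "max-autotune", "hidet(2)"]


def parse_row(line, n_backends=len(BACKENDS)):
    """Split a report row into (label, values); the last n_backends fields are numbers.

    float("nan") is valid in Python, so a literal "nan" field parses without error.
    """
    parts = line.split()
    values = [float(field) for field in parts[-n_backends:]]
    label = " ".join(parts[:-n_backends])
    return label, values


def best_backend(values, names=BACKENDS):
    """Return the backend with the smallest non-nan value, or None if all are nan."""
    valid = [(value, name) for name, value in zip(names, values) if not math.isnan(value)]
    return min(valid)[1] if valid else None


if __name__ == "__main__":
    # Rows copied from the 2023-07-04 report.
    for line in [
        "resnet50 f32[1,3,224,224] 1.236 1.065 1.088 0.736",
        "model/bert-base-uncased f32, bs=1, seq=128 1.978 1.789 1.701 nan",
    ]:
        label, values = parse_row(line)
        print(label, "->", best_backend(values))
```

Filtering with `math.isnan` matters here because comparisons against nan are always false, so a plain `min` over the raw values can return a misleading result depending on where the nan sits.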

yaoyaoding commented 1 year ago

2023-07-05

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.233 | 1.064 | 1.071 | 0.733 |
| resnet50 | f16[1,3,224,224] | 1.409 | 1.067 | 1.094 | 0.529 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.040 | 1.856 | 1.695 | 1.167 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.924 | 0.710 | 0.797 | 0.883 |

Time: 2.28 hours

yaoyaoding commented 1 year ago

2023-07-06

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.220 | 1.076 | 1.099 | 0.733 |
| resnet50 | f16[1,3,224,224] | 1.420 | 1.088 | 1.114 | 0.526 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 1.972 | 1.868 | 1.702 | 1.165 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.862 | 0.715 | 0.801 | 0.882 |

Time: 2.26 hours

yaoyaoding commented 1 year ago

2023-07-07

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.221 | 1.065 | 1.083 | 0.733 |
| resnet50 | f16[1,3,224,224] | 1.429 | 1.099 | 1.108 | 0.525 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.052 | 1.869 | 1.704 | 1.293 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.906 | 0.710 | 0.797 | 0.938 |

Time: 2.26 hours

yaoyaoding commented 1 year ago

2023-07-08

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.232 | 1.061 | 1.077 | 0.728 |
| resnet50 | f16[1,3,224,224] | 1.398 | 1.097 | 1.099 | 0.519 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.055 | 1.865 | 1.698 | 1.189 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.893 | 0.712 | 0.798 | 0.884 |

Time: 2.25 hours

yaoyaoding commented 1 year ago

2023-07-09

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.225 | 1.057 | 1.076 | 0.730 |
| resnet50 | f16[1,3,224,224] | 1.397 | 1.086 | 1.107 | 0.521 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.042 | 1.865 | 1.699 | 1.284 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.978 | 0.712 | 0.796 | 0.883 |

Time: 2.26 hours

yaoyaoding commented 1 year ago

2023-07-10

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.250 | 1.081 | 1.097 | 0.731 |
| resnet50 | f16[1,3,224,224] | 1.412 | 1.088 | 1.114 | 0.520 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.049 | 1.803 | 1.699 | 1.180 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.936 | 0.712 | 0.799 | 0.881 |

Time: 2.26 hours

yaoyaoding commented 1 year ago

2023-07-11

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.231 | 1.042 | 1.075 | 0.730 |
| resnet50 | f16[1,3,224,224] | 1.393 | 1.082 | 1.091 | 0.522 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.051 | 1.866 | 1.703 | 1.159 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.909 | 0.710 | 0.797 | 0.882 |

Time: 2.25 hours

yaoyaoding commented 1 year ago

2023-07-15

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.947 | 1.832 | 1.865 | 0.729 |
| resnet50 | f16[1,3,224,224] | 4.156 | 3.690 | 3.621 | 0.524 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.056 | 1.866 | 1.700 | 1.161 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.732 | 0.713 | 0.800 | 0.882 |

Time: 2.46 hours
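Between the 2023-07-11 and 2023-07-15 reports the resnet50 numbers for eager, reduce-overhead and max-autotune change noticeably while the hidet(2) column stays almost flat. A minimal sketch of how such a shift could be flagged automatically is below; the 20% threshold and the helper names `relative_change` and `flag_shifts` are assumptions chosen for illustration, not part of the benchmark scripts, and the sketch says nothing about the cause of the change.

```python
BACKENDS = ["eager", "reduce-overhead", "max-autotune", "hidet(2)"]

# resnet50 rows copied from the 2023-07-11 and 2023-07-15 reports.
REPORT_0711 = {
    "resnet50 f32[1,3,224,224]": [1.231, 1.042, 1.075, 0.730],
    "resnet50 f16[1,3,224,224]": [1.393, 1.082, 1.091, 0.522],
}
REPORT_0715 = {
    "resnet50 f32[1,3,224,224]": [1.947, 1.832, 1.865, 0.729],
    "resnet50 f16[1,3,224,224]": [4.156, 3.690, 3.621, 0.524],
}


def relative_change(previous, current):
    """Signed fractional change from previous to current."""
    return (current - previous) / previous


def flag_shifts(previous, current, threshold=0.20):
    """Return {(row, backend): change} for entries that moved by more than threshold.

    threshold=0.20 is a hypothetical cutoff, well above the day-to-day noise
    visible in the earlier reports.
    """
    flagged = {}
    for row, old_values in previous.items():
        new_values = current.get(row)
        if new_values is None:
            continue  # row missing from the newer report
        for name, old, new in zip(BACKENDS, old_values, new_values):
            change = relative_change(old, new)
            if abs(change) > threshold:
                flagged[(row, name)] = change
    return flagged


if __name__ == "__main__":
    for (row, name), change in flag_shifts(REPORT_0711, REPORT_0715).items():
        print(f"{row} / {name}: {change:+.0%}")
```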

yaoyaoding commented 1 year ago

2023-07-16

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.944 | 1.857 | 1.864 | 0.734 |
| resnet50 | f16[1,3,224,224] | 3.949 | 3.638 | 3.648 | 0.518 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.041 | 1.865 | 1.704 | 1.165 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.733 | 0.712 | 0.799 | 0.880 |

Time: 2.47 hours

yaoyaoding commented 1 year ago

2023-07-18

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.949 | 1.852 | 1.857 | 0.732 |
| resnet50 | f16[1,3,224,224] | 4.102 | 3.609 | 3.876 | 0.521 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.050 | 1.815 | 1.707 | 1.149 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.718 | 0.714 | 0.801 | 0.855 |

Time: 2.48 hours

yaoyaoding commented 1 year ago

2023-07-19

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.967 | 1.858 | 1.868 | 0.728 |
| resnet50 | f16[1,3,224,224] | 4.233 | 3.880 | 3.755 | 0.521 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.038 | 1.865 | 1.673 | 1.150 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.743 | 0.712 | 0.796 | 0.855 |

Time: 2.47 hours

yaoyaoding commented 1 year ago

2023-07-20

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.918 | 1.852 | 1.863 | 0.736 |
| resnet50 | f16[1,3,224,224] | 3.973 | 3.766 | 3.647 | 0.522 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.057 | 1.865 | 1.701 | 1.149 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.753 | 0.710 | 0.796 | 0.859 |

Time: 2.47 hours

yaoyaoding commented 1 year ago

2023-07-21

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.952 | 1.865 | 1.865 | 0.731 |
| resnet50 | f16[1,3,224,224] | 3.967 | 3.760 | 3.638 | 0.524 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.057 | 1.838 | 1.698 | 1.142 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.725 | 0.712 | 0.801 | 0.865 |

Time: 2.47 hours

yaoyaoding commented 1 year ago

2023-07-22

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.943 | 1.854 | 1.866 | 0.728 |
| resnet50 | f16[1,3,224,224] | 4.128 | 3.874 | 3.752 | 0.520 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.055 | 1.829 | 1.683 | 1.151 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.726 | 0.712 | 0.797 | 0.849 |

Time: 2.44 hours

yaoyaoding commented 1 year ago

2023-07-23

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.937 | 1.853 | 1.860 | 0.730 |
| resnet50 | f16[1,3,224,224] | 3.972 | 3.765 | 3.752 | 0.521 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.006 | 1.867 | 1.701 | 1.146 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.733 | 0.713 | 0.800 | 0.856 |

Time: 2.48 hours

yaoyaoding commented 1 year ago

2023-07-24

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.952 | 1.853 | 1.863 | 0.728 |
| resnet50 | f16[1,3,224,224] | 3.934 | 3.743 | 3.658 | 0.524 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.050 | 1.840 | 1.701 | 1.145 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.729 | 0.711 | 0.799 | 0.858 |

Time: 2.48 hours

yaoyaoding commented 1 year ago

2023-07-25

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.926 | 1.854 | 1.807 | 0.728 |
| resnet50 | f16[1,3,224,224] | 3.893 | 3.858 | 3.876 | 0.519 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.049 | 1.855 | 1.697 | 1.161 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.710 | 0.714 | 0.799 | 0.862 |

Time: 2.25 hours

yaoyaoding commented 1 year ago

2023-07-26

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.880 | 1.847 | 1.861 | 0.728 |
| resnet50 | f16[1,3,224,224] | 3.905 | 3.745 | 3.656 | 0.519 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.051 | 1.859 | 1.696 | 1.149 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.732 | 0.709 | 0.796 | 0.864 |

Time: 2.26 hours

yaoyaoding commented 1 year ago

2023-07-27

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.952 | 1.852 | 1.859 | 0.737 |
| resnet50 | f16[1,3,224,224] | 3.908 | 3.850 | 3.686 | 0.520 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.049 | 1.865 | 1.699 | 1.265 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.730 | 0.713 | 0.800 | 0.907 |

Time: 2.25 hours

yaoyaoding commented 1 year ago

2023-07-28

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.947 | 1.854 | 1.858 | 0.729 |
| resnet50 | f16[1,3,224,224] | 4.162 | 3.862 | 3.768 | 0.525 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.050 | 1.866 | 1.701 | 1.148 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.727 | 0.710 | 0.796 | 0.853 |

Time: 2.26 hours

yaoyaoding commented 1 year ago

2023-07-29

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.940 | 1.851 | 1.859 | 0.725 |
| resnet50 | f16[1,3,224,224] | 3.905 | 3.580 | 3.743 | 0.521 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.035 | 1.864 | 1.706 | 1.148 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.742 | 0.713 | 0.798 | 0.913 |

Time: 2.26 hours

yaoyaoding commented 1 year ago

2023-07-30

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.953 | 1.855 | 1.858 | 0.735 |
| resnet50 | f16[1,3,224,224] | 3.931 | 3.604 | 3.583 | 0.519 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.048 | 1.865 | 1.633 | 1.148 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.743 | 0.709 | 0.795 | 0.857 |

Time: 2.26 hours

yaoyaoding commented 1 year ago

2023-07-31

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.913 | 1.805 | 1.870 | 0.730 |
| resnet50 | f16[1,3,224,224] | 3.900 | 3.782 | 3.674 | 0.520 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.051 | 1.753 | 1.699 | 1.148 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.730 | 0.709 | 0.795 | 0.916 |

Time: 2.26 hours