Open yaoyaoding opened 1 year ago
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.223 | 1.069 | 1.107 | 0.734 |
resnet50 | f16[1,3,224,224] | 1.408 | 1.092 | 1.112 | 0.482 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.025 | 1.863 | 1.698 | 1.164 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.951 | 0.712 | 0.798 | 0.894 |
Time: 2.18 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.247 | 1.066 | 1.077 | 0.729 |
resnet50 | f16[1,3,224,224] | 1.442 | 1.096 | 1.109 | 0.476 |
model/bert-base-uncased | f32, bs=1, seq=128 | 1.958 | 1.867 | 1.700 | 1.163 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.927 | 0.713 | 0.799 | 0.891 |
Time: 2.19 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.245 | 1.103 | 1.081 | 0.738 |
resnet50 | f16[1,3,224,224] | 1.410 | 1.079 | 1.078 | 0.477 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.054 | 1.865 | 1.704 | 1.160 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.932 | 0.713 | 0.800 | 0.895 |
Time: 2.19 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.230 | 1.088 | 1.078 | 0.731 |
resnet50 | f16[1,3,224,224] | 1.414 | 1.093 | 1.103 | 0.475 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.045 | 1.867 | 1.697 | 1.165 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.904 | 0.711 | 0.797 | 0.891 |
Time: 2.18 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.211 | 1.062 | 1.092 | 0.732 |
resnet50 | f16[1,3,224,224] | 1.420 | 1.095 | 1.115 | 0.477 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.057 | 1.863 | 1.682 | 1.289 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.934 | 0.711 | 0.798 | 0.896 |
Time: 2.16 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.219 | 1.117 | 1.087 | 0.732 |
resnet50 | f16[1,3,224,224] | 1.421 | 1.077 | 1.095 | 0.526 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.053 | 1.862 | 1.696 | 1.165 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.909 | 0.712 | 0.799 | 0.886 |
Time: 2.24 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.221 | 1.055 | 1.076 | 0.735 |
resnet50 | f16[1,3,224,224] | 1.428 | 1.084 | 1.096 | 0.526 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.052 | 1.870 | 1.705 | 1.162 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.894 | 0.712 | 0.799 | 0.883 |
Time: 2.25 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.227 | 1.044 | 1.084 | 0.737 |
resnet50 | f16[1,3,224,224] | 1.407 | 1.082 | 1.101 | 0.528 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.054 | 1.867 | 1.699 | 1.285 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.913 | 0.713 | 0.799 | 0.885 |
Time: 2.26 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.212 | 1.057 | 1.101 | 0.739 |
resnet50 | f16[1,3,224,224] | 1.432 | 1.089 | 1.110 | 0.523 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.054 | 1.863 | 1.707 | 1.277 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.891 | 0.712 | 0.798 | 0.880 |
Time: 2.27 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.227 | 1.076 | 1.110 | 0.734 |
resnet50 | f16[1,3,224,224] | 1.401 | 1.080 | 1.087 | 0.527 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.051 | 1.773 | 1.688 | 1.164 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.921 | 0.715 | 0.799 | 0.881 |
Time: 2.26 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.236 | 1.065 | 1.088 | 0.736 |
resnet50 | f16[1,3,224,224] | 1.398 | 1.092 | 1.085 | 0.524 |
model/bert-base-uncased | f32, bs=1, seq=128 | 1.978 | 1.789 | 1.701 | nan |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.899 | 0.711 | 0.798 | nan |
Time: 1.97 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.233 | 1.064 | 1.071 | 0.733 |
resnet50 | f16[1,3,224,224] | 1.409 | 1.067 | 1.094 | 0.529 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.040 | 1.856 | 1.695 | 1.167 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.924 | 0.710 | 0.797 | 0.883 |
Time: 2.28 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.220 | 1.076 | 1.099 | 0.733 |
resnet50 | f16[1,3,224,224] | 1.420 | 1.088 | 1.114 | 0.526 |
model/bert-base-uncased | f32, bs=1, seq=128 | 1.972 | 1.868 | 1.702 | 1.165 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.862 | 0.715 | 0.801 | 0.882 |
Time: 2.26 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.221 | 1.065 | 1.083 | 0.733 |
resnet50 | f16[1,3,224,224] | 1.429 | 1.099 | 1.108 | 0.525 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.052 | 1.869 | 1.704 | 1.293 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.906 | 0.710 | 0.797 | 0.938 |
Time: 2.26 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.232 | 1.061 | 1.077 | 0.728 |
resnet50 | f16[1,3,224,224] | 1.398 | 1.097 | 1.099 | 0.519 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.055 | 1.865 | 1.698 | 1.189 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.893 | 0.712 | 0.798 | 0.884 |
Time: 2.25 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.225 | 1.057 | 1.076 | 0.730 |
resnet50 | f16[1,3,224,224] | 1.397 | 1.086 | 1.107 | 0.521 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.042 | 1.865 | 1.699 | 1.284 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.978 | 0.712 | 0.796 | 0.883 |
Time: 2.26 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.250 | 1.081 | 1.097 | 0.731 |
resnet50 | f16[1,3,224,224] | 1.412 | 1.088 | 1.114 | 0.520 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.049 | 1.803 | 1.699 | 1.180 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.936 | 0.712 | 0.799 | 0.881 |
Time: 2.26 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.231 | 1.042 | 1.075 | 0.730 |
resnet50 | f16[1,3,224,224] | 1.393 | 1.082 | 1.091 | 0.522 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.051 | 1.866 | 1.703 | 1.159 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.909 | 0.710 | 0.797 | 0.882 |
Time: 2.25 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.947 | 1.832 | 1.865 | 0.729 |
resnet50 | f16[1,3,224,224] | 4.156 | 3.690 | 3.621 | 0.524 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.056 | 1.866 | 1.700 | 1.161 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.732 | 0.713 | 0.800 | 0.882 |
Time: 2.46 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.944 | 1.857 | 1.864 | 0.734 |
resnet50 | f16[1,3,224,224] | 3.949 | 3.638 | 3.648 | 0.518 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.041 | 1.865 | 1.704 | 1.165 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.733 | 0.712 | 0.799 | 0.880 |
Time: 2.47 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.949 | 1.852 | 1.857 | 0.732 |
resnet50 | f16[1,3,224,224] | 4.102 | 3.609 | 3.876 | 0.521 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.050 | 1.815 | 1.707 | 1.149 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.718 | 0.714 | 0.801 | 0.855 |
Time: 2.48 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.967 | 1.858 | 1.868 | 0.728 |
resnet50 | f16[1,3,224,224] | 4.233 | 3.880 | 3.755 | 0.521 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.038 | 1.865 | 1.673 | 1.150 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.743 | 0.712 | 0.796 | 0.855 |
Time: 2.47 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.918 | 1.852 | 1.863 | 0.736 |
resnet50 | f16[1,3,224,224] | 3.973 | 3.766 | 3.647 | 0.522 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.057 | 1.865 | 1.701 | 1.149 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.753 | 0.710 | 0.796 | 0.859 |
Time: 2.47 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.952 | 1.865 | 1.865 | 0.731 |
resnet50 | f16[1,3,224,224] | 3.967 | 3.760 | 3.638 | 0.524 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.057 | 1.838 | 1.698 | 1.142 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.725 | 0.712 | 0.801 | 0.865 |
Time: 2.47 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.943 | 1.854 | 1.866 | 0.728 |
resnet50 | f16[1,3,224,224] | 4.128 | 3.874 | 3.752 | 0.520 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.055 | 1.829 | 1.683 | 1.151 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.726 | 0.712 | 0.797 | 0.849 |
Time: 2.44 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.937 | 1.853 | 1.860 | 0.730 |
resnet50 | f16[1,3,224,224] | 3.972 | 3.765 | 3.752 | 0.521 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.006 | 1.867 | 1.701 | 1.146 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.733 | 0.713 | 0.800 | 0.856 |
Time: 2.48 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.952 | 1.853 | 1.863 | 0.728 |
resnet50 | f16[1,3,224,224] | 3.934 | 3.743 | 3.658 | 0.524 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.050 | 1.840 | 1.701 | 1.145 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.729 | 0.711 | 0.799 | 0.858 |
Time: 2.48 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.926 | 1.854 | 1.807 | 0.728 |
resnet50 | f16[1,3,224,224] | 3.893 | 3.858 | 3.876 | 0.519 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.049 | 1.855 | 1.697 | 1.161 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.710 | 0.714 | 0.799 | 0.862 |
Time: 2.25 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.880 | 1.847 | 1.861 | 0.728 |
resnet50 | f16[1,3,224,224] | 3.905 | 3.745 | 3.656 | 0.519 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.051 | 1.859 | 1.696 | 1.149 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.732 | 0.709 | 0.796 | 0.864 |
Time: 2.26 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.952 | 1.852 | 1.859 | 0.737 |
resnet50 | f16[1,3,224,224] | 3.908 | 3.850 | 3.686 | 0.520 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.049 | 1.865 | 1.699 | 1.265 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.730 | 0.713 | 0.800 | 0.907 |
Time: 2.25 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.947 | 1.854 | 1.858 | 0.729 |
resnet50 | f16[1,3,224,224] | 4.162 | 3.862 | 3.768 | 0.525 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.050 | 1.866 | 1.701 | 1.148 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.727 | 0.710 | 0.796 | 0.853 |
Time: 2.26 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.940 | 1.851 | 1.859 | 0.725 |
resnet50 | f16[1,3,224,224] | 3.905 | 3.580 | 3.743 | 0.521 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.035 | 1.864 | 1.706 | 1.148 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.742 | 0.713 | 0.798 | 0.913 |
Time: 2.26 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.953 | 1.855 | 1.858 | 0.735 |
resnet50 | f16[1,3,224,224] | 3.931 | 3.604 | 3.583 | 0.519 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.048 | 1.865 | 1.633 | 1.148 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.743 | 0.709 | 0.795 | 0.857 |
Time: 2.26 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.913 | 1.805 | 1.870 | 0.730 |
resnet50 | f16[1,3,224,224] | 3.900 | 3.782 | 3.674 | 0.520 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.051 | 1.753 | 1.699 | 1.148 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.730 | 0.709 | 0.795 | 0.916 |
Time: 2.26 hours
This issue tracks performance benchmarks of hidet against other dynamo backends in PyTorch.
The benchmark scripts that produce these reports are located at hidet/scripts/bench.
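
For anyone comparing the reports above, here is a minimal sketch of turning one report table into per-row speedups of hidet over eager. It is not part of hidet/scripts/bench: the helper names `parse_report` and `speedups` are hypothetical, and it assumes the numeric columns are latencies where lower is better (the reports themselves do not state a unit). The sample data is copied from the first report in this thread.

```python
# Sample report, copied verbatim from the first table in this issue.
REPORT = """\
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.223 | 1.069 | 1.107 | 0.734 |
resnet50 | f16[1,3,224,224] | 1.408 | 1.092 | 1.112 | 0.482 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.025 | 1.863 | 1.698 | 1.164 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.951 | 0.712 | 0.798 | 0.894 |
"""


def parse_report(text):
    """Parse one pipe-delimited report table into a list of row dicts.

    Hypothetical helper: the first two columns are kept as strings and
    every other column is converted to float ("nan" parses to float NaN).
    """
    # Keep only table lines; this drops blanks and "Time: ..." footers.
    lines = [ln.strip() for ln in text.splitlines() if "|" in ln]
    header = [c.strip() for c in lines[0].strip("|").split("|")]
    rows = []
    for ln in lines[2:]:  # lines[1] is the ---|---| delimiter row
        cells = [c.strip() for c in ln.strip("|").split("|")]
        row = dict(zip(header, cells))
        for name in header[2:]:
            row[name] = float(row[name])
        rows.append(row)
    return rows


def speedups(rows, baseline="eager", target="hidet(2)"):
    """Return baseline/target per (model, inputs) key.

    Assumes lower values are better, so a ratio above 1 means the
    target backend was faster than the baseline for that row.
    """
    return {(r["model"], r["inputs"]): r[baseline] / r[target] for r in rows}


if __name__ == "__main__":
    for (model, inputs), ratio in speedups(parse_report(REPORT)).items():
        print(f"{model} | {inputs} | {ratio:.2f}x")
```

Passing `baseline="max-autotune"` compares hidet against that backend instead. Rows reported as `nan` yield a NaN ratio rather than raising an error.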