hidet-org / hidet

An open-source, efficient deep learning framework/compiler, written in Python.
https://hidet.org
Apache License 2.0

[Tracking Issue] Benchmarks #154

Status: Open. yaoyaoding opened this issue 1 year ago

yaoyaoding commented 1 year ago

This issue tracks performance benchmarks of hidet against other dynamo backends in PyTorch.

The benchmark scripts that produce these reports are located at hidet/scripts/bench.

yaoyaoding commented 1 year ago

2023-05-06

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 4.693 | 3.271 | 3.315 | 1.507 |
| resnet50 | f16[1,3,224,224] | 5.555 | 3.422 | 3.447 | 1.140 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 6.170 | 3.082 | 2.898 | 2.335 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 6.266 | 1.427 | 1.096 | 1.812 |

Time: 7.36 hours
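To make a report like the one above easier to read, the ratios between columns can be computed directly. The following is a minimal sketch, not part of the benchmark scripts: it assumes the reported numbers are latencies (lower is better), which the issue does not state explicitly, and the helper names `speedup_over_eager` and `fastest` are hypothetical.

```python
# Values copied from the 2023-05-06 report above.
# Assumption: each number is a latency, so lower is better.
COLUMNS = ["eager", "reduce-overhead", "max-autotune", "hidet(2)"]
ROWS = {
    ("resnet50", "f32[1,3,224,224]"): [4.693, 3.271, 3.315, 1.507],
    ("resnet50", "f16[1,3,224,224]"): [5.555, 3.422, 3.447, 1.140],
    ("model/bert-base-uncased", "f32, bs=1, seq=128"): [6.170, 3.082, 2.898, 2.335],
    ("model/bert-base-uncased", "f16, bs=1, seq=128"): [6.266, 1.427, 1.096, 1.812],
}


def speedup_over_eager(values):
    """Return eager_latency / backend_latency for every column."""
    eager = values[0]
    return {name: eager / value for name, value in zip(COLUMNS, values)}


def fastest(values):
    """Return the name of the column with the smallest value."""
    return min(zip(values, COLUMNS))[1]


if __name__ == "__main__":
    for (model, inputs), values in ROWS.items():
        ratios = speedup_over_eager(values)
        summary = ", ".join(f"{name}: {ratio:.2f}x" for name, ratio in ratios.items())
        print(f"{model} [{inputs}] fastest={fastest(values)} | {summary}")
```

Under that assumption, hidet(2) has the lowest number in three of the four rows of this report, while max-autotune is lowest for bert-base-uncased in f16.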

yaoyaoding commented 1 year ago

2023-05-07

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.551 | 1.423 | 1.413 | 1.309 |
| resnet50 | f16[1,3,224,224] | 1.725 | 1.337 | 1.391 | 1.006 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 3.013 | 2.727 | 2.673 | 1.988 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.989 | 1.299 | 1.048 | 1.551 |

Time: 2.16 hours

yaoyaoding commented 1 year ago

2023-05-07

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 4.686 | 3.293 | 3.312 | 1.506 |
| resnet50 | f16[1,3,224,224] | 5.593 | 3.453 | 3.504 | 1.145 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 6.155 | 3.082 | 2.898 | 2.334 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 6.250 | 1.426 | 1.096 | 1.816 |

Time: 7.37 hours

yaoyaoding commented 1 year ago

2023-05-08

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.546 | 1.418 | 1.413 | 1.309 |
| resnet50 | f16[1,3,224,224] | 1.742 | 1.327 | 1.366 | 1.006 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 3.015 | 2.721 | 2.677 | 1.979 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.984 | 1.298 | 1.046 | 1.541 |

Time: 2.15 hours

yaoyaoding commented 1 year ago

2023-05-08

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 4.690 | 3.300 | 3.345 | 1.508 |
| resnet50 | f16[1,3,224,224] | 5.573 | 3.403 | 3.462 | 1.146 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 6.110 | 3.084 | 2.898 | 2.335 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 6.390 | 1.426 | 1.098 | 1.817 |

Time: 7.36 hours

yaoyaoding commented 1 year ago

2023-05-09

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.552 | 1.426 | 1.419 | 1.329 |
| resnet50 | f16[1,3,224,224] | 1.828 | 1.405 | 1.440 | 1.012 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 3.097 | 2.857 | 2.675 | 1.989 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 2.092 | 1.298 | 1.047 | 1.551 |

Time: 2.29 hours

yaoyaoding commented 1 year ago

2023-05-09

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 4.676 | 3.263 | 3.297 | 1.509 |
| resnet50 | f16[1,3,224,224] | 5.515 | 3.375 | 3.416 | 1.150 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 6.127 | 3.082 | 2.896 | 2.336 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 6.263 | 1.426 | 1.098 | 1.811 |

Time: 7.34 hours

yaoyaoding commented 1 year ago

2023-05-10

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.546 | 1.419 | 1.410 | 1.374 |
| resnet50 | f16[1,3,224,224] | 1.814 | 1.404 | 1.613 | 1.012 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 3.089 | 2.707 | 2.563 | 2.075 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 2.090 | 1.300 | 1.047 | 1.551 |

Time: 2.78 hours

yaoyaoding commented 1 year ago

2023-05-10

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 4.649 | 3.284 | 3.326 | 1.568 |
| resnet50 | f16[1,3,224,224] | 5.637 | 3.424 | 3.465 | 1.155 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 6.180 | 3.081 | 2.896 | 2.407 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 6.355 | 1.425 | 1.096 | 1.812 |

Time: 9.10 hours

yaoyaoding commented 1 year ago

2023-05-11

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.554 | 1.418 | 1.499 | 1.375 |
| resnet50 | f16[1,3,224,224] | 1.805 | 1.418 | 1.456 | 1.044 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 3.085 | 2.852 | 2.560 | 2.100 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 2.074 | 1.301 | 1.049 | 1.554 |

Time: 2.78 hours

yaoyaoding commented 1 year ago

2023-05-11

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 4.698 | 3.272 | 3.309 | 1.569 |
| resnet50 | f16[1,3,224,224] | 5.657 | 3.446 | 3.498 | 1.155 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 6.232 | 3.084 | 2.898 | 2.404 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 6.422 | 1.426 | 1.097 | 1.827 |

Time: 9.12 hours

yaoyaoding commented 1 year ago

2023-05-12

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.548 | 1.417 | 1.411 | 1.374 |
| resnet50 | f16[1,3,224,224] | 1.805 | 1.646 | 1.452 | 1.009 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.955 | 2.857 | 2.681 | 2.073 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 2.088 | 1.299 | 1.048 | 1.539 |

Time: 2.83 hours

yaoyaoding commented 1 year ago

2023-05-12

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 4.724 | 3.302 | 3.378 | 1.570 |
| resnet50 | f16[1,3,224,224] | 5.570 | 3.464 | 3.495 | 1.140 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 6.191 | 3.087 | 2.903 | 2.400 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 6.410 | 1.426 | 1.096 | 1.823 |

Time: 9.08 hours

yaoyaoding commented 1 year ago

2023-05-13

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.552 | 1.418 | 1.413 | 1.374 |
| resnet50 | f16[1,3,224,224] | 1.825 | 1.426 | 1.479 | 1.011 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 3.098 | 2.862 | 2.573 | 2.081 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 2.096 | 1.300 | 1.049 | 1.543 |

Time: 2.84 hours

yaoyaoding commented 1 year ago

2023-05-13

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 4.629 | 3.284 | 3.329 | 1.565 |
| resnet50 | f16[1,3,224,224] | 5.548 | 3.437 | 3.461 | 1.141 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 6.112 | 3.085 | 2.904 | 2.400 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 6.283 | 1.426 | 1.097 | 1.824 |

Time: 9.04 hours

yaoyaoding commented 1 year ago

2023-05-14

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.550 | 1.418 | 1.414 | 1.372 |
| resnet50 | f16[1,3,224,224] | 1.800 | 1.546 | 1.441 | 1.015 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.974 | 2.710 | 2.684 | 2.073 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 2.117 | 1.298 | 1.046 | 1.541 |

Time: 2.83 hours

yaoyaoding commented 1 year ago

2023-05-14

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 4.672 | 3.240 | 3.320 | 1.568 |
| resnet50 | f16[1,3,224,224] | 5.716 | 3.492 | 3.524 | 1.140 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 6.138 | 3.082 | 2.899 | 2.400 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 6.344 | 1.426 | 1.097 | 1.824 |

Time: 9.09 hours

yaoyaoding commented 1 year ago

2023-05-15

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.551 | 1.421 | 1.413 | 1.376 |
| resnet50 | f16[1,3,224,224] | 1.825 | 1.412 | 1.430 | 1.007 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 3.091 | 2.762 | 2.679 | 2.077 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 2.094 | 1.299 | 1.047 | 1.544 |

Time: 2.79 hours

yaoyaoding commented 1 year ago

2023-05-15

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 4.725 | 3.309 | 3.350 | 1.570 |
| resnet50 | f16[1,3,224,224] | 5.526 | 3.458 | 3.475 | 1.155 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 6.104 | 3.094 | 2.900 | 2.413 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 6.352 | 1.431 | 1.098 | 1.867 |

Time: 9.26 hours

yaoyaoding commented 1 year ago

2023-05-31

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.244 | 1.066 | 1.090 | 1.053 |
| resnet50 | f16[1,3,224,224] | 1.475 | 1.104 | 1.134 | 0.647 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.055 | 1.864 | 1.576 | 1.196 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.791 | 0.711 | 0.738 | 0.957 |

Time: 2.32 hours

yaoyaoding commented 1 year ago

2023-06-01

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.230 | 1.061 | 1.085 | 1.066 |
| resnet50 | f16[1,3,224,224] | 1.447 | 1.057 | 1.099 | 0.641 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.049 | 1.866 | 1.670 | 1.219 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.769 | 0.714 | 0.801 | 0.958 |

Time: 2.26 hours

yaoyaoding commented 1 year ago

2023-06-02

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.939 | 1.852 | 1.864 | 0.869 |
| resnet50 | f16[1,3,224,224] | 3.917 | 3.642 | 3.887 | 0.641 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.046 | 1.749 | 1.691 | 1.280 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.847 | 0.712 | 0.800 | 0.957 |

Time: 2.13 hours

yaoyaoding commented 1 year ago

2023-06-03

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.941 | 1.852 | 1.863 | 0.819 |
| resnet50 | f16[1,3,224,224] | 4.252 | 3.580 | 3.571 | 0.623 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 1.965 | 1.859 | 1.695 | 1.276 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.828 | 0.713 | 0.801 | 0.939 |

Time: 2.10 hours

yaoyaoding commented 1 year ago

2023-06-04

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.239 | 1.064 | 1.079 | 0.808 |
| resnet50 | f16[1,3,224,224] | 1.404 | 1.111 | 1.096 | 0.621 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 1.957 | 1.828 | 1.701 | 1.293 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.933 | 0.714 | 0.801 | 0.940 |

Time: 2.09 hours

yaoyaoding commented 1 year ago

2023-06-05

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.237 | 1.056 | 1.080 | 0.822 |
| resnet50 | f16[1,3,224,224] | 1.418 | 1.079 | 1.091 | 0.567 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 1.972 | 1.862 | 1.701 | 1.161 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.906 | 0.710 | 0.800 | 0.941 |

Time: 2.14 hours

yaoyaoding commented 1 year ago

2023-06-05

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 6.145 | 4.233 | 4.502 | 1.384 |
| resnet50 | f16[1,3,224,224] | 7.176 | 4.303 | 4.437 | 1.029 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 9.116 | 3.619 | 2.946 | 2.088 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 9.784 | 1.239 | 1.189 | 1.563 |

Time: 8.29 hours

yaoyaoding commented 1 year ago

2023-06-06

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.220 | 1.048 | 1.097 | 0.812 |
| resnet50 | f16[1,3,224,224] | 1.415 | 1.075 | 1.095 | 0.570 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 1.984 | 1.864 | 1.697 | 1.293 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.897 | 0.712 | 0.800 | 0.941 |

Time: 2.13 hours

yaoyaoding commented 1 year ago

2023-06-06

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 5.802 | 4.281 | 4.373 | 1.384 |
| resnet50 | f16[1,3,224,224] | 7.292 | 4.379 | 4.484 | 1.025 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 9.202 | 4.066 | 3.127 | 2.161 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 9.881 | 1.241 | 1.184 | 1.598 |

Time: 8.29 hours

yaoyaoding commented 1 year ago

2023-06-07

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.235 | 1.055 | 1.091 | 0.799 |
| resnet50 | f16[1,3,224,224] | 1.407 | 1.084 | 1.095 | 0.564 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.046 | 1.859 | 1.699 | 1.163 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.910 | 0.710 | 0.796 | 0.941 |

Time: 2.13 hours

yaoyaoding commented 1 year ago

2023-06-07

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 5.943 | 4.209 | 4.313 | 1.384 |
| resnet50 | f16[1,3,224,224] | 7.108 | 4.282 | 4.423 | 1.025 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 9.152 | 3.643 | 3.135 | 2.130 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 9.861 | 1.242 | 1.184 | 1.571 |

Time: 8.26 hours

yaoyaoding commented 1 year ago

2023-06-08

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.225 | 1.059 | 1.079 | 0.794 |
| resnet50 | f16[1,3,224,224] | 1.417 | 1.092 | 1.099 | 0.566 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.055 | 1.870 | 1.701 | 1.166 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.928 | 0.715 | 0.801 | 0.943 |

Time: 2.13 hours

yaoyaoding commented 1 year ago

2023-06-08

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 5.936 | 4.205 | 4.322 | 1.380 |
| resnet50 | f16[1,3,224,224] | 7.039 | 4.214 | 4.362 | 1.027 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 9.276 | 3.626 | 2.942 | 2.116 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 9.887 | 1.242 | 1.183 | 1.589 |

Time: 8.25 hours

yaoyaoding commented 1 year ago

2023-06-09

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.232 | 1.059 | 1.083 | 0.793 |
| resnet50 | f16[1,3,224,224] | 1.422 | 1.076 | 1.101 | 0.565 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.048 | 1.869 | 1.705 | 1.169 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.920 | 0.710 | 0.795 | 0.894 |

Time: 2.13 hours

yaoyaoding commented 1 year ago

2023-06-09

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 5.883 | 4.250 | 4.434 | 1.386 |
| resnet50 | f16[1,3,224,224] | 7.112 | 4.319 | 4.485 | 1.029 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 9.146 | 3.630 | 3.024 | 2.117 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 9.886 | 1.239 | 1.188 | 1.578 |

Time: 8.30 hours

yaoyaoding commented 1 year ago

2023-06-10

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.220 | 1.050 | 1.092 | 0.801 |
| resnet50 | f16[1,3,224,224] | 1.432 | 1.078 | 1.098 | 0.563 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.045 | 1.869 | 1.666 | 1.184 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.945 | 0.712 | 0.799 | 0.897 |

Time: 2.13 hours

yaoyaoding commented 1 year ago

2023-06-10

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 5.922 | 4.268 | 4.383 | 1.382 |
| resnet50 | f16[1,3,224,224] | 7.081 | 4.330 | 4.420 | 1.027 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 9.007 | 4.007 | 2.756 | 2.096 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 9.866 | 1.241 | 1.186 | 1.561 |

Time: 8.30 hours

yaoyaoding commented 1 year ago

2023-06-11

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.218 | 1.069 | 1.079 | 0.733 |
| resnet50 | f16[1,3,224,224] | 1.434 | 1.082 | 1.116 | 0.563 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.051 | 1.868 | 1.701 | 1.164 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.894 | 0.711 | 0.799 | 0.893 |

Time: 2.13 hours

yaoyaoding commented 1 year ago

2023-06-11

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 5.924 | 4.218 | 4.355 | 1.319 |
| resnet50 | f16[1,3,224,224] | 7.594 | 4.344 | 4.442 | 1.037 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 8.988 | 3.617 | 2.727 | 2.084 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 9.930 | 1.239 | 1.184 | 1.565 |

Time: 8.16 hours

yaoyaoding commented 1 year ago

2023-06-12

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.249 | 1.071 | 1.094 | 0.734 |
| resnet50 | f16[1,3,224,224] | 1.434 | 1.106 | 1.131 | 0.565 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.050 | 1.865 | 1.702 | 1.165 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.908 | 0.712 | 0.801 | 0.900 |

Time: 2.12 hours

yaoyaoding commented 1 year ago

2023-06-13

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.205 | 1.051 | 1.074 | 0.729 |
| resnet50 | f16[1,3,224,224] | 1.440 | 1.081 | 1.124 | 0.568 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 1.930 | 1.866 | 1.698 | 1.182 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.924 | 0.710 | 0.798 | 0.890 |

Time: 2.12 hours

yaoyaoding commented 1 year ago

2023-06-14

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.225 | 1.043 | 1.077 | 0.762 |
| resnet50 | f16[1,3,224,224] | 1.414 | 1.080 | 1.094 | 0.568 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.053 | 1.867 | 1.698 | 1.290 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.908 | 0.711 | 0.797 | 0.936 |

Time: 2.11 hours

yaoyaoding commented 1 year ago

2023-06-15

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.208 | 1.052 | 1.072 | 0.770 |
| resnet50 | f16[1,3,224,224] | 1.416 | 1.087 | 1.137 | 0.566 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.051 | 1.868 | 1.650 | 1.283 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.918 | 0.712 | 0.799 | 0.944 |

Time: 2.11 hours

yaoyaoding commented 1 year ago

2023-06-16

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.224 | 1.054 | 1.076 | 0.765 |
| resnet50 | f16[1,3,224,224] | 1.427 | 1.102 | 1.109 | 0.567 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.052 | 1.864 | 1.703 | 1.293 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.919 | 0.711 | 0.797 | 0.896 |

Time: 2.11 hours

yaoyaoding commented 1 year ago

2023-06-17

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.220 | 1.077 | 1.069 | 0.762 |
| resnet50 | f16[1,3,224,224] | 1.402 | 1.085 | 1.097 | 0.661 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.039 | 1.856 | 1.633 | 1.291 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.871 | 0.712 | 0.800 | 0.897 |

Time: 3.41 hours

yaoyaoding commented 1 year ago

2023-06-18

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.216 | 1.049 | 1.065 | 0.762 |
| resnet50 | f16[1,3,224,224] | 1.410 | 1.080 | 1.089 | 0.659 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.051 | 1.865 | 1.705 | 1.167 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.894 | 0.714 | 0.801 | 0.894 |

Time: 3.42 hours

yaoyaoding commented 1 year ago

2023-06-19

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.211 | 1.059 | 1.074 | 0.766 |
| resnet50 | f16[1,3,224,224] | 1.399 | 1.066 | 1.100 | 0.661 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.052 | 1.866 | 1.623 | 1.304 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.883 | 0.710 | 0.796 | 0.941 |

Time: 3.42 hours

yaoyaoding commented 1 year ago

2023-06-20

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.229 | 1.062 | 1.091 | 0.767 |
| resnet50 | f16[1,3,224,224] | 1.412 | 1.091 | 1.106 | 0.479 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 1.972 | 1.866 | 1.701 | 1.166 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.950 | 0.711 | 0.798 | 0.943 |

Time: 2.17 hours

yaoyaoding commented 1 year ago

2023-06-21

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.253 | 1.063 | 1.081 | 0.733 |
| resnet50 | f16[1,3,224,224] | 1.424 | 1.080 | 1.110 | 0.478 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.052 | 1.830 | 1.702 | 1.166 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.902 | 0.712 | 0.799 | 0.901 |

Time: 2.17 hours

yaoyaoding commented 1 year ago

2023-06-22

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.217 | 1.059 | 1.074 | 0.730 |
| resnet50 | f16[1,3,224,224] | 1.405 | 1.089 | 1.101 | 0.477 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.050 | 1.870 | 1.699 | 1.164 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.917 | 0.711 | 0.799 | 0.894 |

Time: 2.17 hours

yaoyaoding commented 1 year ago

2023-06-23

| model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
| --- | --- | --- | --- | --- | --- |
| resnet50 | f32[1,3,224,224] | 1.240 | 1.055 | 1.084 | 0.737 |
| resnet50 | f16[1,3,224,224] | 1.415 | 1.099 | 1.115 | 0.475 |
| model/bert-base-uncased | f32, bs=1, seq=128 | 2.022 | 1.867 | 1.677 | 1.167 |
| model/bert-base-uncased | f16, bs=1, seq=128 | 1.897 | 0.713 | 0.796 | 0.892 |

Time: 2.17 hours