Open yaoyaoding opened 1 year ago
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 4.656 | 3.261 | 3.302 | 1.481 |
resnet50 | f16[1,3,224,224] | 5.663 | 3.395 | 3.460 | 1.217 |
model/bert-base-uncased | f32, bs=1, seq=128 | 6.060 | 3.095 | 2.920 | 2.335 |
model/bert-base-uncased | f16, bs=1, seq=128 | 6.444 | 1.425 | 1.099 | 1.923 |
Time: 3.35 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.542 | 1.408 | 1.407 | 1.291 |
resnet50 | f16[1,3,224,224] | 1.557 | 1.250 | 1.253 | 1.089 |
model/bert-base-uncased | f32, bs=1, seq=128 | 3.037 | 2.727 | 2.631 | 2.012 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.865 | 1.288 | 1.017 | 1.715 |
Time: 2.47 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.533 | 1.402 | 1.399 | 1.303 |
resnet50 | f16[1,3,224,224] | 1.591 | 1.250 | 1.294 | 1.122 |
model/bert-base-uncased | f32, bs=1, seq=128 | 3.033 | 2.807 | 2.634 | 2.008 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.830 | 1.289 | 1.014 | 1.683 |
Time: 2.45 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.537 | 1.404 | 1.399 | 1.293 |
resnet50 | f16[1,3,224,224] | 1.578 | 1.234 | 1.258 | 1.055 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.867 | 2.738 | 2.515 | 2.014 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.863 | 1.287 | 1.014 | 1.688 |
Time: 2.45 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.541 | 1.406 | 1.404 | 1.280 |
resnet50 | f16[1,3,224,224] | 1.577 | 1.243 | 1.288 | 1.116 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.958 | 2.809 | 2.612 | 2.015 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.871 | 1.289 | 1.015 | 1.683 |
Time: 2.18 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.534 | 1.404 | 1.401 | 1.288 |
resnet50 | f16[1,3,224,224] | 1.614 | 1.261 | 1.280 | 1.040 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.911 | 2.811 | 2.631 | 2.031 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.843 | 1.289 | 1.013 | 1.574 |
Time: 2.18 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 4.584 | 3.263 | 3.316 | 1.512 |
resnet50 | f16[1,3,224,224] | 5.466 | 3.376 | 3.420 | 1.130 |
model/bert-base-uncased | f32, bs=1, seq=128 | 6.085 | 3.093 | 2.898 | 2.354 |
model/bert-base-uncased | f16, bs=1, seq=128 | 6.288 | 1.425 | 1.095 | 1.863 |
Time: 8.44 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 4.604 | 3.222 | 3.265 | 1.483 |
resnet50 | f16[1,3,224,224] | 5.401 | 3.365 | 3.412 | 1.141 |
model/bert-base-uncased | f32, bs=1, seq=128 | 6.085 | 3.082 | 2.896 | 2.357 |
model/bert-base-uncased | f16, bs=1, seq=128 | 6.224 | 1.426 | 1.097 | 1.829 |
Time: 7.91 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.534 | 1.400 | 1.319 | 1.279 |
resnet50 | f16[1,3,224,224] | 1.623 | 1.207 | 1.226 | 1.006 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.941 | 2.716 | 2.593 | 2.006 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.909 | 1.299 | 0.963 | 1.573 |
Time: 2.82 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 4.651 | 3.271 | 3.282 | 1.509 |
resnet50 | f16[1,3,224,224] | 5.466 | 3.403 | 3.446 | 1.143 |
model/bert-base-uncased | f32, bs=1, seq=128 | 6.155 | 3.080 | 2.896 | 2.350 |
model/bert-base-uncased | f16, bs=1, seq=128 | 6.249 | 1.426 | 1.097 | 1.831 |
Time: 7.95 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.539 | 1.409 | 1.404 | 1.291 |
resnet50 | f16[1,3,224,224] | 1.580 | 1.175 | 1.200 | 1.008 |
model/bert-base-uncased | f32, bs=1, seq=128 | 3.045 | 2.720 | 2.677 | 2.003 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.949 | 1.297 | 1.014 | 1.587 |
Time: 2.68 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 4.722 | 3.330 | 3.370 | 1.483 |
resnet50 | f16[1,3,224,224] | 5.482 | 3.421 | 3.434 | 1.141 |
model/bert-base-uncased | f32, bs=1, seq=128 | 6.154 | 3.085 | 2.901 | 2.346 |
model/bert-base-uncased | f16, bs=1, seq=128 | 6.346 | 1.426 | 1.097 | 1.828 |
Time: 8.10 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.532 | 1.401 | 1.396 | 1.289 |
resnet50 | f16[1,3,224,224] | 1.625 | 1.199 | 1.231 | 1.009 |
model/bert-base-uncased | f32, bs=1, seq=128 | 3.078 | 2.861 | 2.682 | 2.005 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.959 | 1.299 | 1.013 | 1.592 |
Time: 2.75 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 4.817 | 3.375 | 3.422 | 1.482 |
resnet50 | f16[1,3,224,224] | 5.511 | 3.415 | 3.452 | 1.142 |
model/bert-base-uncased | f32, bs=1, seq=128 | 7.157 | 3.092 | 2.903 | 2.351 |
model/bert-base-uncased | f16, bs=1, seq=128 | 6.398 | 1.426 | 1.098 | 1.829 |
Time: 8.09 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.542 | 1.402 | 1.404 | 1.280 |
resnet50 | f16[1,3,224,224] | 1.610 | 1.203 | 1.215 | 1.009 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.954 | 2.710 | 2.678 | 1.962 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.926 | 1.299 | 1.014 | 1.622 |
Time: 2.74 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 4.629 | 3.305 | 3.331 | 1.481 |
resnet50 | f16[1,3,224,224] | 5.719 | 3.485 | 3.522 | 1.144 |
model/bert-base-uncased | f32, bs=1, seq=128 | 6.184 | 3.084 | 2.897 | 2.337 |
model/bert-base-uncased | f16, bs=1, seq=128 | 6.372 | 1.426 | 1.096 | 1.838 |
Time: 8.08 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.543 | 1.410 | 1.406 | 1.281 |
resnet50 | f16[1,3,224,224] | 1.639 | 1.207 | 1.216 | 1.009 |
model/bert-base-uncased | f32, bs=1, seq=128 | 3.082 | 2.855 | 2.556 | 2.021 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.943 | 1.298 | 1.015 | 1.610 |
Time: 2.76 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 4.643 | 3.228 | 3.296 | 1.516 |
resnet50 | f16[1,3,224,224] | 5.600 | 3.431 | 3.448 | 1.146 |
model/bert-base-uncased | f32, bs=1, seq=128 | 6.233 | 3.083 | 2.896 | 2.346 |
model/bert-base-uncased | f16, bs=1, seq=128 | 6.154 | 1.424 | 1.095 | 1.828 |
Time: 8.54 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 4.707 | 3.267 | 3.308 | 1.506 |
resnet50 | f16[1,3,224,224] | 5.583 | 3.431 | 3.498 | 1.141 |
model/bert-base-uncased | f32, bs=1, seq=128 | 6.188 | 3.084 | 2.898 | 2.339 |
model/bert-base-uncased | f16, bs=1, seq=128 | 6.277 | 1.426 | 1.095 | 1.828 |
Time: 8.06 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.517 | 1.390 | 1.293 | 1.272 |
resnet50 | f16[1,3,224,224] | 1.623 | 1.197 | 1.228 | 1.042 |
model/bert-base-uncased | f32, bs=1, seq=128 | 3.090 | 2.849 | 2.597 | 2.012 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.922 | 1.297 | 0.990 | 1.572 |
Time: 2.55 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.535 | 1.409 | 1.403 | 1.282 |
resnet50 | f16[1,3,224,224] | 1.609 | 1.160 | 1.192 | 1.042 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.907 | 2.809 | 2.650 | 2.046 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.960 | 1.297 | 1.046 | 1.572 |
Time: 2.50 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.537 | 1.411 | 1.398 | 1.267 |
resnet50 | f16[1,3,224,224] | 1.645 | 1.211 | 1.233 | 1.044 |
model/bert-base-uncased | f32, bs=1, seq=128 | 3.097 | 2.859 | 2.675 | 2.006 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.951 | 1.298 | 1.046 | 1.577 |
Time: 2.56 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 4.780 | 3.355 | 3.410 | 1.506 |
resnet50 | f16[1,3,224,224] | 5.658 | 3.445 | 3.513 | 1.146 |
model/bert-base-uncased | f32, bs=1, seq=128 | 6.238 | 3.083 | 2.895 | 2.350 |
model/bert-base-uncased | f16, bs=1, seq=128 | 6.477 | 1.426 | 1.095 | 1.829 |
Time: 8.19 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.543 | 1.409 | 1.403 | 1.289 |
resnet50 | f16[1,3,224,224] | 1.620 | 1.192 | 1.218 | 1.045 |
model/bert-base-uncased | f32, bs=1, seq=128 | 3.092 | 2.853 | 2.651 | 1.995 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.939 | 1.299 | 1.046 | 1.545 |
Time: 2.61 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 4.851 | 3.350 | 3.398 | 1.480 |
resnet50 | f16[1,3,224,224] | 5.689 | 3.520 | 3.547 | 1.147 |
model/bert-base-uncased | f32, bs=1, seq=128 | 6.228 | 3.082 | 2.898 | 2.334 |
model/bert-base-uncased | f16, bs=1, seq=128 | 6.454 | 1.426 | 1.097 | 1.820 |
Time: 8.35 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.530 | 1.403 | 1.398 | 1.312 |
resnet50 | f16[1,3,224,224] | 1.605 | 1.191 | 1.231 | 1.045 |
model/bert-base-uncased | f32, bs=1, seq=128 | 3.100 | 2.860 | 2.598 | 2.013 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.926 | 1.298 | 1.047 | 1.546 |
Time: 2.47 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 4.863 | 3.351 | 3.393 | 1.505 |
resnet50 | f16[1,3,224,224] | 5.575 | 3.427 | 3.452 | 1.148 |
model/bert-base-uncased | f32, bs=1, seq=128 | 6.187 | 3.083 | 2.897 | 2.341 |
model/bert-base-uncased | f16, bs=1, seq=128 | 6.351 | 1.426 | 1.098 | 1.823 |
Time: 8.26 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.534 | 1.408 | 1.401 | 1.281 |
resnet50 | f16[1,3,224,224] | 1.622 | 1.195 | 1.214 | 1.045 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.929 | 2.711 | 2.684 | 2.030 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.928 | 1.298 | 1.045 | 1.550 |
Time: 2.48 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 4.791 | 3.366 | 3.390 | 1.508 |
resnet50 | f16[1,3,224,224] | 5.684 | 3.526 | 3.549 | 1.148 |
model/bert-base-uncased | f32, bs=1, seq=128 | 6.243 | 3.083 | 2.898 | 2.355 |
model/bert-base-uncased | f16, bs=1, seq=128 | 6.370 | 1.428 | 1.098 | 1.823 |
Time: 8.40 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.536 | 1.401 | 1.400 | 1.275 |
resnet50 | f16[1,3,224,224] | 1.662 | 1.218 | 1.236 | 1.043 |
model/bert-base-uncased | f32, bs=1, seq=128 | 3.043 | 2.828 | 2.604 | 2.037 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.915 | 1.298 | 1.046 | 1.544 |
Time: 2.49 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 4.791 | 3.338 | 3.374 | 1.479 |
resnet50 | f16[1,3,224,224] | 5.680 | 3.509 | 3.522 | 1.143 |
model/bert-base-uncased | f32, bs=1, seq=128 | 6.283 | 3.082 | 2.898 | 2.337 |
model/bert-base-uncased | f16, bs=1, seq=128 | 6.449 | 1.426 | 1.097 | 1.828 |
Time: 8.35 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.531 | 1.400 | 1.396 | 1.295 |
resnet50 | f16[1,3,224,224] | 1.635 | 1.194 | 1.228 | 1.046 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.909 | 2.800 | 2.561 | 1.982 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.917 | 1.298 | 1.048 | 1.547 |
Time: 2.48 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 4.768 | 3.347 | 3.379 | 1.501 |
resnet50 | f16[1,3,224,224] | 5.737 | 3.504 | 3.555 | 1.154 |
model/bert-base-uncased | f32, bs=1, seq=128 | 6.356 | 3.086 | 2.899 | 2.341 |
model/bert-base-uncased | f16, bs=1, seq=128 | 6.370 | 1.426 | 1.096 | 1.820 |
Time: 8.33 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.548 | 1.409 | 1.404 | 1.314 |
resnet50 | f16[1,3,224,224] | 1.616 | 1.200 | 1.222 | 1.046 |
model/bert-base-uncased | f32, bs=1, seq=128 | 3.095 | 2.855 | 2.680 | 2.015 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.946 | 1.297 | 1.046 | 1.544 |
Time: 2.48 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 4.797 | 3.364 | 3.390 | 1.507 |
resnet50 | f16[1,3,224,224] | 5.691 | 3.506 | 3.550 | 1.146 |
model/bert-base-uncased | f32, bs=1, seq=128 | 6.270 | 3.083 | 2.900 | 2.340 |
model/bert-base-uncased | f16, bs=1, seq=128 | 6.388 | 1.426 | 1.096 | 1.822 |
Time: 8.31 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.536 | 1.412 | 1.405 | 1.298 |
resnet50 | f16[1,3,224,224] | 1.614 | 1.185 | 1.272 | 1.008 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.918 | 2.784 | 2.583 | 1.944 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.960 | 1.297 | 1.045 | 1.616 |
Time: 2.39 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 4.706 | 3.335 | 3.351 | 1.479 |
resnet50 | f16[1,3,224,224] | 5.636 | 3.436 | 3.463 | 1.147 |
model/bert-base-uncased | f32, bs=1, seq=128 | 6.115 | 3.082 | 2.897 | 2.348 |
model/bert-base-uncased | f16, bs=1, seq=128 | 6.329 | 1.426 | 1.097 | 1.891 |
Time: 7.73 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.543 | 1.418 | 1.412 | 1.294 |
resnet50 | f16[1,3,224,224] | 1.702 | 1.335 | 1.366 | 1.044 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.916 | 2.854 | 2.672 | 1.964 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.946 | 1.299 | 1.047 | 1.607 |
Time: 2.66 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 4.709 | 3.314 | 3.365 | 1.477 |
resnet50 | f16[1,3,224,224] | 5.622 | 3.416 | 3.485 | 1.143 |
model/bert-base-uncased | f32, bs=1, seq=128 | 6.157 | 3.082 | 2.897 | 2.356 |
model/bert-base-uncased | f16, bs=1, seq=128 | 6.333 | 1.425 | 1.099 | 1.896 |
Time: 8.77 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.545 | 1.417 | 1.414 | 1.291 |
resnet50 | f16[1,3,224,224] | 1.710 | 1.331 | 1.383 | 1.041 |
model/bert-base-uncased | f32, bs=1, seq=128 | 3.086 | 2.850 | 2.577 | 2.025 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.953 | 1.300 | 1.048 | 1.608 |
Time: 2.65 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 4.820 | 3.326 | 3.360 | 1.479 |
resnet50 | f16[1,3,224,224] | 5.640 | 3.442 | 3.487 | 1.143 |
model/bert-base-uncased | f32, bs=1, seq=128 | 6.292 | 3.085 | 2.901 | 2.366 |
model/bert-base-uncased | f16, bs=1, seq=128 | 6.426 | 1.429 | 1.098 | 1.894 |
Time: 8.86 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.552 | 1.423 | 1.419 | 1.288 |
resnet50 | f16[1,3,224,224] | 1.744 | 1.337 | 1.357 | 1.010 |
model/bert-base-uncased | f32, bs=1, seq=128 | 3.088 | 2.819 | 2.677 | 2.021 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.948 | 1.300 | 1.046 | 1.605 |
Time: 2.69 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 4.733 | 3.347 | 3.367 | 1.481 |
resnet50 | f16[1,3,224,224] | 5.506 | 3.400 | 3.450 | 1.144 |
model/bert-base-uncased | f32, bs=1, seq=128 | 6.109 | 3.079 | 2.895 | 2.352 |
model/bert-base-uncased | f16, bs=1, seq=128 | 6.251 | 1.426 | 1.097 | 1.889 |
Time: 8.77 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.541 | 1.420 | 1.415 | 1.286 |
resnet50 | f16[1,3,224,224] | 1.718 | 1.346 | 1.397 | 1.042 |
model/bert-base-uncased | f32, bs=1, seq=128 | 3.084 | 2.792 | 2.668 | 2.032 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.958 | 1.299 | 1.047 | 1.621 |
Time: 2.68 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 4.667 | 3.340 | 3.375 | 1.479 |
resnet50 | f16[1,3,224,224] | 5.519 | 3.411 | 3.450 | 1.139 |
model/bert-base-uncased | f32, bs=1, seq=128 | 6.129 | 3.081 | 2.897 | 2.354 |
model/bert-base-uncased | f16, bs=1, seq=128 | 6.299 | 1.426 | 1.098 | 1.889 |
Time: 8.73 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.547 | 1.415 | 1.415 | 1.300 |
resnet50 | f16[1,3,224,224] | 1.702 | 1.332 | 1.366 | 1.042 |
model/bert-base-uncased | f32, bs=1, seq=128 | 3.018 | 2.859 | 2.617 | 1.996 |
model/bert-base-uncased | f16, bs=1, seq=128 | 1.971 | 1.298 | 1.046 | 1.610 |
Time: 2.62 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 4.619 | 3.263 | 3.301 | 1.479 |
resnet50 | f16[1,3,224,224] | 5.596 | 3.412 | 3.487 | 1.148 |
model/bert-base-uncased | f32, bs=1, seq=128 | 6.120 | 3.095 | 2.900 | 2.356 |
model/bert-base-uncased | f16, bs=1, seq=128 | 6.278 | 1.427 | 1.097 | 1.890 |
Time: 8.99 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.548 | 1.419 | 1.419 | 1.309 |
resnet50 | f16[1,3,224,224] | 1.729 | 1.341 | 1.356 | 1.007 |
model/bert-base-uncased | f32, bs=1, seq=128 | 3.013 | 2.860 | 2.679 | 1.982 |
model/bert-base-uncased | f16, bs=1, seq=128 | 2.002 | 1.299 | 1.046 | 1.551 |
Time: 2.15 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 4.607 | 3.284 | 3.301 | 1.504 |
resnet50 | f16[1,3,224,224] | 5.514 | 3.397 | 3.425 | 1.138 |
model/bert-base-uncased | f32, bs=1, seq=128 | 6.078 | 3.083 | 2.900 | 2.329 |
model/bert-base-uncased | f16, bs=1, seq=128 | 6.322 | 1.426 | 1.097 | 1.812 |
Time: 7.34 hours
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 1.553 | 1.419 | 1.409 | 1.309 |
resnet50 | f16[1,3,224,224] | 1.718 | 1.345 | 1.362 | 1.039 |
model/bert-base-uncased | f32, bs=1, seq=128 | 2.936 | 2.850 | 2.678 | 2.017 |
model/bert-base-uncased | f16, bs=1, seq=128 | 2.013 | 1.300 | 1.048 | 1.550 |
Time: 2.16 hours
This issue tracks the performance benchmarks of hidet against the other Dynamo backends in PyTorch.
The benchmark scripts that produce these reports are located at hidet/scripts/bench.
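
For anyone comparing the columns of a single report, here is a minimal sketch that parses one pipe-separated table in the format used above and computes each backend's ratio against the `eager` column. It is not part of hidet/scripts/bench. The helper names `parse_report` and `ratio_to_eager` are hypothetical, and the sketch assumes the numeric cells are latencies where lower is better, which the reports themselves do not state. The sample data is the first report in this issue, copied verbatim.

```python
# One report copied verbatim from this issue (the first table).
REPORT = """
model | inputs | eager | reduce-overhead | max-autotune | hidet(2) |
---|---|---|---|---|---|
resnet50 | f32[1,3,224,224] | 4.656 | 3.261 | 3.302 | 1.481 |
resnet50 | f16[1,3,224,224] | 5.663 | 3.395 | 3.460 | 1.217 |
model/bert-base-uncased | f32, bs=1, seq=128 | 6.060 | 3.095 | 2.920 | 2.335 |
model/bert-base-uncased | f16, bs=1, seq=128 | 6.444 | 1.425 | 1.099 | 1.923 |
Time: 3.35 hours
"""


def parse_report(text):
    """Parse one pipe-separated report into a list of {column: cell} dicts."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    header = [c.strip() for c in lines[0].strip("|").split("|")]
    rows = []
    for ln in lines[1:]:
        if set(ln) <= set("-| "):   # separator row such as ---|---|---
            continue
        if "|" not in ln:           # footer such as "Time: 3.35 hours"
            continue
        cells = [c.strip() for c in ln.strip("|").split("|")]
        rows.append(dict(zip(header, cells)))
    return rows


def ratio_to_eager(rows, baseline="eager"):
    """Return (model, inputs, {backend: eager / backend}) for every row.

    Assumption: cells are latencies, so a ratio above 1 means the backend
    was faster than eager mode.
    """
    results = []
    for row in rows:
        base = float(row[baseline])
        ratios = {
            name: base / float(value)
            for name, value in row.items()
            if name not in ("model", "inputs", baseline)
        }
        results.append((row["model"], row["inputs"], ratios))
    return results


if __name__ == "__main__":
    for model, inputs, ratios in ratio_to_eager(parse_report(REPORT)):
        summary = ", ".join(f"{k}: {v:.2f}x" for k, v in ratios.items())
        print(f"{model} ({inputs}): {summary}")
```

Reporting ratios rather than raw numbers makes reports from different machines or runs easier to compare side by side, since the absolute values in the tables above fall into two visibly different ranges.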