Closed tcgu-amd closed 5 days ago
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 92.17%. Comparing base (f5df004) to head (0e4eb3d). Report is 2 commits behind head on develop.
Test | Batch | Rate new 0e4eb3 | Rate old f5df00 | Diff | Compare | |
---|---|---|---|---|---|---|
torchvision-resnet50 | 64 | 3,257.79 | 3,256.17 | 0.05% | :white_check_mark: | |
torchvision-resnet50_fp16 | 64 | 6,981.39 | 6,985.11 | -0.05% | :white_check_mark: | |
torchvision-densenet121 | 32 | 2,434.89 | 2,435.96 | -0.04% | :white_check_mark: | |
torchvision-densenet121_fp16 | 32 | 4,090.78 | 4,058.51 | 0.80% | :white_check_mark: | |
torchvision-inceptionv3 | 32 | 1,638.23 | 1,636.64 | 0.10% | :white_check_mark: | |
torchvision-inceptionv3_fp16 | 32 | 2,761.72 | 2,762.67 | -0.03% | :white_check_mark: | |
cadene-inceptionv4 | 16 | 777.06 | 776.18 | 0.11% | :white_check_mark: | |
cadene-resnext64x4 | 16 | 810.41 | 811.76 | -0.17% | :white_check_mark: | |
slim-mobilenet | 64 | 7,531.07 | 7,534.07 | -0.04% | :white_check_mark: | |
slim-nasnetalarge | 64 | 211.38 | 211.45 | -0.03% | :white_check_mark: | |
slim-resnet50v2 | 64 | 3,502.51 | 3,504.24 | -0.05% | :white_check_mark: | |
bert-mrpc-onnx | 8 | 1,151.97 | 1,149.67 | 0.20% | :white_check_mark: | |
bert-mrpc-tf | 1 | 471.57 | 463.86 | 1.66% | :white_check_mark: | |
pytorch-examples-wlang-gru | 1 | 423.43 | 420.46 | 0.70% | :white_check_mark: | |
pytorch-examples-wlang-lstm | 1 | 382.69 | 381.56 | 0.30% | :white_check_mark: | |
torchvision-resnet50_1 | 1 | 802.95 | 780.85 | 2.83% | :white_check_mark: | |
cadene-dpn92_1 | 1 | 401.20 | 405.55 | -1.07% | :white_check_mark: | |
cadene-resnext101_1 | 1 | 384.59 | 383.55 | 0.27% | :white_check_mark: | |
onnx-taau-downsample | 1 | 343.27 | 343.07 | 0.06% | :white_check_mark: | |
dlrm-criteoterabyte | 1 | 33.33 | 33.34 | -0.05% | :white_check_mark: | |
dlrm-criteoterabyte_fp16 | 1 | 52.71 | 52.74 | -0.06% | :white_check_mark: | |
agentmodel | 1 | 9,998.60 | 8,306.27 | 20.37% | :high_brightness: | |
unet_fp16 | 2 | 58.89 | 58.82 | 0.12% | :white_check_mark: | |
resnet50v1_fp16 | 1 | 1,002.99 | 1,001.66 | 0.13% | :white_check_mark: | |
resnet50v1_int8 | 1 | 1,011.13 | 995.76 | 1.54% | :white_check_mark: | |
bert_base_cased_fp16 | 64 | 1,171.70 | 1,171.04 | 0.06% | :white_check_mark: | |
bert_large_uncased_fp16 | 32 | 363.41 | 363.62 | -0.06% | :white_check_mark: | |
bert_large_fp16 | 1 | 200.31 | 198.87 | 0.72% | :white_check_mark: | |
distilgpt2_fp16 | 16 | 2,205.45 | 2,204.83 | 0.03% | :white_check_mark: | |
yolov5s | 1 | 531.88 | 540.84 | -1.66% | :white_check_mark: | |
tinyllama | 1 | 43.45 | 43.47 | -0.05% | :white_check_mark: | |
vicuna-fastchat | 1 | 176.36 | 176.64 | -0.16% | :white_check_mark: | |
whisper-tiny-encoder | 1 | 418.01 | 418.46 | -0.11% | :white_check_mark: | |
whisper-tiny-decoder | 1 | 428.02 | 433.85 | -1.34% | :white_check_mark: |
Check results before merge :high_brightness:
:red_circle: bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output
Fixing issues in PR https://github.com/ROCm/AMDMIGraphX/pull/3553 that are causing failures in some of the tests.
Description of 3553:
The previous fix with BFS doesn't fully work in more complex cases (e.g. it will fail in the newly added test case `check_replace_dag`). This fix implements topological sorting to replace instructions in topological order, which should work for all cases.
More details:
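In a dummy scenario of `add2(reduce(x), add1(abs(reduce(x)), sin(reduce(x))))`, we will have a dependency tree in which `reduce` feeds `abs`, `sin`, and `add2`; `abs` and `sin` feed `add1`; and `add1` feeds `add2`.
If we call `reduce.replace()`, BFS will visit the instructions level by level, so it reaches `add2` (a direct output of `reduce`) before `add1`.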
This causes a shape-mismatch error at `add2` because it is visited before its input `add1`.
Topologically sorting the instruction tree instead places `add1` before `add2`, which is the correct order to process the instructions.
This should be able to extend to more complex cases.
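To make the ordering problem concrete, here is a minimal Python sketch of the example above. It is not the MIGraphX implementation: the `outputs` map and the helpers `bfs_order`, `topological_order`, and `respects_dependencies` are hypothetical names for illustration, and the topological sort shown is a plain reversed depth-first post-order, which may differ from what the PR actually does.

```python
from collections import deque

# Hypothetical adjacency map for the example expression
#   add2(reduce(x), add1(abs(reduce(x)), sin(reduce(x))))
# Each key maps an instruction to the instructions that consume its output.
outputs = {
    "reduce": ["add2", "abs", "sin"],
    "abs": ["add1"],
    "sin": ["add1"],
    "add1": ["add2"],
    "add2": [],
}


def bfs_order(start, outputs):
    """Visit instructions breadth-first from `start`, each one once."""
    order, seen, queue = [], {start}, deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for out in outputs[node]:
            if out not in seen:
                seen.add(out)
                queue.append(out)
    return order


def topological_order(start, outputs):
    """Order instructions reachable from `start` so inputs precede outputs.

    Uses a depth-first post-order traversal and reverses the result.
    """
    order, seen = [], set()

    def visit(node):
        if node in seen:
            return
        seen.add(node)
        for out in outputs[node]:
            visit(out)
        order.append(node)  # appended only after all of its consumers

    visit(start)
    return order[::-1]


def respects_dependencies(order, outputs):
    """Return True if every instruction appears before all of its consumers."""
    position = {node: i for i, node in enumerate(order)}
    return all(position[n] < position[o] for n in order for o in outputs[n])


if __name__ == "__main__":
    print("BFS:        ", bfs_order("reduce", outputs))
    print("Topological:", topological_order("reduce", outputs))
```

In this sketch, BFS places `add2` ahead of `add1` because `add2` is a direct output of `reduce`, while the topological order always puts `add1` first, so `respects_dependencies` holds only for the latter.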