ROCm / AMDMIGraphX

AMD's graph optimization engine.
https://rocm.docs.amd.com/projects/AMDMIGraphX/en/latest/
MIT License

Fixed instruction::replace() logic - Fix CI. #3574

Closed tcgu-amd closed 5 days ago

tcgu-amd commented 2 weeks ago

Fixes issues in PR https://github.com/ROCm/AMDMIGraphX/pull/3553 that are causing failures in some of the tests.

Description of 3553:

The previous BFS-based fix doesn't fully work in more complex cases (e.g., it fails the newly added test case check_replace_dag). This fix implements topological sorting so that instructions are replaced in topological order, which should work for all cases.

More details:

Consider a dummy scenario, add2(reduce(x), add1(abs(reduce(x)), sin(reduce(x)))), whose dependency graph looks like this:

reduce _
        \_abs__
         \_sin__\_add1_
          \_____________\_add2

If we call reduce.replace(), BFS will visit the instructions in the following order:

reduce -> abs -> sin -> add2 -> add1

This causes a shape mismatch error at add2 because it is processed before its input add1.
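The BFS visiting order above can be reproduced with a small standalone sketch. It does not use MIGraphX's actual classes: the `graph` alias, `example_outputs`, and `bfs_order` are hypothetical names introduced only for illustration, and the map simply records which instructions consume each instruction's output.

```cpp
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the instruction graph (not MIGraphX's real data
// structure): each instruction maps to the instructions that consume its output.
using graph = std::map<std::string, std::vector<std::string>>;

// The example from the description: add2(reduce, add1(abs(reduce), sin(reduce))).
graph example_outputs()
{
    return {
        {"reduce", {"abs", "sin", "add2"}},
        {"abs", {"add1"}},
        {"sin", {"add1"}},
        {"add1", {"add2"}},
        {"add2", {}},
    };
}

// Plain breadth-first traversal starting from the replaced instruction.
std::vector<std::string> bfs_order(const graph& outputs, const std::string& start)
{
    std::vector<std::string> order;
    std::set<std::string> seen{start};
    std::queue<std::string> pending;
    pending.push(start);
    while(!pending.empty())
    {
        std::string ins = pending.front();
        pending.pop();
        order.push_back(ins);
        auto it = outputs.find(ins);
        if(it == outputs.end())
            continue;
        for(const auto& out : it->second)
        {
            // Enqueue each consumer only once, the first time it is reached.
            if(seen.insert(out).second)
                pending.push(out);
        }
    }
    return order;
}
```

Calling `bfs_order(example_outputs(), "reduce")` visits add2 at depth one, directly after abs and sin, so it is reached before add1 even though add1 is one of its inputs.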

Topologically sorting the instruction graph yields:

reduce -> sin -> abs -> add1 -> add2

This is the correct order in which to process the instructions.

The same approach should extend to more complex cases.
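As a minimal sketch of the idea, the ordering can be computed with a depth-first traversal that records each instruction after all of its consumers and then reverses the result. This is one common way to topologically sort a DAG and is not necessarily how the PR implements it; `graph`, `example_outputs`, and `topo_order` are hypothetical names, not MIGraphX APIs.

```cpp
#include <algorithm>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the instruction graph (not MIGraphX's real data
// structure): each instruction maps to the instructions that consume its output.
using graph = std::map<std::string, std::vector<std::string>>;

// The example from the description: add2(reduce, add1(abs(reduce), sin(reduce))).
graph example_outputs()
{
    return {
        {"reduce", {"abs", "sin", "add2"}},
        {"abs", {"add1"}},
        {"sin", {"add1"}},
        {"add1", {"add2"}},
        {"add2", {}},
    };
}

// Topological order of everything reachable from `start`, computed as the
// reverse of a depth-first post-order. Assumes the graph is acyclic.
std::vector<std::string> topo_order(const graph& outputs, const std::string& start)
{
    std::vector<std::string> post;
    std::set<std::string> visited;
    std::function<void(const std::string&)> visit = [&](const std::string& ins) {
        if(!visited.insert(ins).second)
            return; // already handled through another path
        auto it = outputs.find(ins);
        if(it != outputs.end())
        {
            for(const auto& out : it->second)
                visit(out);
        }
        // Recorded only after every consumer has been recorded.
        post.push_back(ins);
    };
    visit(start);
    std::reverse(post.begin(), post.end());
    return post;
}
```

With the consumer lists ordered as above, `topo_order(example_outputs(), "reduce")` returns reduce, sin, abs, add1, add2. Any valid topological order places add1 before add2, which is the property the replacement logic depends on.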

codecov[bot] commented 2 weeks ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 92.17%. Comparing base (f5df004) to head (0e4eb3d). Report is 2 commits behind head on develop.

Additional details and impacted files

```diff
@@           Coverage Diff            @@
##           develop    #3574   +/-   ##
========================================
  Coverage    92.17%   92.17%
========================================
  Files          513      513
  Lines        21536    21547   +11
========================================
+ Hits         19851    19862   +11
  Misses        1685     1685
```


migraphx-bot commented 5 days ago

| Test | Batch | Rate new (0e4eb3) | Rate old (f5df00) | Diff | Compare |
| --- | --- | --- | --- | --- | --- |
| torchvision-resnet50 | 64 | 3,257.79 | 3,256.17 | 0.05% | :white_check_mark: |
| torchvision-resnet50_fp16 | 64 | 6,981.39 | 6,985.11 | -0.05% | :white_check_mark: |
| torchvision-densenet121 | 32 | 2,434.89 | 2,435.96 | -0.04% | :white_check_mark: |
| torchvision-densenet121_fp16 | 32 | 4,090.78 | 4,058.51 | 0.80% | :white_check_mark: |
| torchvision-inceptionv3 | 32 | 1,638.23 | 1,636.64 | 0.10% | :white_check_mark: |
| torchvision-inceptionv3_fp16 | 32 | 2,761.72 | 2,762.67 | -0.03% | :white_check_mark: |
| cadene-inceptionv4 | 16 | 777.06 | 776.18 | 0.11% | :white_check_mark: |
| cadene-resnext64x4 | 16 | 810.41 | 811.76 | -0.17% | :white_check_mark: |
| slim-mobilenet | 64 | 7,531.07 | 7,534.07 | -0.04% | :white_check_mark: |
| slim-nasnetalarge | 64 | 211.38 | 211.45 | -0.03% | :white_check_mark: |
| slim-resnet50v2 | 64 | 3,502.51 | 3,504.24 | -0.05% | :white_check_mark: |
| bert-mrpc-onnx | 8 | 1,151.97 | 1,149.67 | 0.20% | :white_check_mark: |
| bert-mrpc-tf | 1 | 471.57 | 463.86 | 1.66% | :white_check_mark: |
| pytorch-examples-wlang-gru | 1 | 423.43 | 420.46 | 0.70% | :white_check_mark: |
| pytorch-examples-wlang-lstm | 1 | 382.69 | 381.56 | 0.30% | :white_check_mark: |
| torchvision-resnet50_1 | 1 | 802.95 | 780.85 | 2.83% | :white_check_mark: |
| cadene-dpn92_1 | 1 | 401.20 | 405.55 | -1.07% | :white_check_mark: |
| cadene-resnext101_1 | 1 | 384.59 | 383.55 | 0.27% | :white_check_mark: |
| onnx-taau-downsample | 1 | 343.27 | 343.07 | 0.06% | :white_check_mark: |
| dlrm-criteoterabyte | 1 | 33.33 | 33.34 | -0.05% | :white_check_mark: |
| dlrm-criteoterabyte_fp16 | 1 | 52.71 | 52.74 | -0.06% | :white_check_mark: |
| agentmodel | 1 | 9,998.60 | 8,306.27 | 20.37% | :high_brightness: |
| unet_fp16 | 2 | 58.89 | 58.82 | 0.12% | :white_check_mark: |
| resnet50v1_fp16 | 1 | 1,002.99 | 1,001.66 | 0.13% | :white_check_mark: |
| resnet50v1_int8 | 1 | 1,011.13 | 995.76 | 1.54% | :white_check_mark: |
| bert_base_cased_fp16 | 64 | 1,171.70 | 1,171.04 | 0.06% | :white_check_mark: |
| bert_large_uncased_fp16 | 32 | 363.41 | 363.62 | -0.06% | :white_check_mark: |
| bert_large_fp16 | 1 | 200.31 | 198.87 | 0.72% | :white_check_mark: |
| distilgpt2_fp16 | 16 | 2,205.45 | 2,204.83 | 0.03% | :white_check_mark: |
| yolov5s | 1 | 531.88 | 540.84 | -1.66% | :white_check_mark: |
| tinyllama | 1 | 43.45 | 43.47 | -0.05% | :white_check_mark: |
| vicuna-fastchat | 1 | 176.36 | 176.64 | -0.16% | :white_check_mark: |
| whisper-tiny-encoder | 1 | 418.01 | 418.46 | -0.11% | :white_check_mark: |
| whisper-tiny-decoder | 1 | 428.02 | 433.85 | -1.34% | :white_check_mark: |

Check results before merge :high_brightness:

migraphx-bot commented 5 days ago


:white_check_mark: bert-mrpc-onnx: PASSED: MIGraphX meets tolerance
:white_check_mark: bert-mrpc-tf: PASSED: MIGraphX meets tolerance
:white_check_mark: pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance
:white_check_mark: pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance
:white_check_mark: torchvision-resnet50_1: PASSED: MIGraphX meets tolerance
:white_check_mark: cadene-dpn92_1: PASSED: MIGraphX meets tolerance
:white_check_mark: cadene-resnext101_1: PASSED: MIGraphX meets tolerance
:white_check_mark: dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance
:white_check_mark: agentmodel: PASSED: MIGraphX meets tolerance
:white_check_mark: unet: PASSED: MIGraphX meets tolerance
:white_check_mark: resnet50v1: PASSED: MIGraphX meets tolerance
:white_check_mark: bert_base_cased_fp16: PASSED: MIGraphX meets tolerance
:red_circle: bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output
:white_check_mark: bert_large: PASSED: MIGraphX meets tolerance
:white_check_mark: yolov5s: PASSED: MIGraphX meets tolerance
:white_check_mark: tinyllama: PASSED: MIGraphX meets tolerance
:white_check_mark: vicuna-fastchat: PASSED: MIGraphX meets tolerance
:white_check_mark: whisper-tiny-encoder: PASSED: MIGraphX meets tolerance
:white_check_mark: whisper-tiny-decoder: PASSED: MIGraphX meets tolerance
:white_check_mark: distilgpt2_fp16: PASSED: MIGraphX meets tolerance