Fixed instruction::replace() logic - Fix CI.

tcgu-amd commented 2 weeks ago

Fixes issues in PR https://github.com/ROCm/AMDMIGraphX/pull/3553 that are causing failures in some of the tests.

Description of 3553:

The previous BFS-based fix doesn't fully work in more complex cases (e.g. it fails in the newly added test case check_replace_dag). This fix uses topological sorting so that instructions are replaced in topological order, which should work for all cases.

More details:

In a dummy scenario of add2(reduce(x), add1(abs(reduce(x)), sin(reduce(x)))), the dependency graph looks like this:

reduce _
        \_abs__
         \_sin__\_add1_
          \_____________\_add2

If we call reduce.replace(), BFS will visit the instructions in the following order:

reduce -> abs -> sin -> add2 -> add1

This causes a shape mismatch error at add2, because add2 is processed before its input add1.

Topologically sorting the instruction graph yields:

reduce -> sin -> abs -> add1 -> add2

This is the correct order in which to process the instructions.

This approach should extend to more complex cases.
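
To make the ordering argument concrete, here is a minimal, self-contained sketch of the example above. It is not the MIGraphX implementation: the `graph` alias, `make_example`, `bfs_order`, `topo_order` and `respects_dependencies` are hypothetical names invented for illustration, and instructions are modeled as plain strings mapped to their consumers. The topological order is computed with a reversed depth-first post-order, which is one common way to do it and an assumption about the approach, not a description of the actual patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy model: each instruction name maps to the instructions that consume it.
using graph = std::map<std::string, std::vector<std::string>>;

// add2(reduce(x), add1(abs(reduce(x)), sin(reduce(x))))
graph make_example()
{
    return {{"reduce", {"abs", "sin", "add2"}},
            {"abs", {"add1"}},
            {"sin", {"add1"}},
            {"add1", {"add2"}},
            {"add2", {}}};
}

// Breadth-first order over the consumers reachable from `start`.
std::vector<std::string> bfs_order(const graph& g, const std::string& start)
{
    std::vector<std::string> order;
    std::set<std::string> seen{start};
    std::queue<std::string> q;
    q.push(start);
    while(!q.empty())
    {
        std::string cur = q.front();
        q.pop();
        order.push_back(cur);
        for(const auto& out : g.at(cur))
        {
            if(seen.insert(out).second)
                q.push(out);
        }
    }
    return order;
}

// Depth-first visit that records a node only after all of its consumers.
void visit(const graph& g,
           const std::string& node,
           std::set<std::string>& done,
           std::vector<std::string>& post)
{
    assert(g.count(node) == 1); // every instruction needs an entry in the map
    if(!done.insert(node).second)
        return;
    for(const auto& out : g.at(node))
        visit(g, out, done, post);
    post.push_back(node);
}

// Topological order: reversed post-order, so producers come before consumers.
std::vector<std::string> topo_order(const graph& g, const std::string& start)
{
    std::set<std::string> done;
    std::vector<std::string> post;
    visit(g, start, done, post);
    std::reverse(post.begin(), post.end());
    return post;
}

// True when every instruction in `order` appears before all of its consumers.
bool respects_dependencies(const graph& g, const std::vector<std::string>& order)
{
    std::map<std::string, std::size_t> pos;
    for(std::size_t i = 0; i < order.size(); ++i)
        pos[order[i]] = i;
    for(const auto& entry : g)
    {
        for(const auto& out : entry.second)
        {
            if(pos.count(entry.first) && pos.count(out) && pos.at(entry.first) > pos.at(out))
                return false;
        }
    }
    return true;
}
```

Calling `bfs_order(make_example(), "reduce")` places add2 ahead of add1, so `respects_dependencies` returns false for it, while the result of `topo_order` passes the same check. In this toy model the reversed post-order needs no in-degree bookkeeping, at the cost of recursion depth proportional to the longest dependency chain.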

codecov[bot] commented 2 weeks ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 92.17%. Comparing base (f5df004) to head (0e4eb3d). Report is 2 commits behind head on develop.

Additional details and impacted files

```diff
@@           Coverage Diff            @@
##           develop    #3574   +/-   ##
========================================
  Coverage    92.17%   92.17%
========================================
  Files          513      513
  Lines        21536    21547   +11
========================================
+ Hits         19851    19862   +11
  Misses        1685     1685
```


migraphx-bot commented 5 days ago

Test	Batch	Rate new 0e4eb3	Rate old f5df00	Diff	Compare
torchvision-resnet50	64	3,257.79	3,256.17	0.05%	:white_check_mark:
torchvision-resnet50_fp16	64	6,981.39	6,985.11	-0.05%	:white_check_mark:
torchvision-densenet121	32	2,434.89	2,435.96	-0.04%	:white_check_mark:
torchvision-densenet121_fp16	32	4,090.78	4,058.51	0.80%	:white_check_mark:
torchvision-inceptionv3	32	1,638.23	1,636.64	0.10%	:white_check_mark:
torchvision-inceptionv3_fp16	32	2,761.72	2,762.67	-0.03%	:white_check_mark:
cadene-inceptionv4	16	777.06	776.18	0.11%	:white_check_mark:
cadene-resnext64x4	16	810.41	811.76	-0.17%	:white_check_mark:
slim-mobilenet	64	7,531.07	7,534.07	-0.04%	:white_check_mark:
slim-nasnetalarge	64	211.38	211.45	-0.03%	:white_check_mark:
slim-resnet50v2	64	3,502.51	3,504.24	-0.05%	:white_check_mark:
bert-mrpc-onnx	8	1,151.97	1,149.67	0.20%	:white_check_mark:
bert-mrpc-tf	1	471.57	463.86	1.66%	:white_check_mark:
pytorch-examples-wlang-gru	1	423.43	420.46	0.70%	:white_check_mark:
pytorch-examples-wlang-lstm	1	382.69	381.56	0.30%	:white_check_mark:
torchvision-resnet50_1	1	802.95	780.85	2.83%	:white_check_mark:
cadene-dpn92_1	1	401.20	405.55	-1.07%	:white_check_mark:
cadene-resnext101_1	1	384.59	383.55	0.27%	:white_check_mark:
onnx-taau-downsample	1	343.27	343.07	0.06%	:white_check_mark:
dlrm-criteoterabyte	1	33.33	33.34	-0.05%	:white_check_mark:
dlrm-criteoterabyte_fp16	1	52.71	52.74	-0.06%	:white_check_mark:
agentmodel	1	9,998.60	8,306.27	20.37%	:high_brightness:
unet_fp16	2	58.89	58.82	0.12%	:white_check_mark:
resnet50v1_fp16	1	1,002.99	1,001.66	0.13%	:white_check_mark:
resnet50v1_int8	1	1,011.13	995.76	1.54%	:white_check_mark:
bert_base_cased_fp16	64	1,171.70	1,171.04	0.06%	:white_check_mark:
bert_large_uncased_fp16	32	363.41	363.62	-0.06%	:white_check_mark:
bert_large_fp16	1	200.31	198.87	0.72%	:white_check_mark:
distilgpt2_fp16	16	2,205.45	2,204.83	0.03%	:white_check_mark:
yolov5s	1	531.88	540.84	-1.66%	:white_check_mark:
tinyllama	1	43.45	43.47	-0.05%	:white_check_mark:
vicuna-fastchat	1	176.36	176.64	-0.16%	:white_check_mark:
whisper-tiny-encoder	1	418.01	418.46	-0.11%	:white_check_mark:
whisper-tiny-decoder	1	428.02	433.85	-1.34%	:white_check_mark:

Check results before merge :high_brightness:

migraphx-bot commented 5 days ago

:white_check_mark: bert-mrpc-onnx: PASSED: MIGraphX meets tolerance

:white_check_mark: bert-mrpc-tf: PASSED: MIGraphX meets tolerance

:white_check_mark: pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance

:white_check_mark: pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance

:white_check_mark: torchvision-resnet50_1: PASSED: MIGraphX meets tolerance

:white_check_mark: cadene-dpn92_1: PASSED: MIGraphX meets tolerance

:white_check_mark: cadene-resnext101_1: PASSED: MIGraphX meets tolerance

:white_check_mark: dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance

:white_check_mark: agentmodel: PASSED: MIGraphX meets tolerance

:white_check_mark: unet: PASSED: MIGraphX meets tolerance

:white_check_mark: resnet50v1: PASSED: MIGraphX meets tolerance

:white_check_mark: bert_base_cased_fp16: PASSED: MIGraphX meets tolerance

:red_circle: bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output

:white_check_mark: bert_large: PASSED: MIGraphX meets tolerance

:white_check_mark: yolov5s: PASSED: MIGraphX meets tolerance

:white_check_mark: tinyllama: PASSED: MIGraphX meets tolerance

:white_check_mark: vicuna-fastchat: PASSED: MIGraphX meets tolerance

:white_check_mark: whisper-tiny-encoder: PASSED: MIGraphX meets tolerance

:white_check_mark: whisper-tiny-decoder: PASSED: MIGraphX meets tolerance

:white_check_mark: distilgpt2_fp16: PASSED: MIGraphX meets tolerance
