ROCm / AMDMIGraphX

AMD's graph optimization engine.
https://rocm.docs.amd.com/projects/AMDMIGraphX/en/latest/
MIT License

Migraphx support for gfx12 #3517

Closed aarushjain29 closed 3 weeks ago

aarushjain29 commented 1 month ago

rocBLAS error: Cannot read /opt/rocm/lib/rocblas/library/TensileLibrary.dat: Illegal seek for GPU arch : gfx1201
List of available TensileLibrary Files :
"/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx908.dat"
"/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx1102.dat"
"/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx906.dat"
"/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx1010.dat"
"/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx90a.dat"
"/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx1100.dat"
"/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx900.dat"
"/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx1101.dat"
"/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx1012.dat"
"/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx1030.dat"
"/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx942.dat"
"/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx941.dat"
"/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx940.dat"
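
The error above lists the Tensile library files that rocBLAS found, and none of them targets gfx1201. As a rough illustration (not part of rocBLAS or MIGraphX), the sketch below extracts the architecture names from such a file list and checks whether a given target is covered. The helper name `available_archs` and the file-name pattern are assumptions inferred only from the paths shown in the log.

```python
import re

# Hypothetical helper: pulls the arch suffix out of names shaped like
# TensileLibrary_lazy_<arch>.dat, the pattern seen in the error log.
ARCH_PATTERN = re.compile(r"TensileLibrary_lazy_(gfx[0-9a-f]+)\.dat$")


def available_archs(file_names):
    """Return the set of GPU arch names found in the given file names."""
    archs = set()
    for name in file_names:
        match = ARCH_PATTERN.search(name)
        if match:
            archs.add(match.group(1))
    return archs


# A few of the paths copied from the error message above.
files = [
    "/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx908.dat",
    "/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx90a.dat",
    "/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx1100.dat",
    "/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx1102.dat",
]

target = "gfx1201"
archs = available_archs(files)
if target not in archs:
    print(f"No Tensile library found for {target}; available: {sorted(archs)}")
```

On a real system the list could be gathered with `glob.glob` over the library directory reported in the log instead of being hard-coded.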

codecov[bot] commented 1 month ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 92.16%. Comparing base (0fb79b4) to head (49152cb). Report is 4 commits behind head on develop.

Additional details and impacted files

```diff
@@           Coverage Diff            @@
##           develop    #3517   +/-   ##
========================================
  Coverage    92.16%   92.16%
========================================
  Files          512      512
  Lines        21401    21401
========================================
  Hits         19724    19724
  Misses        1677     1677
```


migraphx-bot commented 3 weeks ago
| Test | Batch | Rate new 49152c | Rate old b73def | Diff | Compare |
|---|---|---|---|---|---|
| torchvision-resnet50 | 64 | 3,260.90 | 3,257.93 | 0.09% | :white_check_mark: |
| torchvision-resnet50_fp16 | 64 | 6,986.88 | 6,992.99 | -0.09% | :white_check_mark: |
| torchvision-densenet121 | 32 | 2,439.15 | 2,432.26 | 0.28% | :white_check_mark: |
| torchvision-densenet121_fp16 | 32 | 4,086.78 | 4,038.39 | 1.20% | :white_check_mark: |
| torchvision-inceptionv3 | 32 | 1,640.38 | 1,638.89 | 0.09% | :white_check_mark: |
| torchvision-inceptionv3_fp16 | 32 | 2,760.29 | 2,761.69 | -0.05% | :white_check_mark: |
| cadene-inceptionv4 | 16 | 776.08 | 776.39 | -0.04% | :white_check_mark: |
| cadene-resnext64x4 | 16 | 808.08 | 811.37 | -0.41% | :white_check_mark: |
| slim-mobilenet | 64 | 7,534.77 | 7,532.73 | 0.03% | :white_check_mark: |
| slim-nasnetalarge | 64 | 211.52 | 211.42 | 0.05% | :white_check_mark: |
| slim-resnet50v2 | 64 | 3,506.02 | 3,507.25 | -0.04% | :white_check_mark: |
| bert-mrpc-onnx | 8 | 1,150.45 | 1,147.76 | 0.23% | :white_check_mark: |
| bert-mrpc-tf | 1 | 467.70 | 469.91 | -0.47% | :white_check_mark: |
| pytorch-examples-wlang-gru | 1 | 514.97 | 514.96 | 0.00% | :white_check_mark: |
| pytorch-examples-wlang-lstm | 1 | 376.05 | 386.61 | -2.73% | :white_check_mark: |
| torchvision-resnet50_1 | 1 | 768.67 | 772.05 | -0.44% | :white_check_mark: |
| cadene-dpn92_1 | 1 | 397.79 | 398.73 | -0.24% | :white_check_mark: |
| cadene-resnext101_1 | 1 | 383.40 | 383.67 | -0.07% | :white_check_mark: |
| onnx-taau-downsample | 1 | 343.17 | 342.33 | 0.25% | :white_check_mark: |
| dlrm-criteoterabyte | 1 | 33.36 | 33.33 | 0.09% | :white_check_mark: |
| dlrm-criteoterabyte_fp16 | 1 | 52.74 | 52.70 | 0.08% | :white_check_mark: |
| agentmodel | 1 | 8,478.53 | 8,056.20 | 5.24% | :high_brightness: |
| unet_fp16 | 2 | 58.94 | 58.92 | 0.03% | :white_check_mark: |
| resnet50v1_fp16 | 1 | 937.46 | 950.32 | -1.35% | :white_check_mark: |
| resnet50v1_int8 | 1 | 1,013.15 | 1,000.02 | 1.31% | :white_check_mark: |
| bert_base_cased_fp16 | 64 | 1,171.39 | 1,169.24 | 0.18% | :white_check_mark: |
| bert_large_uncased_fp16 | 32 | 363.89 | 363.69 | 0.06% | :white_check_mark: |
| bert_large_fp16 | 1 | 200.53 | 198.89 | 0.82% | :white_check_mark: |
| distilgpt2_fp16 | 16 | 2,202.16 | 2,203.09 | -0.04% | :white_check_mark: |
| yolov5s | 1 | 530.44 | 540.85 | -1.92% | :white_check_mark: |
| tinyllama | 1 | 43.43 | 43.43 | -0.00% | :white_check_mark: |
| vicuna-fastchat | 1 | 176.99 | 170.64 | 3.73% | :high_brightness: |
| whisper-tiny-encoder | 1 | 418.28 | 418.21 | 0.02% | :white_check_mark: |
| whisper-tiny-decoder | 1 | 428.27 | 426.10 | 0.51% | :white_check_mark: |

Check results before merge :high_brightness:
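
The Diff column in the performance report is consistent with a relative change of the new rate against the old rate. The sketch below reproduces that arithmetic for three rows copied from the report. It is an assumption about how the bot derives the column, not its actual code, and the function name `percent_diff` is hypothetical. The threshold the bot uses for the :high_brightness: flag is not stated in the thread, so it is not modeled here.

```python
def percent_diff(new_rate, old_rate):
    """Relative change of new_rate against old_rate, in percent."""
    return (new_rate - old_rate) / old_rate * 100.0


# (new rate, old rate) pairs taken from the report, thousands separators removed.
rows = {
    "torchvision-resnet50": (3260.90, 3257.93),
    "pytorch-examples-wlang-lstm": (376.05, 386.61),
    "agentmodel": (8478.53, 8056.20),
}

for name, (new_rate, old_rate) in rows.items():
    print(f"{name}: {percent_diff(new_rate, old_rate):.2f}%")
```

Rounded to two decimals, these reproduce the reported values of 0.09%, -2.73%, and 5.24% for the three rows.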

migraphx-bot commented 3 weeks ago


     :white_check_mark: bert-mrpc-onnx: PASSED: MIGraphX meets tolerance
     :white_check_mark: bert-mrpc-tf: PASSED: MIGraphX meets tolerance
     :white_check_mark: pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance
     :white_check_mark: pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance
     :white_check_mark: torchvision-resnet50_1: PASSED: MIGraphX meets tolerance
     :white_check_mark: cadene-dpn92_1: PASSED: MIGraphX meets tolerance
     :white_check_mark: cadene-resnext101_1: PASSED: MIGraphX meets tolerance
     :white_check_mark: dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance
     :white_check_mark: agentmodel: PASSED: MIGraphX meets tolerance
     :white_check_mark: unet: PASSED: MIGraphX meets tolerance
     :white_check_mark: resnet50v1: PASSED: MIGraphX meets tolerance
     :white_check_mark: bert_base_cased_fp16: PASSED: MIGraphX meets tolerance
     :red_circle: bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output
     :white_check_mark: bert_large: PASSED: MIGraphX meets tolerance
     :white_check_mark: yolov5s: PASSED: MIGraphX meets tolerance
     :white_check_mark: tinyllama: PASSED: MIGraphX meets tolerance
     :white_check_mark: vicuna-fastchat: PASSED: MIGraphX meets tolerance
     :white_check_mark: whisper-tiny-encoder: PASSED: MIGraphX meets tolerance
     :white_check_mark: whisper-tiny-decoder: PASSED: MIGraphX meets tolerance
     :white_check_mark: distilgpt2_fp16: PASSED: MIGraphX meets tolerance