I tested this after building on ROCm 6.1.2. The tests run wonderfully on my RX 7900 XTX:
eitch@eitchtower:~/src/compile_temp/pytorch-gpu-benchmark$ export CUDA_VISIBLE_DEVICES=0
eitch@eitchtower:~/src/compile_temp/pytorch-gpu-benchmark$ ./test.sh
AMD gpu benchmarks starting
GPU count: 1
hip_fatbin.cpp: COMGR API could not find the CO for this GPU device/ISA: amdgcn-amd-amdhsa--gfx1100
hip_fatbin.cpp: COMGR API could not find the CO for this GPU device/ISA: amdgcn-amd-amdhsa--gfx1100
[2024-06-24 21:46:25,050] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[WARNING] Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH
[WARNING] sparse_attn is not compatible with ROCM
benchmark start : 2024/06/24 21:46:26
Number of GPUs on current device : 1
CUDA Version : None
Cudnn Version : 3001000
Device Name : AMD Radeon RX 7900 XTX
uname_result(system='Linux', node='eitchtower', release='6.9.3-060903-generic', version='#202405300957 SMP PREEMPT_DYNAMIC Thu May 30 11:39:13 UTC 2024', machine='x86_64')
scpufreq(current=4139.385343749999, min=400.0, max=5759.0)
cpu_count: 32
memory_available: 36787560448
Benchmarking Training float precision type mnasnet0_5
mnasnet0_5 model average train time : 24.533867835998535ms
Benchmarking Training float precision type mnasnet0_75
mnasnet0_75 model average train time : 27.247414588928223ms
Benchmarking Training float precision type mnasnet1_0
mnasnet1_0 model average train time : 30.473084449768066ms
Benchmarking Training float precision type mnasnet1_3
mnasnet1_3 model average train time : 37.2760009765625ms
Benchmarking Training float precision type resnet18
resnet18 model average train time : 12.391281127929688ms
Benchmarking Training float precision type resnet34
resnet34 model average train time : 17.9769229888916ms
Benchmarking Training float precision type resnet50
resnet50 model average train time : 33.61192226409912ms
Benchmarking Training float precision type resnet101
resnet101 model average train time : 51.551666259765625ms
Benchmarking Training float precision type resnet152
resnet152 model average train time : 70.84245681762695ms
Benchmarking Training float precision type resnext50_32x4d
resnext50_32x4d model average train time : 42.66497611999512ms
Benchmarking Training float precision type resnext101_32x8d
resnext101_32x8d model average train time : 101.69620990753174ms
Benchmarking Training float precision type resnext101_64x4d
resnext101_64x4d model average train time : 102.70112991333008ms
Benchmarking Training float precision type wide_resnet50_2
wide_resnet50_2 model average train time : 52.88909912109375ms
Benchmarking Training float precision type wide_resnet101_2
wide_resnet101_2 model average train time : 86.19845867156982ms
Benchmarking Training float precision type densenet121
densenet121 model average train time : 45.270280838012695ms
Benchmarking Training float precision type densenet161
densenet161 model average train time : 78.27514171600342ms
Benchmarking Training float precision type densenet169
densenet169 model average train time : 59.1752815246582ms
Benchmarking Training float precision type densenet201
densenet201 model average train time : 70.14707088470459ms
Benchmarking Training float precision type squeezenet1_0
squeezenet1_0 model average train time : 22.512826919555664ms
Benchmarking Training float precision type squeezenet1_1
squeezenet1_1 model average train time : 12.033023834228516ms
Benchmarking Training float precision type vgg11
vgg11 model average train time : 29.853954315185547ms
Benchmarking Training float precision type vgg11_bn
vgg11_bn model average train time : 33.177428245544434ms
Benchmarking Training float precision type vgg13
vgg13 model average train time : 48.360090255737305ms
Benchmarking Training float precision type vgg13_bn
vgg13_bn model average train time : 53.86557579040527ms
Benchmarking Training float precision type vgg16
vgg16 model average train time : 55.60842990875244ms
Benchmarking Training float precision type vgg16_bn
vgg16_bn model average train time : 61.618404388427734ms
Benchmarking Training float precision type vgg19
vgg19 model average train time : 63.11783790588379ms
Benchmarking Training float precision type vgg19_bn
vgg19_bn model average train time : 69.60336208343506ms
Benchmarking Training float precision type mobilenet_v3_large
mobilenet_v3_large model average train time : 28.548202514648438ms
Benchmarking Training float precision type mobilenet_v3_small
mobilenet_v3_small model average train time : 18.4964656829834ms
Benchmarking Training float precision type shufflenet_v2_x0_5
shufflenet_v2_x0_5 model average train time : 18.908543586730957ms
Benchmarking Training float precision type shufflenet_v2_x1_0
shufflenet_v2_x1_0 model average train time : 20.006709098815918ms
Benchmarking Training float precision type shufflenet_v2_x1_5
shufflenet_v2_x1_5 model average train time : 20.983824729919434ms
Benchmarking Training float precision type shufflenet_v2_x2_0
shufflenet_v2_x2_0 model average train time : 22.15290069580078ms
Benchmarking Inference float precision type mnasnet0_5
mnasnet0_5 model average inference time : 5.2114057540893555ms
Benchmarking Inference float precision type mnasnet0_75
mnasnet0_75 model average inference time : 6.540632247924805ms
Benchmarking Inference float precision type mnasnet1_0
mnasnet1_0 model average inference time : 7.575473785400391ms
Benchmarking Inference float precision type mnasnet1_3
mnasnet1_3 model average inference time : 9.477519989013672ms
Benchmarking Inference float precision type resnet18
resnet18 model average inference time : 4.3056440353393555ms
Benchmarking Inference float precision type resnet34
resnet34 model average inference time : 5.031719207763672ms
Benchmarking Inference float precision type resnet50
resnet50 model average inference time : 8.668785095214844ms
Benchmarking Inference float precision type resnet101
resnet101 model average inference time : 13.600969314575195ms
Benchmarking Inference float precision type resnet152
resnet152 model average inference time : 18.800125122070312ms
Benchmarking Inference float precision type resnext50_32x4d
resnext50_32x4d model average inference time : 11.131224632263184ms
Benchmarking Inference float precision type resnext101_32x8d
resnext101_32x8d model average inference time : 27.71352767944336ms
Benchmarking Inference float precision type resnext101_64x4d
resnext101_64x4d model average inference time : 28.522496223449707ms
Benchmarking Inference float precision type wide_resnet50_2
wide_resnet50_2 model average inference time : 14.661903381347656ms
Benchmarking Inference float precision type wide_resnet101_2
wide_resnet101_2 model average inference time : 24.688773155212402ms
Benchmarking Inference float precision type densenet121
densenet121 model average inference time : 11.288576126098633ms
Benchmarking Inference float precision type densenet161
densenet161 model average inference time : 23.62297534942627ms
Benchmarking Inference float precision type densenet169
densenet169 model average inference time : 15.222225189208984ms
Benchmarking Inference float precision type densenet201
densenet201 model average inference time : 20.073328018188477ms
Benchmarking Inference float precision type squeezenet1_0
squeezenet1_0 model average inference time : 4.4448137283325195ms
Benchmarking Inference float precision type squeezenet1_1
squeezenet1_1 model average inference time : 4.243755340576172ms
Benchmarking Inference float precision type vgg11
vgg11 model average inference time : 8.276267051696777ms
Benchmarking Inference float precision type vgg11_bn
vgg11_bn model average inference time : 9.103899002075195ms
Benchmarking Inference float precision type vgg13
vgg13 model average inference time : 10.58922290802002ms
Benchmarking Inference float precision type vgg13_bn
vgg13_bn model average inference time : 12.26208209991455ms
Benchmarking Inference float precision type vgg16
vgg16 model average inference time : 12.456426620483398ms
Benchmarking Inference float precision type vgg16_bn
vgg16_bn model average inference time : 14.116816520690918ms
Benchmarking Inference float precision type vgg19
vgg19 model average inference time : 14.295086860656738ms
Benchmarking Inference float precision type vgg19_bn
vgg19_bn model average inference time : 15.971732139587402ms
Benchmarking Inference float precision type mobilenet_v3_large
mobilenet_v3_large model average inference time : 6.626443862915039ms
Benchmarking Inference float precision type mobilenet_v3_small
mobilenet_v3_small model average inference time : 5.263700485229492ms
Benchmarking Inference float precision type shufflenet_v2_x0_5
shufflenet_v2_x0_5 model average inference time : 6.213779449462891ms
Benchmarking Inference float precision type shufflenet_v2_x1_0
shufflenet_v2_x1_0 model average inference time : 6.076226234436035ms
Benchmarking Inference float precision type shufflenet_v2_x1_5
shufflenet_v2_x1_5 model average inference time : 6.232490539550781ms
Benchmarking Inference float precision type shufflenet_v2_x2_0
shufflenet_v2_x2_0 model average inference time : 6.553955078125ms
Benchmarking Training half precision type mnasnet0_5
mnasnet0_5 model average train time : 24.50690269470215ms
Benchmarking Training half precision type mnasnet0_75
mnasnet0_75 model average train time : 26.351070404052734ms
Benchmarking Training half precision type mnasnet1_0
mnasnet1_0 model average train time : 27.649965286254883ms
Benchmarking Training half precision type mnasnet1_3
mnasnet1_3 model average train time : 31.68992042541504ms
Benchmarking Training half precision type resnet18
resnet18 model average train time : 9.666085243225098ms
Benchmarking Training half precision type resnet34
resnet34 model average train time : 14.544730186462402ms
Benchmarking Training half precision type resnet50
resnet50 model average train time : 25.299043655395508ms
Benchmarking Training half precision type resnet101
resnet101 model average train time : 39.463138580322266ms
Benchmarking Training half precision type resnet152
resnet152 model average train time : 53.73621463775635ms
Benchmarking Training half precision type resnext50_32x4d
resnext50_32x4d model average train time : 24.766650199890137ms
Benchmarking Training half precision type resnext101_32x8d
resnext101_32x8d model average train time : 53.93174648284912ms
Benchmarking Training half precision type resnext101_64x4d
resnext101_64x4d model average train time : 55.17936706542969ms
Benchmarking Training half precision type wide_resnet50_2
wide_resnet50_2 model average train time : 32.47838020324707ms
Benchmarking Training half precision type wide_resnet101_2
wide_resnet101_2 model average train time : 52.752790451049805ms
Benchmarking Training half precision type densenet121
densenet121 model average train time : 45.07422924041748ms
Benchmarking Training half precision type densenet161
densenet161 model average train time : 57.78000354766846ms
Benchmarking Training half precision type densenet169
densenet169 model average train time : 59.63819980621338ms
Benchmarking Training half precision type densenet201
densenet201 model average train time : 71.99620723724365ms
Benchmarking Training half precision type squeezenet1_0
squeezenet1_0 model average train time : 10.245232582092285ms
Benchmarking Training half precision type squeezenet1_1
squeezenet1_1 model average train time : 10.709238052368164ms
Benchmarking Training half precision type vgg11
vgg11 model average train time : 17.7616024017334ms
Benchmarking Training half precision type vgg11_bn
vgg11_bn model average train time : 21.45728588104248ms
Benchmarking Training half precision type vgg13
vgg13 model average train time : 28.328700065612793ms
Benchmarking Training half precision type vgg13_bn
vgg13_bn model average train time : 34.90305423736572ms
Benchmarking Training half precision type vgg16
vgg16 model average train time : 32.26703643798828ms
Benchmarking Training half precision type vgg16_bn
vgg16_bn model average train time : 39.304375648498535ms
Benchmarking Training half precision type vgg19
vgg19 model average train time : 36.43730640411377ms
Benchmarking Training half precision type vgg19_bn
vgg19_bn model average train time : 43.86319160461426ms
Benchmarking Training half precision type mobilenet_v3_large
mobilenet_v3_large model average train time : 28.963427543640137ms
Benchmarking Training half precision type mobilenet_v3_small
mobilenet_v3_small model average train time : 19.338665008544922ms
Benchmarking Training half precision type shufflenet_v2_x0_5
shufflenet_v2_x0_5 model average train time : 19.24680233001709ms
Benchmarking Training half precision type shufflenet_v2_x1_0
shufflenet_v2_x1_0 model average train time : 24.048876762390137ms
Benchmarking Training half precision type shufflenet_v2_x1_5
shufflenet_v2_x1_5 model average train time : 20.940065383911133ms
Benchmarking Training half precision type shufflenet_v2_x2_0
shufflenet_v2_x2_0 model average train time : 22.82121181488037ms
Benchmarking Inference half precision type mnasnet0_5
mnasnet0_5 model average inference time : 4.995217323303223ms
Benchmarking Inference half precision type mnasnet0_75
mnasnet0_75 model average inference time : 5.409712791442871ms
Benchmarking Inference half precision type mnasnet1_0
mnasnet1_0 model average inference time : 6.281166076660156ms
Benchmarking Inference half precision type mnasnet1_3
mnasnet1_3 model average inference time : 8.172283172607422ms
Benchmarking Inference half precision type resnet18
resnet18 model average inference time : 3.6522817611694336ms
Benchmarking Inference half precision type resnet34
resnet34 model average inference time : 5.353398323059082ms
Benchmarking Inference half precision type resnet50
resnet50 model average inference time : 6.012873649597168ms
Benchmarking Inference half precision type resnet101
resnet101 model average inference time : 9.429936408996582ms
Benchmarking Inference half precision type resnet152
resnet152 model average inference time : 15.310559272766113ms
Benchmarking Inference half precision type resnext50_32x4d
resnext50_32x4d model average inference time : 7.742910385131836ms
Benchmarking Inference half precision type resnext101_32x8d
resnext101_32x8d model average inference time : 15.338869094848633ms
Benchmarking Inference half precision type resnext101_64x4d
resnext101_64x4d model average inference time : 16.386327743530273ms
Benchmarking Inference half precision type wide_resnet50_2
wide_resnet50_2 model average inference time : 8.10272216796875ms
Benchmarking Inference half precision type wide_resnet101_2
wide_resnet101_2 model average inference time : 15.152497291564941ms
Benchmarking Inference half precision type densenet121
densenet121 model average inference time : 12.424025535583496ms
Benchmarking Inference half precision type densenet161
densenet161 model average inference time : 18.78988742828369ms
Benchmarking Inference half precision type densenet169
densenet169 model average inference time : 16.126089096069336ms
Benchmarking Inference half precision type densenet201
densenet201 model average inference time : 18.758788108825684ms
Benchmarking Inference half precision type squeezenet1_0
squeezenet1_0 model average inference time : 4.37924861907959ms
Benchmarking Inference half precision type squeezenet1_1
squeezenet1_1 model average inference time : 3.2996225357055664ms
Benchmarking Inference half precision type vgg11
vgg11 model average inference time : 5.677704811096191ms
Benchmarking Inference half precision type vgg11_bn
vgg11_bn model average inference time : 4.802088737487793ms
Benchmarking Inference half precision type vgg13
vgg13 model average inference time : 5.681405067443848ms
Benchmarking Inference half precision type vgg13_bn
vgg13_bn model average inference time : 6.407041549682617ms
Benchmarking Inference half precision type vgg16
vgg16 model average inference time : 6.921801567077637ms
Benchmarking Inference half precision type vgg16_bn
vgg16_bn model average inference time : 7.687211036682129ms
Benchmarking Inference half precision type vgg19
vgg19 model average inference time : 8.11396598815918ms
Benchmarking Inference half precision type vgg19_bn
vgg19_bn model average inference time : 8.993682861328125ms
Benchmarking Inference half precision type mobilenet_v3_large
mobilenet_v3_large model average inference time : 6.083874702453613ms
Benchmarking Inference half precision type mobilenet_v3_small
mobilenet_v3_small model average inference time : 5.050396919250488ms
Benchmarking Inference half precision type shufflenet_v2_x0_5
shufflenet_v2_x0_5 model average inference time : 7.064704895019531ms
Benchmarking Inference half precision type shufflenet_v2_x1_0
shufflenet_v2_x1_0 model average inference time : 6.381711959838867ms
Benchmarking Inference half precision type shufflenet_v2_x1_5
shufflenet_v2_x1_5 model average inference time : 5.966534614562988ms
Benchmarking Inference half precision type shufflenet_v2_x2_0
shufflenet_v2_x2_0 model average inference time : 6.6403961181640625ms
Benchmarking Training double precision type mnasnet0_5
mnasnet0_5 model average train time : 142.28407382965088ms
Benchmarking Training double precision type mnasnet0_75
mnasnet0_75 model average train time : 159.4767141342163ms
Benchmarking Training double precision type mnasnet1_0
mnasnet1_0 model average train time : 174.5019769668579ms
Benchmarking Training double precision type mnasnet1_3
mnasnet1_3 model average train time : 198.3668088912964ms
Benchmarking Training double precision type resnet18
resnet18 model average train time : 228.55446815490723ms
Benchmarking Training double precision type resnet34
resnet34 model average train time : 442.7056550979614ms
Benchmarking Training double precision type resnet50
resnet50 model average train time : 535.2924394607544ms
Benchmarking Training double precision type resnet101
resnet101 model average train time : 999.907341003418ms
Benchmarking Training double precision type resnet152
resnet152 model average train time : 1457.532353401184ms
Benchmarking Training double precision type resnext50_32x4d
resnext50_32x4d model average train time : 1505.1169776916504ms
Benchmarking Training double precision type resnext101_32x8d
resnext101_32x8d model average train time : 3316.7404174804688ms
Benchmarking Training double precision type resnext101_64x4d
resnext101_64x4d model average train time : 4551.7040729522705ms
Benchmarking Training double precision type wide_resnet50_2
wide_resnet50_2 model average train time : 1192.6723003387451ms
Benchmarking Training double precision type wide_resnet101_2
wide_resnet101_2 model average train time : 2344.0072059631348ms
Benchmarking Training double precision type densenet121
densenet121 model average train time : 534.3494749069214ms
Benchmarking Training double precision type densenet161
densenet161 model average train time : 1218.3501815795898ms
Benchmarking Training double precision type densenet169
densenet169 model average train time : 703.4244537353516ms
Benchmarking Training double precision type densenet201
densenet201 model average train time : 884.4409513473511ms
Benchmarking Training double precision type squeezenet1_0
squeezenet1_0 model average train time : 140.78056812286377ms
Benchmarking Training double precision type squeezenet1_1
squeezenet1_1 model average train time : 98.9573621749878ms
Benchmarking Training double precision type vgg11
vgg11 model average train time : 739.9577236175537ms
Benchmarking Training double precision type vgg11_bn
vgg11_bn model average train time : 752.3213958740234ms
Benchmarking Training double precision type vgg13
vgg13 model average train time : 1079.9136638641357ms
Benchmarking Training double precision type vgg13_bn
vgg13_bn model average train time : 1100.167212486267ms
Benchmarking Training double precision type vgg16
vgg16 model average train time : 1426.78795337677ms
Benchmarking Training double precision type vgg16_bn
vgg16_bn model average train time : 1449.358434677124ms
Benchmarking Training double precision type vgg19
vgg19 model average train time : 1773.5577726364136ms
Benchmarking Training double precision type vgg19_bn
vgg19_bn model average train time : 1798.6316680908203ms
Benchmarking Training double precision type mobilenet_v3_large
mobilenet_v3_large model average train time : 166.41663074493408ms
Benchmarking Training double precision type mobilenet_v3_small
mobilenet_v3_small model average train time : 77.64487743377686ms
Benchmarking Training double precision type shufflenet_v2_x0_5
shufflenet_v2_x0_5 model average train time : 65.75040817260742ms
Benchmarking Training double precision type shufflenet_v2_x1_0
shufflenet_v2_x1_0 model average train time : 81.11960887908936ms
Benchmarking Training double precision type shufflenet_v2_x1_5
shufflenet_v2_x1_5 model average train time : 94.46946620941162ms
Benchmarking Training double precision type shufflenet_v2_x2_0
shufflenet_v2_x2_0 model average train time : 136.83892250061035ms
Benchmarking Inference double precision type mnasnet0_5
mnasnet0_5 model average inference time : 15.575079917907715ms
Benchmarking Inference double precision type mnasnet0_75
mnasnet0_75 model average inference time : 22.316560745239258ms
Benchmarking Inference double precision type mnasnet1_0
mnasnet1_0 model average inference time : 27.872934341430664ms
Benchmarking Inference double precision type mnasnet1_3
mnasnet1_3 model average inference time : 36.5085506439209ms
Benchmarking Inference double precision type resnet18
resnet18 model average inference time : 88.40462684631348ms
Benchmarking Inference double precision type resnet34
resnet34 model average inference time : 182.20274925231934ms
Benchmarking Inference double precision type resnet50
resnet50 model average inference time : 185.62331676483154ms
Benchmarking Inference double precision type resnet101
resnet101 model average inference time : 376.9978427886963ms
Benchmarking Inference double precision type resnet152
resnet152 model average inference time : 558.89732837677ms
Benchmarking Inference double precision type resnext50_32x4d
resnext50_32x4d model average inference time : 319.150128364563ms
Benchmarking Inference double precision type resnext101_32x8d
resnext101_32x8d model average inference time : 1084.0533113479614ms
Benchmarking Inference double precision type resnext101_64x4d
resnext101_64x4d model average inference time : 1202.0014142990112ms
Benchmarking Inference double precision type wide_resnet50_2
wide_resnet50_2 model average inference time : 438.0705261230469ms
Benchmarking Inference double precision type wide_resnet101_2
wide_resnet101_2 model average inference time : 887.296257019043ms
Benchmarking Inference double precision type densenet121
densenet121 model average inference time : 230.21681308746338ms
Benchmarking Inference double precision type densenet161
densenet161 model average inference time : 535.5282545089722ms
Benchmarking Inference double precision type densenet169
densenet169 model average inference time : 335.4950523376465ms
Benchmarking Inference double precision type densenet201
densenet201 model average inference time : 428.5811185836792ms
Benchmarking Inference double precision type squeezenet1_0
squeezenet1_0 model average inference time : 31.123065948486328ms
Benchmarking Inference double precision type squeezenet1_1
squeezenet1_1 model average inference time : 18.499469757080078ms
Benchmarking Inference double precision type vgg11
vgg11 model average inference time : 236.25004768371582ms
Benchmarking Inference double precision type vgg11_bn
vgg11_bn model average inference time : 238.0146026611328ms
Benchmarking Inference double precision type vgg13
vgg13 model average inference time : 338.84942531585693ms
Benchmarking Inference double precision type vgg13_bn
vgg13_bn model average inference time : 341.84242725372314ms
Benchmarking Inference double precision type vgg16
vgg16 model average inference time : 461.97062492370605ms
Benchmarking Inference double precision type vgg16_bn
vgg16_bn model average inference time : 465.16629695892334ms
Benchmarking Inference double precision type vgg19
vgg19 model average inference time : 585.0282430648804ms
Benchmarking Inference double precision type vgg19_bn
vgg19_bn model average inference time : 588.3558559417725ms
Benchmarking Inference double precision type mobilenet_v3_large
mobilenet_v3_large model average inference time : 29.41577434539795ms
Benchmarking Inference double precision type mobilenet_v3_small
mobilenet_v3_small model average inference time : 15.864949226379395ms
Benchmarking Inference double precision type shufflenet_v2_x0_5
shufflenet_v2_x0_5 model average inference time : 9.836525917053223ms
Benchmarking Inference double precision type shufflenet_v2_x1_0
shufflenet_v2_x1_0 model average inference time : 15.541191101074219ms
Benchmarking Inference double precision type shufflenet_v2_x1_5
shufflenet_v2_x1_5 model average inference time : 21.227006912231445ms
Benchmarking Inference double precision type shufflenet_v2_x2_0
shufflenet_v2_x2_0 model average inference time : 35.22214889526367ms
benchmark end : 2024/06/24 22:34:46
AMD GPU benchmarks finished
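If anyone wants to compare these numbers across precisions without reading the whole log, here is a minimal sketch of a parser. It assumes the two-line format shown above (a "Benchmarking ..." header followed by a "... model average ... time" line); the function name `parse_benchmark_log` and the `SAMPLE` string are my own illustration, not part of pytorch-gpu-benchmark.

```python
import re

# Patterns assume the exact line shapes seen in the log above.
HEADER = re.compile(
    r"^Benchmarking (Training|Inference) (float|half|double) precision type (\S+)$"
)
RESULT = re.compile(r"^(\S+) model average (train|inference) time : ([0-9.]+)ms$")


def parse_benchmark_log(text):
    """Return {(phase, precision, model): milliseconds} parsed from log text."""
    results = {}
    current = None  # key from the most recent header line, if any
    for line in text.splitlines():
        line = line.strip()
        header = HEADER.match(line)
        if header:
            current = (header.group(1).lower(), header.group(2), header.group(3))
            continue
        result = RESULT.match(line)
        # Only accept a timing line that belongs to the header just seen.
        if result and current and result.group(1) == current[2]:
            results[current] = float(result.group(3))
            current = None
    return results


# Two entries copied from the log above, used here as sample input.
SAMPLE = """\
Benchmarking Training float precision type resnet50
resnet50 model average train time : 33.61192226409912ms
Benchmarking Training half precision type resnet50
resnet50 model average train time : 25.299043655395508ms
"""

timings = parse_benchmark_log(SAMPLE)
speedup = (
    timings[("training", "float", "resnet50")]
    / timings[("training", "half", "resnet50")]
)
print(f"resnet50 training, float vs half: {speedup:.2f}x")
```

To run it on the full output, save the log to a file and pass its contents to `parse_benchmark_log` instead of `SAMPLE`. Lines that do not match either pattern (warnings, system info) are simply skipped.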