migraphx-benchmark / AMDMIGraphX

AMD's graph optimization engine.
https://rocmsoftwareplatform.github.io/AMDMIGraphX/doc/html/
MIT License

ONNX Model Zoo models #141

Open gyulaz-htec opened 1 year ago

gyulaz-htec commented 1 year ago

Latest results can be found here for FP32 and FP16

This is a list of MIGraphX errors from the ONNX Model Zoo.

The main issues:

attila-dusnoki-htec commented 1 year ago

Yolov3 Concat issue

The model converts but does not come with default input shapes. The documentation does, however, state the required shapes. Running with `migraphx-driver compile yolov3-10.onnx --input-dim @input_1 1 3 416 416 --input-dim @image_shape 1 2` gets past the wrong-concat-axes issue.

It then runs into: `what(): /code/AMDMIGraphX/src/include/migraphx/op/reshape_lazy.hpp:238: static_compute_shape: reshape_lazy on axis that is not packed`. Disabling `add_reshape_lazy_op();` in `lowering.cpp` results in a completely compiled model.

attila-dusnoki-htec commented 1 year ago

StyleTransfer Convolution issue

The problem is with Upsample: version 7 takes scales as an attribute, while version 9 takes scales as an input. In MIGraphX, `parse_resize` is used for Upsample, and it only checks for the input, not for the attribute.

Important to note that Upsample is deprecated.

Update 1) After parsing the attribute, the issue is still present. The older model (version 8) contains this literal for the gather: `@0 = @literal{31342768} -> int32_type, {0, 0, 0, 0}, {0, 0, 0, 1}, target_id=0` instead of what the newer model (version 9) has: `@11 = @literal{ ... } -> int32_type, {1, 128, 112, 112}, {1605632, 12544, 112, 1}, target_id=0`. This leads to a shape mismatch for the convolution.

Update 2) I parsed the scales but missed updating the output lens with them. Now the model compiles correctly.
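The fix described in the two updates can be sketched as follows: read the scales from the attribute (Upsample-7) or from the second input (Upsample-9), then scale the output lens by them. The helper names and example shapes below are illustrative, not MIGraphX's actual code.

```python
def get_scales(attributes, inputs, initializers):
    # Upsample-7 carries scales as an attribute; Upsample-9 moves them
    # to a second input tensor. A parser must check both places.
    if "scales" in attributes:
        return attributes["scales"]
    if len(inputs) > 1:
        return initializers[inputs[1]]
    raise ValueError("Upsample: no scales found")

def upsample_out_lens(in_lens, scales):
    # The output lens must be scaled too, otherwise the following
    # convolution sees a mismatched shape (the bug fixed in Update 2).
    return [int(d * s) for d, s in zip(in_lens, scales)]

# Example: version-9 style, with scales supplied as an input initializer.
scales = get_scales({}, ["X", "scales"], {"scales": [1.0, 1.0, 2.0, 2.0]})
print(upsample_out_lens([1, 128, 112, 112], scales))  # [1, 128, 224, 224]
```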

Tracking issue: https://github.com/migraphx-benchmark/AMDMIGraphX/issues/142

attila-dusnoki-htec commented 1 year ago

GEMM issue

parse: PARSE_GEMM: A and B should be rank 2, A is rank 4, B is rank 2

These are from ShuffleNet:

shufflenet-3.onnx: Gemm v1
shufflenet-6.onnx: Gemm v6

The old model has the shape 1x544x1x1, while the newer version has a reshape to 1x544. MIGraphX expects the shape to be 2D.

Note: both Gemm v1 and v6 behave the same in this case; the model export probably changed.

ONNXRuntime only supports models with opset 7 and higher, so this model does not compile with it either.
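The mismatch can be worked around by flattening the trailing 1x1 dims before the Gemm, which is what the newer export's Reshape effectively does. A minimal sketch (the helper name is mine, not MIGraphX's):

```python
from math import prod

def flatten_to_2d(shape):
    # Gemm requires rank-2 inputs; collapse everything after the batch
    # dimension, turning e.g. (1, 544, 1, 1) into (1, 544).
    return (shape[0], prod(shape[1:]))

print(flatten_to_2d((1, 544, 1, 1)))  # (1, 544)
```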

attila-dusnoki-htec commented 1 year ago

GPT parse issue

https://github.com/ROCmSoftwarePlatform/AMDMIGraphX/blob/develop/src/onnx/onnx_parser.cpp#L349-L357

As the comment above states, this is currently the expected behaviour in MIGraphX.

Tracking issue: https://github.com/migraphx-benchmark/AMDMIGraphX/issues/143

gyulaz-htec commented 1 year ago

QLinearMatMul

Created an issue for this: https://github.com/ROCmSoftwarePlatform/AMDMIGraphX/issues/2337

attila-dusnoki-htec commented 1 year ago

Reshape issue

static_compute_shape: reshape: Wrong number of elements for reshape: reshape has 18446744073709551054 elements whereas the input has 18446744073709551070

The problem is the loop with `trip_count__47 = 9223372036854775807` (int64 max). This becomes `max_loop_iterations`, which is then used as the batch size, and the whole thing explodes. Limiting it to a smaller number lets parsing continue.
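The workaround described above amounts to clamping the trip count before it can leak into shape computations; the cap value below is an arbitrary illustration, not MIGraphX's actual default:

```python
INT64_MAX = 2**63 - 1

def max_loop_iterations(trip_count, cap=1000):
    # An int64-max trip count means "effectively unbounded" in ONNX Loop;
    # it must never be used directly as a tensor dimension (batch size).
    return min(trip_count, cap)

print(max_loop_iterations(INT64_MAX))  # 1000
print(max_loop_iterations(10))         # 10
```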

It will then fail with `/code/AMDMIGraphX/src/include/migraphx/check_shapes.hpp:119: has: slice: Wrong number of arguments: expected 1, 3, 4 but given 2` when parsing slice.

Update 1) It fails at this node:

arg0: squeeze[axes={0}](gather[axis=0]) -> float_type, {1917, 90}, {90, 1}
arg1: @literal{0, 0} -> int32_type, {2}, {1}
arg2: add(@literal{0, 0}, add) -> int32_type, {2}, {1}

where arg0 is operator: Squeeze Loop_1114_loop:@1312 = squeeze[axes={0}](Loop_1114_loop:@1311) -> float_type, {1917, 90}, {90, 1}, target_id=0

Tracking issue: https://github.com/migraphx-benchmark/AMDMIGraphX/issues/149

gyulaz-htec commented 1 year ago

reshape_lazy issue

static_compute_shape: reshape_lazy on axis that is not packed

This is caused by input strides that can't be merged and squeezed during reshape_lazy:

@536 = transpose[permutation={0, 3, 4, 1, 2}](@535) -> float_type, {1, 8, 8, 3, 4}, {768, 8, 1, 256, 64}, target_id=0
@537 = reshape[dims={1, 192, 4}](@536) -> float_type, {1, 192, 4}, {768, 256, 64}, target_id=0

gpu::contiguous was eliminated after the transpose, which doesn't produce a standard shape.

There is a workaround branch which keeps gpu::contiguous for this corner case and makes the model compile successfully:

@621 = transpose[permutation={0, 3, 4, 1, 2}](@620) -> float_type, {1, 8, 8, 3, 4}, {768, 8, 1, 256, 64}, target_id=0
@622 = gpu::code_object[code_object=9544,symbol_name=contiguous_kernel,global=1024,local=1024,](@621,@619) -> float_type, {1, 8, 8, 3, 4}, {768, 96, 12, 4, 1}, target_id=0
@623 = reshape_lazy[dims={1, 192, 4}](@622) -> float_type, {1, 192, 4}, {768, 4, 1}, target_id=0
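The same situation can be reproduced with NumPy (used here as a stand-in for MIGraphX shapes): the transpose yields strides {768, 8, 1, 256, 64}, which no stride-only reshape can merge into a {1, 192, 4} view, so an explicit contiguous copy, like the restored gpu::contiguous above, is needed first.

```python
import numpy as np

x = np.arange(768, dtype=np.float32).reshape(1, 3, 4, 8, 8)
t = x.transpose(0, 3, 4, 1, 2)   # shape (1, 8, 8, 3, 4), permuted strides
print(t.flags["C_CONTIGUOUS"])   # False: a lazy (stride-only) reshape fails here

c = np.ascontiguousarray(t)      # the gpu::contiguous step from the workaround
r = c.reshape(1, 192, 4)         # now a plain reinterpretation of memory works
print(r.shape)                   # (1, 192, 4)
```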
gyulaz-htec commented 12 months ago

standard: topk: Shapes are not in standard layout

Command: `migraphx-driver compile /vision/object_detection_segmentation/mask-rcnn/model/MaskRCNN-12-int8.onnx --input-dim @image 3 1024 1024`

Failing shape: `float_type, {1, 196608}, {196608, 65536}`

gyulaz-htec commented 12 months ago

dyn_compute_shape: Reshape: Only supports one non-fixed dynamic_dimension

Command: `migraphx-driver compile /vision/object_detection_segmentation/ssd-mobilenetv1/model/ssd_mobilenet_v1_12.onnx --input-dim @inputs 1 800 800 3`
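The restriction is inherent: given a fixed element count, only one unknown dimension can be solved for. A rough sketch of that inference rule (using -1 markers for illustration; MIGraphX's actual implementation works on dynamic_dimension ranges):

```python
def infer_reshape_dims(in_elems, dims):
    # At most one non-fixed dimension (marked -1 here) can be inferred
    # from the total element count; two or more are ambiguous.
    unknown = [i for i, d in enumerate(dims) if d == -1]
    if len(unknown) > 1:
        raise ValueError("Reshape: only one non-fixed dimension supported")
    dims = list(dims)
    if unknown:
        fixed = 1
        for d in dims:
            if d != -1:
                fixed *= d
        dims[unknown[0]] = in_elems // fixed
    return dims

# 1 x 800 x 800 x 3 = 1,920,000 elements, one unknown dimension:
print(infer_reshape_dims(1920000, [1, -1, 3]))  # [1, 640000, 3]
```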

attila-dusnoki-htec commented 9 months ago

After checking Computer Vision, here are the compile issues: