gyulaz-htec commented 1 year ago

Latest results can be found here for FP32 and FP16

This is a list of MIGraphX errors from the ONNX Model Zoo.

The main issues:

[x] concat: all input dimensions should match along axis 2 - e.g. shape1: {1, 256, 2, 2} and shape2: {1, 512, 1, 1}
[x] CONVOLUTION: mismatched channel numbers - only for opset 8; the opset 9 versions of the models are working
[x] Assertion 'op->use_empty() && "expected 'op' to have no uses"' failed. - MLIR related error. Fixed in https://github.com/ROCmSoftwarePlatform/rocMLIR/pull/1282
[x] PARSE_GEMM: A and B should be rank 2, A is rank 4, B is rank 2 - onnx opset 3
[x] parse_inputs: module "Loop_26_loop" has parameter name "vconst_100" existing in parent graph! - GPT-2 with Beam Search Generation
[x] reshape: Wrong number of elements for reshape - PR
[x] scatter_none: Shapes are not in standard layout fixed in https://github.com/ROCmSoftwarePlatform/AMDMIGraphX/pull/2314
[x] Unknown operator: QLinearConv - PR
[x] Unknown operator: QLinearLeakyRelu - microsoft op - PR
[x] Unknown operator: QLinearMatMul - PR
[x] Unknown operator: QLinearMul - microsoft op - PR
[x] Unknown operator: QLinearSigmoid - microsoft op - PR
[x] Unknown operator: QLinearAveragePool - DenseNet after fixing QLinearMul - microsoft op - PR
[x] Unknown operator: QLinearConcat - microsoft op - PR
[x] Unknown operator: DynamicQuantizeLinear - PR
[x] static_compute_shape: reshape_lazy on axis that is not packed - Fixed in https://github.com/ROCm/AMDMIGraphX/commit/2d4a6507c3ad41f9d7ea36de1d7fb257cc788585
[x] standard: topk: Shapes are not in standard layout - issue - The issue is gone in https://github.com/ROCm/AMDMIGraphX/commit/32290a013d47e42197dce7ca129ad4809d9cd99b
[ ] QLINEARMATMUL: unsupported row/column quantization - Issue
[ ] dyn_compute_shape: Reshape: Only supports one non-fixed dynamic_dimension - issue
[ ] get_type: Prototensor data type 8 not supported
[ ] get_type: Prototensor data type 0 not supported
[ ] Unknown operator: FusedMatMul - issue
[ ] miopen make_tensor: MAKE_TENSOR: unsupported type uint8_type - issue
[x] compute_shape: FUSED_CONCAT: Missing fused modules - Fixed in https://github.com/ROCm/AMDMIGraphX/pull/2682
[x] compute_shape: GET_TUPLE_ELEM: index 1 is out of range 0 - PR
[ ] normalize_compute_shape: CONVOLUTION: mismatched channel numbers

attila-dusnoki-htec commented 1 year ago

Yolov3 Concat issue

The model is converted and does not have default shapes, but the documentation states the required shapes here. Running with migraphx-driver compile yolov3-10.onnx --input-dim @input_1 1 3 416 416 --input-dim @image_shape 1 2 gets past the wrong concat axes issue.

It then runs into: what(): /code/AMDMIGraphX/src/include/migraphx/op/reshape_lazy.hpp:238: static_compute_shape: reshape_lazy on axis that is not packed. Disabling add_reshape_lazy_op(); in lowering.cpp lets the model compile completely.
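For reference, the concat error in the checklist comes down to a shape rule: every dimension except the concat axis has to agree across inputs. The snippet below is a minimal sketch of that rule in plain Python, not MIGraphX's actual implementation; the function name `concat_shape` and the choice of axis 1 are assumptions made for illustration.

```python
def concat_shape(shapes, axis):
    """Return the output shape of concatenating `shapes` along `axis`.

    Raises ValueError when any dimension other than `axis` differs.
    Hypothetical helper, only meant to illustrate the rule.
    """
    first = tuple(shapes[0])
    out = list(first)
    for shape in shapes[1:]:
        if len(shape) != len(first):
            raise ValueError("rank mismatch")
        for dim, (a, b) in enumerate(zip(first, shape)):
            if dim != axis and a != b:
                raise ValueError(
                    f"dimensions should match along axis {dim}: "
                    f"{first} vs {tuple(shape)}"
                )
        out[axis] += shape[axis]
    return tuple(out)


# Matching spatial dims concatenate fine along the channel axis.
ok = concat_shape([(1, 256, 2, 2), (1, 512, 2, 2)], axis=1)

# The shapes from the checklist disagree on axis 2 (2 vs 1), so this raises.
try:
    concat_shape([(1, 256, 2, 2), (1, 512, 1, 1)], axis=1)
except ValueError as err:
    message = str(err)
```

With the input dimensions given explicitly on the command line, the shapes reaching the concat agree again, which is consistent with the error disappearing.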

attila-dusnoki-htec commented 1 year ago

StyleTransfer Convolution issue

The problem is with Upsample: version-7 has scales as attribute, version-9 has scales as input. In MIGraphX, parse_resize is used for upsample, and it does not check for the attribute, only for the input.

Important to note that Upsample is deprecated.

Update 1) After parsing the attribute, the issue is still present. The older model (version-8) contains this literal for the gather: @0 = @literal{31342768} -> int32_type, {0, 0, 0, 0}, {0, 0, 0, 1}, target_id=0 instead of this, which the newer model (version-9) has: @11 = @literal{ ... } -> int32_type, {1, 128, 112, 112}, {1605632, 12544, 112, 1}, target_id=0. This leads to a shape mismatch for the convolution.

Update 2) I parsed the scales, but forgot to update the output lens with them. Now the model compiles correctly.
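A minimal sketch of the fix described in the two updates, written in Python rather than the parser's C++: take the scales from the input when present (version-9), fall back to the attribute (version-7), and then actually apply them to the output lens. The function and parameter names are hypothetical and the 56x56 input size is made up for the example; the floor rule is the one the ONNX Upsample spec gives for output dimensions.

```python
import math


def upsample_output_lens(input_lens, attr_scales=None, input_scales=None):
    """Compute output lens for Upsample from whichever scales are available.

    `attr_scales` models the version-7 attribute, `input_scales` models the
    version-9 second input. Both names are illustrative assumptions.
    """
    scales = input_scales if input_scales is not None else attr_scales
    if scales is None:
        raise ValueError("Upsample: no scales found as input or attribute")
    if len(scales) != len(input_lens):
        raise ValueError("Upsample: scales must have one entry per dimension")
    # ONNX Upsample: output_dim = floor(input_dim * scale)
    return tuple(int(math.floor(d * s)) for d, s in zip(input_lens, scales))


# Attribute-only (old export) and input-only (new export) give the same lens.
old_style = upsample_output_lens((1, 128, 56, 56), attr_scales=(1.0, 1.0, 2.0, 2.0))
new_style = upsample_output_lens((1, 128, 56, 56), input_scales=(1.0, 1.0, 2.0, 2.0))
```

Skipping the last step (leaving the output lens untouched after reading the scales) is the kind of omission that would still produce a shape mismatch further down at the convolution.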

Tracking issue: https://github.com/migraphx-benchmark/AMDMIGraphX/issues/142

attila-dusnoki-htec commented 1 year ago

GEMM issue

parse: PARSE_GEMM: A and B should be rank 2, A is rank 4, B is rank 2

These are from ShuffleNet:

shufflenet-3.onnx: gemm-v1
shufflenet-6.onnx: gemm-v6

The old model has the shape as 1x544x1x1, while the newer version has a reshape to 1x544. MIGraphX expects the shape to be 2D.

Note: GEMM v1 and v6 are the same in this case; the model export probably changed.

ONNXRuntime only supports models with opset 7 and higher. This model does not compile with it.
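To make the rank problem concrete, here is a small shape-only sketch in Python. It is not MIGraphX code: `flatten_to_2d` mimics what the reshape in the newer export does to the 1x544x1x1 tensor, `gemm_output_lens` mimics the rank check behind the error message, and the B shape of 544x1000 is an assumed value used only for illustration.

```python
from math import prod


def flatten_to_2d(lens, axis=1):
    """Collapse `lens` into (prod(lens[:axis]), prod(lens[axis:]))."""
    return (prod(lens[:axis]), prod(lens[axis:]))


def gemm_output_lens(a_lens, b_lens):
    """Shape check for a plain A x B GEMM (no transposes, for simplicity)."""
    if len(a_lens) != 2 or len(b_lens) != 2:
        raise ValueError(
            f"A and B should be rank 2, A is rank {len(a_lens)}, "
            f"B is rank {len(b_lens)}"
        )
    if a_lens[1] != b_lens[0]:
        raise ValueError("inner dimensions do not match")
    return (a_lens[0], b_lens[1])


a_old = (1, 544, 1, 1)   # shape in the old model
b = (544, 1000)          # assumed weight shape, not taken from the model

# The old export fails the rank check...
try:
    gemm_output_lens(a_old, b)
except ValueError as err:
    rank_error = str(err)

# ...while flattening first, as the newer export does, passes it.
a_new = flatten_to_2d(a_old)
out = gemm_output_lens(a_new, b)
```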

attila-dusnoki-htec commented 1 year ago

GPT parse issue

https://github.com/ROCmSoftwarePlatform/AMDMIGraphX/blob/develop/src/onnx/onnx_parser.cpp#L349-L357

As the above comment states, this is currently the expected behaviour in MIGraphX.

Tracking issue: https://github.com/migraphx-benchmark/AMDMIGraphX/issues/143

gyulaz-htec commented 1 year ago

QLinearMatMul

Created an issue for this: https://github.com/ROCmSoftwarePlatform/AMDMIGraphX/issues/2337

attila-dusnoki-htec commented 1 year ago

Reshape issue

static_compute_shape: reshape: Wrong number of elements for reshape: reshape has 18446744073709551054 elements whereas the input has 18446744073709551070

The problem is at the loop with trip_count__47 = 9223372036854775807 (int64 max). This becomes the max_loop_iterations, which becomes the batch size, and the whole thing explodes. Limiting it to a smaller number lets the parsing continue.

It will result in /code/AMDMIGraphX/src/include/migraphx/check_shapes.hpp:119: has: slice: Wrong number of arguments: expected 1, 3, 4 but given 2 for parsing slice.

Update 1) It fails at this node:
arg0: squeeze[axes={0}](gather[axis=0]) -> float_type, {1917, 90}, {90, 1}
arg1: @literal{0, 0} -> int32_type, {2}, {1}
arg2: add(@literal{0, 0}, add) -> int32_type, {2}, {1}

where arg0 is operator: Squeeze Loop_1114_loop:@1312 = squeeze[axes={0}](Loop_1114_loop:@1311) -> float_type, {1917, 90}, {90, 1}, target_id=0
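Some arithmetic on the numbers above, as a Python sketch. The trip count is 2^63 - 1, and the two element counts in the reshape error are just below 2^64, which is what a 64-bit unsigned size looks like after wrapping around. The clamp shows the "limit it to a smaller number" workaround; the limit of 10 and both function names are assumptions for illustration, not values taken from the parser.

```python
INT64_MAX = 2**63 - 1
UINT64_MOD = 2**64


def clamp_trip_count(trip_count, max_iterations=10):
    """Limit a loop trip count to an upper bound (the bound is an assumption)."""
    return min(trip_count, max_iterations)


def as_signed_64(value):
    """Reinterpret a 64-bit unsigned value as a signed two's-complement one."""
    value %= UINT64_MOD
    return value - UINT64_MOD if value > INT64_MAX else value


trip_count = 9223372036854775807
clamped = clamp_trip_count(trip_count)

# The element counts from the error message, read as signed values.
reshape_elements = as_signed_64(18446744073709551054)
input_elements = as_signed_64(18446744073709551070)
```

Read as signed numbers, the two counts are small negatives, which suggests the size computation overflowed rather than that the tensors are genuinely that large.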

Tracking issue: https://github.com/migraphx-benchmark/AMDMIGraphX/issues/149

gyulaz-htec commented 1 year ago

reshape_lazy issue

static_compute_shape: reshape_lazy on axis that is not packed

This happens because the input strides can't be merged and squeezed during reshape_lazy:

@536 = transpose[permutation={0, 3, 4, 1, 2}](@535) -> float_type, {1, 8, 8, 3, 4}, {768, 8, 1, 256, 64}, target_id=0
@537 = reshape[dims={1, 192, 4}](@536) -> float_type, {1, 192, 4}, {768, 256, 64}, target_id=0

gpu::contiguous was eliminated after the transpose, which doesn't produce a standard shape.

There is a workaround branch which keeps the gpu::contiguous for this corner case and lets the model compile successfully.

@621 = transpose[permutation={0, 3, 4, 1, 2}](@620) -> float_type, {1, 8, 8, 3, 4}, {768, 8, 1, 256, 64}, target_id=0
@622 = gpu::code_object[code_object=9544,symbol_name=contiguous_kernel,global=1024,local=1024,](@621,@619) -> float_type, {1, 8, 8, 3, 4}, {768, 96, 12, 4, 1}, target_id=0
@623 = reshape_lazy[dims={1, 192, 4}](@622) -> float_type, {1, 192, 4}, {768, 4, 1}, target_id=0
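The stride arithmetic behind this can be checked by hand. The Python sketch below is a simplified model, not the reshape_lazy implementation: it computes the standard (row-major) strides for a shape and tests whether a run of adjacent axes can be collapsed into one axis without copying, which requires strides[i] == strides[i + 1] * lens[i + 1] for each neighbouring pair. The helper names are made up for this example.

```python
def standard_strides(lens):
    """Row-major strides for `lens`, e.g. (2, 3) -> (3, 1)."""
    strides = [1] * len(lens)
    for i in range(len(lens) - 2, -1, -1):
        strides[i] = strides[i + 1] * lens[i + 1]
    return tuple(strides)


def can_merge(lens, strides, first, last):
    """True if axes first..last (inclusive) collapse into one axis without a copy."""
    return all(
        strides[i] == strides[i + 1] * lens[i + 1] for i in range(first, last)
    )


lens = (1, 8, 8, 3, 4)

# Output of the transpose (@536 / @621): not a standard layout.
transposed = (768, 8, 1, 256, 64)
# Output of the contiguous kernel (@622).
contiguous = standard_strides(lens)

# reshape to {1, 192, 4} needs axes 1..3 (8, 8, 3) merged into 192.
merge_after_transpose = can_merge(lens, transposed, 1, 3)
merge_after_contiguous = can_merge(lens, contiguous, 1, 3)
```

On the transposed strides the two 8-sized axes merge (8 == 1 * 8), but the third axis does not (1 != 256 * 3), so the 192 axis cannot be formed lazily. After the contiguous copy all three merge, which matches the {768, 4, 1} strides on @623.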

gyulaz-htec commented 12 months ago

standard: topk: Shapes are not in standard layout

Command: migraphx-driver compile /vision/object_detection_segmentation/mask-rcnn/model/MaskRCNN-12-int8.onnx --input-dim @image 3 1024 1024
Failing shape: float_type, {1, 196608}, {196608, 65536}

gyulaz-htec commented 12 months ago

dyn_compute_shape: Reshape: Only supports one non-fixed dynamic_dimension

Command: migraphx-driver compile /vision/object_detection_segmentation/ssd-mobilenetv1/model/ssd_mobilenet_v1_12.onnx --input-dim @inputs 1 800 800 3

attila-dusnoki-htec commented 9 months ago

After checking Computer Vision, here are the compile issues:

src/include/migraphx/check_shapes.hpp:253: standard: flatten: Shapes are not in standard layout
- convnext-base/nano/small/tiny/large/xlarge
- edgenext
- poolformer
src/include/migraphx/op/reshape.hpp:149: static_compute_shape: reshape: Wrong number of elements for reshape: reshape has 54448 elements whereas the input has 54450
- fasterrcnn_mobilenet_v3_large
src/include/migraphx/op/reshape_lazy.hpp:285: static_compute_shape: reshape_lazy on axis that is not packed.
- levit-128/192/256/384
src/onnx/parse_pad.cpp:182: parse_constant_value: PARSE_PAD: "value" should contain only one element
- botnet26t
- coat lite/mini/small/tiny
- eca botnext
- maskrcnn
- vit relpos
src/onnx/parse_resize.cpp:166: get_mode: PARSE_RESIZE: only nearest and linear modes are supported!
- crossvit
src/targets/gpu/compile_ops.cpp:153: benchmark: No configs to tune
- ese vovnet
- gluon senet/seresnext
- legacy senet/seresnet/resnext
src/include/migraphx/op/get_tuple_elem.hpp:56: compute_shape: GET_TUPLE_ELEM: index 0 is out of range 0
- fcos resnet50
src/include/migraphx/op/get_tuple_elem.hpp:56: compute_shape: GET_TUPLE_ELEM: index 1 is out of range 0
- fasterrcnn

migraphx-benchmark / AMDMIGraphX

ONNX Model Zoo models #141
