[tracking] OnnxToLinalg Op Support

vivekkhandelwal1 commented 6 months ago

Below is the list of ops for which an OnnxToTorch lowering exists but which fail during the TorchToLinalg lowering. Of the 281 tests, only 25 are failing.

These ops lower to Torch but fail during the TorchToLinalg lowering. To reproduce an error, take the lit test for the corresponding op from the test file and lower it to linalg on its own; the error will show up.

To fix the issue, you need to either modify the OnnxToTorch lowering of the corresponding op or add the missing support in the TorchToLinalg lowering.
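For the reproduction step described above, here is a minimal sketch of pulling a single test case out of a lit test file so it can be lowered on its own. It assumes the file separates cases with `// -----` lines (the usual split-input-file convention); `extract_lit_case` and the `SAMPLE` text are hypothetical and only illustrate the idea. The MLIR in `SAMPLE` is a made-up placeholder, not a real test.

```python
import re

# Made-up placeholder standing in for a lit test file; not a real test.
SAMPLE = """// CHECK-LABEL: func.func @test_pad
func.func @test_pad(%arg0: !torch.vtensor<[3,4],f32>) -> !torch.vtensor<[3,4],f32> {
  return %arg0 : !torch.vtensor<[3,4],f32>
}

// -----

// CHECK-LABEL: func.func @test_tile
func.func @test_tile(%arg0: !torch.vtensor<[2,2],f32>) -> !torch.vtensor<[2,2],f32> {
  return %arg0 : !torch.vtensor<[2,2],f32>
}
"""


def extract_lit_case(lit_text, func_name):
    """Return the chunk of lit_text that defines func_name, or None.

    Assumes cases are separated by lines consisting of '// -----'.
    """
    chunks = re.split(r"^// -----\s*$", lit_text, flags=re.MULTILINE)
    pattern = re.compile(rf"func\.func\s+@{re.escape(func_name)}\b")
    for chunk in chunks:
        if pattern.search(chunk):
            return chunk.strip() + "\n"
    return None
```

The returned text can be written to its own `.mlir` file and passed to whichever lowering tool you use for the Torch to Linalg step.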

Op | Original Author

High-priority Ops:

[ ] Pad | Gaurav
[x] #452
[x] #451
[x] #461
[x] #466
[x] #467
[x] #580
[x] #579
[x] #581
[x] #582
[ ] #584
[ ] MatMulInteger | Rob

Low-priority Ops:

[ ] #465
[ ] ReduceMax | Alexander
[x] #653
[ ] ReduceMin | Sai
[ ] Bernoulli | Vivek
[ ] TopK | Chi
[ ] MaxPool | Chi
[ ] QLinearConv | Rob
[ ] QLinearMatMul | Rob
[ ] Scatter | Sai
[ ] Split | Xida
[ ] Tile | Phaneesh
[ ] GridSample #649 | Andreas

kumardeepakamd commented 6 months ago

@vivekkhandelwal1 and @rsuderman, the VAI ML team ran VISION CNN designs through the Shark FE. The failures are:

|count|issue|
|------|------|
|15 | failed to legalize operation torch.aten.clamp.Tensor see current operation: %SSA = "torch.aten.clamp.Tensor"(%SSA|
|10 | failed to legalize operation ONNX Torch to torch: onnx.AveragePool|
|2 | failed to legalize operation ONNX Torch to torch: onnx.Transpose|
|2 | failed to legalize operation ONNX Torch to torch: onnx.Pad|
|2 | operand and result must have the same size and dtype torch_c.from_builtin_tensor(%SSA) : (tensor<?x?x8400xf32>) -> !torch.vtensor<[dims]|
|1 | NameError: name 'traceback' is not defined|
|1 | failed to legalize operation torch.aten.index_select see current operation: %SSA = "torch.aten.index_select"(%SSA|
|1 | operand and result must have the same size and dtype torch_c.from_builtin_tensor(%SSA) : (tensor<?x3x?x?xf32>) -> !torch.vtensor<[dims]|

I will make these tests available, but it may be more productive to look for failing operator-level tests among the three operator suites and fix those first.

jinchen62 commented 5 months ago

@vivekkhandelwal1 According to the triage of the IREE ONNX test failures, below is the list of additional ops for which an OnnxToTorch lowering exists but which fail during the TorchToLinalg lowering on the following tests. Please add them to the above list.

To reproduce the issue, build the venv following https://github.com/nod-ai/SHARK-TestSuite/tree/main/iree_tests#common-venv-setup-with-deps and then run:

iree-compile iree_tests/onnx/node/generated/TEST_NAME/model.mlir -o test.vmfb --iree-hal-target-backends=llvm-cpu --mlir-print-ir-after-all

[ ] Cast
- test_cast_STRING_to_FLOAT
[ ] CastLike
- test_castlike_STRING_to_FLOAT
- test_castlike_STRING_to_FLOAT_expanded
[ ] Dropout
- test_training_dropout
- test_training_dropout_default
- test_training_dropout_default_mask
- test_training_dropout_mask
- test_training_dropout_zero_ratio
- test_training_dropout_zero_ratio_mask
[ ] Equal
- test_equal_string
- test_equal_string_broadcast
[ ] GatherND
- test_gathernd_example_float32
- test_gathernd_example_int32_batch_dim1
[ ] Pow
- test_pow_types_int32_float32
- test_pow_types_int32_int32
- test_pow_types_int64_float32
- test_pow_types_int64_int64
[ ] ReduceProd
- test_reduce_prod_do_not_keepdims_example
- test_reduce_prod_do_not_keepdims_random
- test_reduce_prod_empty_set
- test_reduce_prod_keepdims_example
- test_reduce_prod_keepdims_random
- test_reduce_prod_negative_axes_keepdims_example
- test_reduce_prod_negative_axes_keepdims_random
[ ] Reshape
- test_group_normalization_epsilon_expanded
- test_group_normalization_example_expanded
- test_reshape_allowzero_reordered
[ ] Trilu
- test_triu
- test_triu_neg
- test_triu_one_row
- test_triu_out_neg_out
- test_triu_out_pos
- test_triu_pos
- test_triu_square
- test_triu_square_neg
- test_triu_zero
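To rerun several of the tests listed above, the reproduction command can be assembled per test name. The following is a small sketch: the path layout and flags are copied from the command in this comment, while `build_compile_command` and `compile_test` are hypothetical helper names, and `iree-compile` is assumed to be on `PATH` inside the venv.

```python
import subprocess
from pathlib import PurePosixPath

# Path layout and flags copied from the reproduction command in this thread.
TESTS_ROOT = "iree_tests/onnx/node/generated"
IREE_FLAGS = ["--iree-hal-target-backends=llvm-cpu", "--mlir-print-ir-after-all"]


def build_compile_command(test_name, output="test.vmfb"):
    """Return the iree-compile argument list for one generated ONNX node test."""
    model = PurePosixPath(TESTS_ROOT) / test_name / "model.mlir"
    return ["iree-compile", str(model), "-o", output, *IREE_FLAGS]


def compile_test(test_name):
    """Run iree-compile for one test and return the CompletedProcess.

    Hypothetical helper: assumes iree-compile is installed and that the
    current directory is the SHARK-TestSuite checkout.
    """
    return subprocess.run(
        build_compile_command(test_name), capture_output=True, text=True
    )


if __name__ == "__main__":
    for name in ["test_triu", "test_pow_types_int32_int32"]:
        print(" ".join(build_compile_command(name)))
```

Note that `--mlir-print-ir-after-all` produces a lot of output, so it may help to redirect stderr to a file when running more than one test.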

renxida commented 4 months ago

Overall Summary:
Successes: 280
Failures: 46
Error Reasons (sorted by frequency):
error: failed to legalize operation 'torch.aten.sum.dim_IntList' that was explicitly marked illegal: 6
        test_reduce_sum_do_not_keepdims_example
        test_reduce_sum_keepdims_example
        test_reduce_sum_negative_axes_keepdims_example
        test_reduce_l1_do_not_keepdims_example
        test_reduce_l1_keep_dims_example
        test_einsum_sum
error: failed to legalize operation 'torch.operator': 4
        test_random_normal
        test_random_uniform
        test_random_normal_like
        test_random_uniform_like
error: failed to legalize operation 'torch.aten.unsqueeze' that was explicitly marked illegal: 4
        test_unsqueeze_axis_2
        test_unsqueeze_axis_1
        test_unsqueeze_axis_0
        test_unsqueeze_three_axes
error: failed to legalize operation 'torch.constant.bool': 4
        test_reduce_max_bool_inputs_nokeepdims
        test_reduce_max_bool_inputs
        test_reduce_min_bool_inputs
        test_reduce_min_bool_inputs_nokeepdims
error: failed to legalize operation 'torch.aten.nonzero': 4
        test_nonzero
        test_compress
        test_compress_default_axis
        test_compress_neg_axis
error: failed to legalize operation 'torch.aten.arange.start_step' that was explicitly marked illegal: 4
        test_triu_one_row
        test_triu
        test_triu_zero
        test_triu_square
error: failed to legalize operation 'torch.prim.ListConstruct': 3
        test_squeeze_five_axes
        test_squeeze_two_axes
        test_squeeze
error: failed to legalize operation 'torch.constant.int': 3
        test_averagepool_3d_default
        test_scatter_elements_with_reduction_mul
        test_scatter_elements_with_duplicate_indices
error: failed to legalize operation 'torch.aten.slice.Tensor' that was explicitly marked illegal: 2
        test_slice_default_steps
        test_slice
error: failed to legalize operation 'torch.operator' that was explicitly marked illegal: 1
        test_squeeze_no_axes
No error; exit code: -8: 1
        test_logsoftmax_old_axis_1_dynamic_dim
error: 'linalg.generic' op inferred input/output operand #1 has shape's dimension #1 to be 7, but found 1: 1
        test_qlinearconv_bias
error: 'tensor.cast' op operand type 'tensor<1x64x115x113xf32>' and result type 'tensor<1x64x114x114xf32>' are cast incompatible: 1
        test_maxpool_pad
error: 'tensor.cast' op operand type 'tensor<?xi64>' and result type 'tensor<i64>' are cast incompatible: 1
        test_eyelike_dynamic
error: failed to legalize operation 'torch.aten.sgn': 1
        test_sign
error: failed to legalize operation 'torch.aten.prod.dim_int' that was explicitly marked illegal: 1
        test_reduce_prod_keepdims_random
error: failed to legalize operation 'torch.aten.permute' that was explicitly marked illegal: 1
        test_einsum_batch_matmul
error: type of return operand 0 ('!torch.vtensor<[2,?],f32>') doesn't match function result type ('!torch.vtensor<[2,2],f32>') in function @test_split_variable_parts_2d_opset18: 1
        test_split_variable_parts_2d_opset18
error: failed to legalize operation 'torch.aten.dropout' that was explicitly marked illegal: 1
        test_training_dropout_zero_ratio
error: failed to legalize operation 'torch.aten.lt.Tensor' that was explicitly marked illegal: 1
        test_bernoulli_double
error: 'torch.aten.permute' op expected input and output tensors to have same rank, but 3 != 2.: 1
        test_einsum_batch_diagonal
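A summary in the shape of the one above can be produced by grouping tests by their first error line and sorting by frequency. The script that generated this summary is not shown in the thread, so the following is only a sketch under assumptions: `summarize` and `first_error_line` are hypothetical names, and the input is assumed to be a mapping from test name to `(exit_code, stderr_text)`.

```python
from collections import defaultdict


def first_error_line(stderr_text):
    """Return the first 'error: ...' message in stderr_text, or None."""
    for line in stderr_text.splitlines():
        idx = line.find("error:")
        if idx != -1:
            return line[idx:].strip()
    return None


def summarize(results):
    """Group failing tests by error reason.

    results maps test name -> (exit_code, stderr_text).
    Returns (success_count, ranked), where ranked is a list of
    (reason, [test names]) sorted by descending frequency, then by reason.
    """
    failures = defaultdict(list)
    successes = 0
    for name, (exit_code, stderr_text) in results.items():
        if exit_code == 0:
            successes += 1
            continue
        reason = first_error_line(stderr_text)
        if reason is None:
            reason = f"No error; exit code: {exit_code}"
        failures[reason].append(name)
    ranked = sorted(failures.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return successes, ranked


def print_summary(results):
    """Print the summary in roughly the layout used in this comment."""
    successes, ranked = summarize(results)
    print("Successes:", successes)
    print("Failures:", sum(len(tests) for _, tests in ranked))
    for reason, tests in ranked:
        print(f"{reason}: {len(tests)}")
        for test in tests:
            print(f"        {test}")
```

Grouping on the first error line keeps the buckets coarse, so tests that fail on the same op but with different operands still land together.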

vivekkhandelwal1 commented 1 month ago

This issue has become stale since we now use https://github.com/nod-ai/SHARK-Turbine/issues/797 to track the ops failing during the Torch->Linalg lowering. This issue was created to add support for the failing lit tests. Hence, closing this issue now.

nod-ai / SHARK-Turbine

[tracking] OnnxToLinalg Op Support #450