migraphx-benchmark / AMDMIGraphX

AMD's graph optimization engine.
https://rocmsoftwareplatform.github.io/AMDMIGraphX/doc/html/
MIT License

onnx backend reduce test errors #179

Open attila-dusnoki-htec opened 4 months ago

attila-dusnoki-htec commented 4 months ago

The following tests fail with the onnx backend on the latest develop:

attila-dusnoki-htec commented 4 months ago

To reproduce: python onnx_backend_test.py -t test_reduce

The full log is attached: onnx_test_reduce_latest.log

music-dino commented 3 months ago

Several issues are at play.

  1. Any do_not_keepdims tests must stay disabled, because squeeze does not support axes as a variable input. Adding this support to squeeze itself doesn't seem too difficult, but updating all other operators to account for the possibility of a missing dynamic axis would be arduous and probably not very pretty.

  2. Reduce ops need to be offloaded to the CPU when axes is a variable input.

  3. The onnx node tests for reduce ops have axes as an input of variable length. MGX onnx_backend_test has no facilities to forward either a default fixed dimension value or a dynamic dimension value, so these graphs get parsed with the default dimension value, which is {1}. For some tests this is fine, as the onnx test does pass a value of dimension {1}; for others it is not, as a value of dimension {0} is passed. The tests that currently work therefore pass only by accident, because the default dimension happens to match what the test supplies.

  4. Some situations where the reduce op returns a dynamic shape are hampered by the fact that pointwise ops do not have support for dynamic shapes. This applies to reduce_log_sum, reduce_log_sum_exp, reduce_l2, and all reduce operators if they're used in a where, which will happen if axes is a dynamic variable input.

  5. The rewrite reduce mean pass causes an error when reduce mean returns a dynamic shape. The pass should ignore the reduce mean if it returns a dynamic shape.

  6. Negative axes tests will error out due to a bug in the ref operator implementation. The axes should be normalized after they're extracted from the axes arg.
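
To illustrate points 1, 3 and 6, here is a minimal Python sketch of how the output shape of a reduce op depends on the runtime value of axes. It is not the MIGraphX ref implementation; `normalize_axes` and `reduced_shape` are hypothetical helper names, and the empty-axes branch assumes the ONNX default of `noop_with_empty_axes=0` (reduce over every axis).

```python
def normalize_axes(axes, rank):
    """Map possibly negative axes into [0, rank), sorted and de-duplicated."""
    normalized = []
    for axis in axes:
        if not -rank <= axis < rank:
            raise ValueError(f"axis {axis} is out of range for rank {rank}")
        # A negative axis counts from the back, so -1 is the last dimension.
        normalized.append(axis + rank if axis < 0 else axis)
    return sorted(set(normalized))


def reduced_shape(shape, axes, keepdims=True):
    """Output shape of a reduce op once the axes value is known at runtime."""
    rank = len(shape)
    if axes:
        # Normalize only after the values have been read from the axes input.
        norm = normalize_axes(axes, rank)
    else:
        # Assumption: an empty axes input (dimension {0}) reduces all axes.
        norm = list(range(rank))
    if keepdims:
        # Reduced dimensions are kept with size 1.
        return [1 if i in norm else d for i, d in enumerate(shape)]
    # do_not_keepdims: reduced dimensions are squeezed away, so the rank changes.
    return [d for i, d in enumerate(shape) if i not in norm]


print(normalize_axes([-1], 3))                        # [2]
print(reduced_shape([2, 3, 4], [-1]))                 # [2, 3, 1]
print(reduced_shape([2, 3, 4], []))                   # [1, 1, 1]
print(reduced_shape([2, 3, 4], [1], keepdims=False))  # [2, 4]
```

The last two calls show why a fixed default dimension of {1} cannot stand in for an axes input of dimension {0}, and why the do_not_keepdims case changes the output rank depending on a value that is only known at runtime.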